I believe this message was rejected for length when I sent it the first time, so trying a two-parter...

James Edward Gray II

Begin forwarded message:

> From: James Quick <jimquick / mac.com>
> Date: July 12, 2006 9:51:17 AM CDT
> To: submission / rubyquiz.com
> Subject: Please Forward: Ruby Quiz Submission
>
> I am not on the ruby-talk list, and would appreciate it if you could pass this along. This was a nice problem to get my feet wet in this language. I began looking at Ruby last week after hearing about what an old colleague of mine was doing with it. Because I am familiar with several languages with which it shares some features (Smalltalk, Eiffel, Objective-C, Perl), it is deceptively easy to pick up, and occasionally infuriating. Thank you, James, for providing this excellent service to the community. Sometimes it is essential to have throwaway problems to guide one's exploration.
>
> My first attempts were very classy (in a bad way), partitioning things into separate classes, rigorously defining interfaces and accessor methods. They were also very slow, and usually failed to converge on a solution.
>
> Then, by reading source code for a bunch of modules in /usr/lib/ruby, I found that mixing a few methods into the base classes was not only sufficient for most of my uses but preferable. Profiling showed that much of my time was being spent in each_byte loops, so I ripped out the conversions to and from Strings and replaced them with a byte comparison factory which stored lambda functions. This way, I could pretend that I was producing arrays of byte counts from string operations without actually doing them more than once.
>
> This was more than twice as fast as my previous attempts but was still not converging very quickly. I had already optimized my target_counts array by initializing it with complete information on what was knowable about a solution before starting a search.
> Unfortunately, during the search, if several hundred thousand loops had occurred, I would do some sanity checking which proved to be insane. I converted the target counts into a string, then back into a count array. If the real result of the sanity check was beyond the range of the target and source, I would munge one or more values into my target array, thus throwing out all the good work!
>
> It was not until I saw the excellent submissions by Simon Kröger that I realized that my sanity checking was demoting me to a random search. Optimal random Robinizing is more like solving a Rubik's cube than like searching randomly for text. If properly initialized, the target and result are mathematically linked. If you make any change to the target or source without making a complementary change to the other, you will immediately devolve into a random search. Doing so is the equivalent of occasionally removing a piece from the cube, twisting it, and putting it back. You may have made it impossible to solve (unless by chance you randomly undo what you just did).
>
> Basically, the target contains a direct representation of the pangram search space in the form of c->n. Each count expresses the assertion that there be n occurrences of the character 'c' in the pangram. For a particular value of n it may be a completely bogus assertion. All that matters is that the assertion is congruent with the information stored in the result_string.
>
> I initialize my target and source with the exact values for constant text and their spelled out occurrences. Simon does it far more simply at the expense of a few more iterations (but his chainsaw has a much more powerful engine!). He initializes his target (which he calls guess) to an array of 1's indicating the invariant assumption that there will be 1 occurrence of each letter of the alphabet in the pangram.
> His result starts with an array of 1's (representing a-z from the itemized list n a's, n b's .... and n z's). He then adds the counts for "and", the prefix string, and 26 copies of the word "one". The 26 copies of the word "one" provide the required link which binds the guess and the "real" result together. As soon as he chooses a better value than 1 for a letter frequency, he will subtract "one" and then add the spelled out version of the new choice.
>
> These 26 copies of "one", or the loop of calls to adder lambda functions in my version, each perform an equivalent task. They ensure that the target and result are rigorously linked.
>
> From here on, despite their quite different implementations, our solutions are essentially the same. Random Robinizing proceeds as follows:
> When the target and source differ, randomly choose a new value (n).
> Add or subtract that value from the source (plain integer arithmetic).
> In the destination, decrement the letter counts for each letter in the spelled out number (before the change), then increment the letter counts for the new spelled out number.
>
> As you can see, the source and destination have completely different operations performed on them. They are different but embody the essence of the pangram itself: a self-referential sentence whose implementation (spelling) is identical to its semantic meaning (numerical commentary about the sentence).
>
> If you (like me) ran into a performance wall, and found that answers were converging slowly or not at all, take a step back. As soon as you break the underlying relationship between the target and destination, your program will stray into a random walk, which may contain long cycles of aimless wandering.
>
> I think there may be some bugs lurking here, though I think most are gone. I would appreciate any stylistic or other advice on best practices. I alternated several times between studlyCaps and C-style naming.
> I also realize that I have no idea about the relative performance tradeoffs between caching variables and evaluating method calls.
>
> I did see some fairly large performance boosts when variables are found first in the outer scope rather than being allocated and lost with each iteration.
>
> Finally, I am interested in learning where to look for up-to-date information on the language definition or syntax. I could only find something circa 1.6.
>
> The following code makes me realize that I am far from being proficient in this language. However, I hope the comments and the basic algorithm are clear enough to provide some useful insight into the problem domain.
>
> Here is some sample output.
>
> As you can see, 9 of the 26 letters are already solved before the first pass through the loop. This neighborhood quickly converged. In retrospect I am now sure that the 'q' from my initials helped. Zebras are useful too.
>
> lili% ruby jqpangram.rb
> "--- 0 Wed Jul 12 10:32:42 EDT 2006"
> [7, 2, 2, 2, 1, 1, 1, 1, 1, 2, 1, 1, 3, 1, 1, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 2]
> [7, 2, 2, 2, 24, 2, 2, 2, 2, 2, 1, 1, 3, 24, 28, 2, 2, 5, 12, 10, 1, 2, 8, 1, 1, 2]
> "--- 10000 Wed Jul 12 10:32:47 EDT 2006"
> [7, 2, 2, 2, 28, 5, 3, 7, 8, 2, 1, 2, 3, 16, 17, 2, 2, 12, 31, 25, 3, 7, 12, 3, 4, 2]
> [7, 2, 2, 2, 35, 5, 4, 8, 8, 2, 1, 3, 3, 16, 15, 2, 2, 10, 32, 26, 2, 9, 13, 2, 4, 2]
> "--- 20000 Wed Jul 12 10:32:52 EDT 2006"
> [7, 2, 2, 2, 26, 6, 2, 4, 10, 2, 1, 1, 3, 22, 16, 2, 2, 10, 31, 26, 2, 5, 14, 4, 5, 2]
> [7, 2, 2, 2, 21, 7, 2, 3, 9, 2, 1, 1, 3, 17, 20, 2, 2, 9, 31, 25, 4, 4, 14, 5, 5, 2]
> "--- 30000 Wed Jul 12 10:32:57 EDT 2006"
> [7, 2, 2, 2, 28, 6, 4, 6, 10, 2, 1, 2, 3, 17, 16, 2, 2, 9, 30, 25, 3, 5, 12, 3, 4, 2]
> [7, 2, 2, 2, 27, 6, 3, 6, 10, 2, 1, 2, 3, 16, 15, 2, 2, 10, 32, 24, 3, 6, 12, 4, 4, 2]
> "--- 40000 Wed Jul 12 10:33:01 EDT 2006"
> [7, 2, 2, 2, 29, 8, 4, 8, 9, 2, 1, 3, 3, 15, 17, 2, 2, 12, 30, 23, 4, 6, 12, 2, 4, 2]
> [7, 2, 2, 2, 28, 7, 4, 7, 9, 2, 1, 3, 3, 17, 16, 2, 2, 11, 30, 25, 4, 5, 13, 2, 4, 2]
> "--- 50000 Wed Jul 12 10:33:06 EDT 2006"
> [7, 2, 2, 2, 30, 6, 2, 7, 10, 2, 1, 1, 3, 19, 17, 2, 2, 11, 30, 23, 3, 6, 10, 4, 4, 2]
> [7, 2, 2, 2, 28, 4, 2, 6, 7, 2, 1, 2, 3, 19, 16, 2, 2, 11, 31, 23, 3, 5, 10, 3, 4, 2]
> "--- 60000 Wed Jul 12 10:33:11 EDT 2006"
> [7, 2, 2, 2, 29, 7, 2, 6, 9, 2, 1, 3, 3, 19, 16, 2, 2, 11, 33, 23, 4, 5, 12, 4, 4, 2]
> [7, 2, 2, 2, 31, 6, 2, 6, 9, 2, 1, 3, 3, 20, 16, 2, 2, 12, 31, 23, 4, 6, 12, 3, 4, 2]
> "solution found in 69833 iterations"
> [7, 2, 2, 2, 26, 8, 3, 6, 10, 2, 1, 2, 3, 15, 16, 2, 2, 10, 31, 25, 3, 5, 12, 4, 4, 2]
> [7, 2, 2, 2, 26, 8, 3, 6, 10, 2, 1, 2, 3, 15, 16, 2, 2, 10, 31, 25, 3, 5, 12, 4, 4, 2]
> "mypan----------------------------------------"
> [7, 2, 2, 2, 26, 8, 3, 6, 10, 2, 1, 2, 3, 15, 16, 2, 2, 10, 31, 25, 3, 5, 12, 4, 4, 2]
> "A pangram from jq contains one zebra, seven a's, two b's, two c's, two d's, twenty-six e's, eight f's, three g's, six h's, ten i's, two j's, one k, two l's, three m's, fifteen n's, sixteen o's, two p's, two q's, ten r's, thirty-one s's, twenty-five t's, three u's, five v's, twelve w's, four x's, four y's, and two z's."
> true
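
The linked update described in the message (change one guessed count, then patch the real letter counts by removing the old spelled-out number and adding the new one) can be sketched in Ruby roughly as follows. This is only an illustration under stated assumptions, not the code from either submission: the names `spell`, `letter_counts`, `entry_words`, `sentence`, `change!` and `search` are all hypothetical, the prefix is copied from the sample output, the plural "s" handling is my own guess at how the link is kept exact, and picking the new value at random between the guessed and real counts is just one possible choice. The search is bounded and may well return nothing within the step limit.

```ruby
# Illustrative sketch only: every method name below is hypothetical and is
# not taken from either submission.
ONES = %w[zero one two three four five six seven eight nine ten eleven twelve
          thirteen fourteen fifteen sixteen seventeen eighteen nineteen].freeze
TENS = %w[_ _ twenty thirty forty fifty sixty seventy eighty ninety].freeze
PREFIX = "A pangram from jq contains one zebra,".freeze # from the sample output

# Spell out 1..99 in the style of the sample output ("twenty-six").
def spell(n)
  raise ArgumentError, "only 1..99 supported" unless (1..99).cover?(n)
  return ONES[n] if n < 20
  tens, ones = n.divmod(10)
  ones.zero? ? TENS[tens] : "#{TENS[tens]}-#{ONES[ones]}"
end

# Count a..z in a string, ignoring case and non-letters.
def letter_counts(text)
  counts = Array.new(26, 0)
  text.downcase.each_char { |ch| counts[ch.ord - 97] += 1 if ("a".."z").cover?(ch) }
  counts
end

# Letters that a count of n contributes besides the letter itself: the
# spelled-out number, plus the plural "s" when n is not 1 (an assumption).
def entry_words(n)
  n == 1 ? spell(n) : "#{spell(n)}s"
end

# Build the full sentence for a guess (used to initialize and to verify).
def sentence(guess)
  items = guess.each_with_index.map do |n, i|
    "#{spell(n)} #{(97 + i).chr}#{n == 1 ? '' : "'s"}"
  end
  "#{PREFIX} #{items[0..-2].join(', ')}, and #{items[-1]}."
end

# Linked update: change the guess for one letter AND patch the real counts
# by removing the old spelled-out number and adding the new one.
def change!(guess, result, index, new_n)
  letter_counts(entry_words(guess[index])).each_with_index { |c, i| result[i] -= c }
  letter_counts(entry_words(new_n)).each_with_index { |c, i| result[i] += c }
  guess[index] = new_n
end

# Bounded random search; returns the guess if it converged, otherwise nil.
def search(max_steps, rng = Random.new)
  guess  = Array.new(26, 1)               # "one" of every letter to start
  result = letter_counts(sentence(guess)) # real counts implied by that guess
  max_steps.times do
    return guess if guess == result
    i = (0...26).reject { |k| guess[k] == result[k] }.sample(random: rng)
    lo, hi = [guess[i], result[i]].minmax
    change!(guess, result, i, rng.rand(lo..hi).clamp(1, 99))
  end
  guess == result ? guess : nil
end

if __FILE__ == $PROGRAM_NAME
  found = search(50_000)
  puts(found ? sentence(found) : "no solution within the step limit")
end
```

The point of `change!` is that it never recounts the whole sentence, yet `result` stays equal to `letter_counts(sentence(guess))` after every call. Editing `guess` or `result` by any other route breaks that link, which is the failure mode the message warns about.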