I believe this message was rejected for length when I sent it the  
first time, so trying a two-parter...

James Edward Gray II

Begin forwarded message:

> From: James Quick <jimquick / mac.com>
> Date: July 12, 2006 9:51:17 AM CDT
> To: submission / rubyquiz.com
> Subject: Please Forward: Ruby Quiz Submission
>
> I am not on the ruby-talk list, and would appreciate if you could  
> pass this
> along. This was a nice problem to get my feet wet in this language. I
> began looking at Ruby last week after hearing about what an old  
> colleague
> of mine was doing with it. Because I am familiar with several  
> languages
> with which it  shares some features (smalltalk, eiffel, objective- 
> c, perl) it
> is deceptively easy to pick up, and occasionally infuriating. Thank  
> you,
> James, for providing this excellent service to the community.  
> Sometimes
> it is essential to have throwaway problems to guide one's exploration.
>
> My first attempts were very classy (in a bad way) partitioning  
> things into separate
> classes, rigorously defining interfaces and accessor methods. They  
> were also
> very slow, and usually failed to converge on a solution.
>
> Then, by reading source code for a bunch of modules in /usr/lib/ 
> ruby, I found
> that mixing a few methods into the base classes was not only  
> sufficient for
> most of my uses but preferable. Profiling showed that much of my  
> time was
> being spent in each_byte loops so I ripped out the conversions to  
> and from
> Strings and replaced it with a byte comparison factory which stored  
> lambda
> functions. This way, I could pretend that I was producing arrays of  
> byte counts
> from string operations without actually doing them more than once.
>
> This was more than twice as fast as my previous attempts but was still
> not converging very quickly. I had already optimized my target_counts
> array by initializing it with complete information on what was  
> knowable
> about a solution before starting a search. Unfortunately during the  
> search,
> if several hundred thousand loops had occurred, I would do some sanity
> checking which proved to be insane. I converted the target counts  
> into a
> string then back into a count array. If the real result of the  
> sanity check was
> beyond the range of the target and source, I would munge one or more
> values into my target array, thus throwing out all the good work!
>
> It was not until I saw the excellent submissions by Simon Krer that  
> I realized
> that my sanity checking was demoting me to a random search. Optimal  
> random
> Robinizing is more like solving a rubix cube than like searching  
> randomly
> for text. If properly initialized the target and result are  
> mathematically linked.
> If you make any change to the target or source without making a  
> complementary
> change to the other, you will immediately devolve into a random  
> search.
> Doing so is the equivalent of occasionally removing a piece from the
> cube, twisting it, and putting it back. You may have made it  
> impossible
> to solve (unless by chance you randomly undo what you just did).
>
> Basically, the target contains a direct representation of the  
> pangram search
> space in the form of c->n. Each count expresses the assertion that  
> there be
> n occurrences of the character 'c' in the pangram. For a particular  
> value of
> n it may be a completely bogus assertion. All that matters is that  
> the assertion
> is congruent with the information stored in the result_string.
>
> I initialize my target and source with the exact values for  
> constant text
> and their spelled out occurrences. Simon does it far more simply at  
> the
> expense of a few more iterations (but his chainsaw has a much more
> powerful engine!).  He initializes his target (which he calls  
> guess) to
> an array of 1's indicating the invariant assumption that there will be
> 1 occurrence of each letter of the alphabet in the pangram.  His  
> result
> starts with an array of 1's (representing a-z from the itemized
> list n a's, n b's .... and n z's). He then adds the counts for  
> "and", the prefix
> string, and 26 copies of the word "one". The 26 copies of the word  
> "one"
> provide the required link which binds the guess and the "real" result
> together. As soon as he chooses a better value than 1 for a letter
> frequency he will subtract "one" and then add the spelled out version
> of the new choice.
>
> These 26 copies of "one", or the loop of calls to adder lambda  
> functions
> in my version, each perform an equivalent task. They ensure that the
> target and result are rigorously linked.
>
> From here on, despite their quite different implementations, our  
> solutions
> are essentially the same. Random Robinizing proceeds as follows:
> When the target and source differ, randomly choose a new value(n).
> Add or subtract that value from the source (plain integer arithmetic).
> In the destination, decrement the letter counts for each letter in the
> spelled out number (before the change), then increment the letter
> counts for the new spelled out number.
>
> As you can see, the source and destination have completely different
> operations performed on them. They are different but embody the
> essence of the pangram itself. A self referential sentence whose
> implementation (spelling) is identical to its semantic meaning
> (numerical commentary about the sentence).
>
> If you (like me) ran into a performance wall, and found that
> answers were converging slowly or not at all, take a step back.
> As soon as you break the underlying relationship between the
> target and destination, your program will stray into a random walk,
> which may contain long cycles of aimless wandering.
>
> I think there may be some bugs lurking here, though I think most are
> gone. I would appreciate any stylistic or other advice on best  
> practices.
> I alternated several times between studlyCaps and c style naming.
> I also realize that I have know idea about the relative performance
> tradeoffs between cacheing variables and evaluating method calls.
>
> I did see some fairly large performance boosts when variables are
> found first in the outer scope rather then being allocated and lost
> with each iteration.
>
> Finally, I am interested in learning where to look for up to date
> information on the language definition or syntax. I could only find
> something circa 1.6.
>
> The following code makes me realize that I am far from being
> proficient in this language. However, I hope the comments and
> the basic algorithm are clear enough to provide some useful
> insight into the problem domain.
>
> here is some sample output.
>
> As you can see 9 of the 26 letters are already solved before the  
> first pass
> through the loop. This neighborhood quickly converged. In retrospect
> I am now sure that the 'q' from my initials helped. Zebras are useful
> too.
> lili% ruby jqpangram.rb
> "--- 0 Wed Jul 12 10:32:42 EDT 2006"
> [7, 2, 2, 2, 1, 1, 1, 1, 1, 2, 1, 1, 3, 1, 1, 2, 2, 1, 1, 1, 1, 1,  
> 1, 1, 1, 2]
> [7, 2, 2, 2, 24, 2, 2, 2, 2, 2, 1, 1, 3, 24, 28, 2, 2, 5, 12, 10,  
> 1, 2, 8, 1, 1, 2]
> "--- 10000 Wed Jul 12 10:32:47 EDT 2006"
> [7, 2, 2, 2, 28, 5, 3, 7, 8, 2, 1, 2, 3, 16, 17, 2, 2, 12, 31, 25,  
> 3, 7, 12, 3, 4, 2]
> [7, 2, 2, 2, 35, 5, 4, 8, 8, 2, 1, 3, 3, 16, 15, 2, 2, 10, 32, 26,  
> 2, 9, 13, 2, 4, 2]
> "--- 20000 Wed Jul 12 10:32:52 EDT 2006"
> [7, 2, 2, 2, 26, 6, 2, 4, 10, 2, 1, 1, 3, 22, 16, 2, 2, 10, 31, 26,  
> 2, 5, 14, 4, 5, 2]
> [7, 2, 2, 2, 21, 7, 2, 3, 9, 2, 1, 1, 3, 17, 20, 2, 2, 9, 31, 25,  
> 4, 4, 14, 5, 5, 2]
> "--- 30000 Wed Jul 12 10:32:57 EDT 2006"
> [7, 2, 2, 2, 28, 6, 4, 6, 10, 2, 1, 2, 3, 17, 16, 2, 2, 9, 30, 25,  
> 3, 5, 12, 3, 4, 2]
> [7, 2, 2, 2, 27, 6, 3, 6, 10, 2, 1, 2, 3, 16, 15, 2, 2, 10, 32, 24,  
> 3, 6, 12, 4, 4, 2]
> "--- 40000 Wed Jul 12 10:33:01 EDT 2006"
> [7, 2, 2, 2, 29, 8, 4, 8, 9, 2, 1, 3, 3, 15, 17, 2, 2, 12, 30, 23,  
> 4, 6, 12, 2, 4, 2]
> [7, 2, 2, 2, 28, 7, 4, 7, 9, 2, 1, 3, 3, 17, 16, 2, 2, 11, 30, 25,  
> 4, 5, 13, 2, 4, 2]
> "--- 50000 Wed Jul 12 10:33:06 EDT 2006"
> [7, 2, 2, 2, 30, 6, 2, 7, 10, 2, 1, 1, 3, 19, 17, 2, 2, 11, 30, 23,  
> 3, 6, 10, 4, 4, 2]
> [7, 2, 2, 2, 28, 4, 2, 6, 7, 2, 1, 2, 3, 19, 16, 2, 2, 11, 31, 23,  
> 3, 5, 10, 3, 4, 2]
> "--- 60000 Wed Jul 12 10:33:11 EDT 2006"
> [7, 2, 2, 2, 29, 7, 2, 6, 9, 2, 1, 3, 3, 19, 16, 2, 2, 11, 33, 23,  
> 4, 5, 12, 4, 4, 2]
> [7, 2, 2, 2, 31, 6, 2, 6, 9, 2, 1, 3, 3, 20, 16, 2, 2, 12, 31, 23,  
> 4, 6, 12, 3, 4, 2]
> "solution found in 69833 iterations"
> [7, 2, 2, 2, 26, 8, 3, 6, 10, 2, 1, 2, 3, 15, 16, 2, 2, 10, 31, 25,  
> 3, 5, 12, 4, 4, 2]
> [7, 2, 2, 2, 26, 8, 3, 6, 10, 2, 1, 2, 3, 15, 16, 2, 2, 10, 31, 25,  
> 3, 5, 12, 4, 4, 2]
> "mypan----------------------------------------"
> [7, 2, 2, 2, 26, 8, 3, 6, 10, 2, 1, 2, 3, 15, 16, 2, 2, 10, 31, 25,  
> 3, 5, 12, 4, 4, 2]
> "A pangram from jq contains one zebra, seven a's, two b's, two c's,  
> two d's, twenty-six e's, eight f's, three g's, six h's, ten i's,  
> two j's, one k, two l's, three m's, fifteen n's, sixteen o's, two  
> p's, two q's, ten r's, thirty-one s's, twenty-five t's, three u's,  
> five v's, twelve w's, four x's, four y's, and two z's."
> true