* Dan Zwell (dzwell / gmail.com) wrote:

>  It also makes me happy that someone else is working on a converter
>  from infix to postfix. I just (almost) finished implementing
>  Dijkstra's Shunting Yard algorithm
>  (http://en.wikipedia.org/wiki/Shunting_yard_algorithm) in Ruby, as
>  part of a bigger project.  If you want to look at it, it's at
>  http://paste-bin.com/11526, but it's 227 lines of poorly commented
>  meaty algorithm. It currently works on mathematical equations with
>  variables. It doesn't do any evaluation, though--just parses to an
>  array.

Your implementation is very similar to mine (though I'm parsing boolean
search queries, ala Google) -- mine is a bit simpler:

    InputPriority = Hash.new(1000)
    InputPriority[:and]   = 800
    InputPriority[:or]    = 500
    InputPriority[:not]   = 900
    InputPriority[nil]    = 0
    InputPriority[:open]  = 1000
    InputPriority[:close] = 1
    StackPriority = InputPriority.dup
    StackPriority[:open]  = 1
    StackPriority[:close] = 1000

    def rpnise(itokens)
      stack = []
      instructions = []

      itokens.each do |itoken|
        if InputPriority[itoken] > StackPriority[stack.last]
          stack.push(itoken)
        elsif InputPriority[itoken] <= StackPriority[stack.last]
          while StackPriority[stack.last] >= InputPriority[itoken]
            instructions << stack.pop
          end
          stack.push(itoken)
        end
      end
      instructions.concat(stack.reverse)
    end

After this I sanitize the token stream and use it to build a tree of
BooleanAnd/Or/Not/SubStringQuery objects to represent the search (which
then rewrite themselves in a vague attempt at optimization and sorting).

Demo:
  # Token stream
  irb(main):009:0> toks = analyser.simplify(tokenizer.tokenize("foo AND bar OR (wibble OR wobble NOT foo)"))
  [#<NSearch::SubStringQuery:0x18a4a20 @substr="foo">,
   :and,
   #<NSearch::SubStringQuery:0x18a49f8 @substr="bar">,
   :or,
   :open,
   #<NSearch::SubStringQuery:0x18a49a8 @substr="wibble">,
   :or,
   #<NSearch::SubStringQuery:0x18a4958 @substr="wobble">,
   :and,
   :not,
   #<NSearch::SubStringQuery:0x18a4908 @substr="foo">,
   :close]

   irb(main):010:0> analyser.rpnise(toks)
   [#<NSearch::SubStringQuery:0x188a670 @substr="foo">,
    #<NSearch::SubStringQuery:0x188a648 @substr="bar">,
    :and,
    #<NSearch::SubStringQuery:0x188a5f8 @substr="wibble">,
    #<NSearch::SubStringQuery:0x188a5a8 @substr="wobble">,
    #<NSearch::SubStringQuery:0x188a558 @substr="foo">,
    :not,
    :and,
    :or,
    :or]

You can then follow the instructions, popping each SubStringQuery onto a
stack and popping them off on :and, :or and :not to build a tree of
query objects:

#to_s
  OR(OR(AND(NOT("foo"),"wobble"),"wibble"),AND("bar","foo"))

#to_sql
  (((NOT (Title LIKE "%foo%") AND Title LIKE "%wobble%") OR Title LIKE
    "%wibble%") OR (Title LIKE "%bar%" AND Title LIKE "%foo%"))


I'd recommend against using Ruby's own Generator, btw; it uses callcc
and has a tendancy towards being slow and leaky.  If you just need a
#next/#peek method, just turn it into an array and shove it in an
appropriate object which tracks the position.

-- 
Thomas 'Freaky' Hurst
    http://hur.st/