I want to write my own wiki markup language. Pure regexp fails me, as  
I need a proper parser to keep track of state.
I thought I'd give Syntax a try, but I'm a little confused as to some  
of the specifics.

1) What is a 'region', and how do I use the start_region method? It's  
not documented in the API, or the source. (I think this is what I  
want for nesting tags.)
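To pin down what I mean by nesting, here's a toy sketch in plain StringScanner (no Syntax gem involved): a stack of open markers, which is roughly the behaviour I'm hoping regions give me. The marker choices (** and //) are just illustrative.

```ruby
require 'strscan'

# Toy illustration only (not the Syntax gem): track nested "regions"
# with a stack, recording open/close events for **bold** and //italic//.
def nested_events(text)
  s = StringScanner.new(text)
  stack = []
  events = []
  until s.eos?
    if s.scan(/\*\*/)
      if stack.last == :bold
        stack.pop
        events << [:close, :bold]
      else
        stack.push(:bold)
        events << [:open, :bold]
      end
    elsif s.scan(%r{//})
      if stack.last == :italic
        stack.pop
        events << [:close, :italic]
      else
        stack.push(:italic)
        events << [:open, :italic]
      end
    else
      s.scan(/./m) # skip ordinary characters
    end
  end
  events
end
```

So `nested_events("**a //b// c**")` gives open-bold, open-italic, close-italic, close-bold -- properly paired nesting, which is what I assume start_region/end_region are for.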

2) Do I have to call close_group and close_region explicitly, or do they  
automatically get invoked under certain circumstances? (Does starting  
one group close the previous one? Do repeated calls to open the same  
group cause them to be aggregated together -- and is that how  
accumulating text in :normal groups works?)
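By "aggregated together" I mean something like this toy merger -- my guess at the behaviour in plain Ruby, not necessarily what the gem actually does:

```ruby
# Toy illustration: merge consecutive tokens that share a group,
# the way I imagine repeated start_group :normal calls accumulate text.
# Tokens are modelled as [group, text] pairs.
def merge_tokens(tokens)
  tokens.each_with_object([]) do |(group, text), merged|
    if merged.last && merged.last[0] == group
      merged.last[1] << text # same group as previous: append text
    else
      merged << [group, text.dup] # new group: start a fresh token
    end
  end
end
```

E.g. three single-character :normal tokens would come out as one three-character :normal token.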

3) How do I keep track of state during successive calls to #step? I  
tried an instance variable, but it doesn't seem to persist across calls.
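Here's the kind of persistence I expected, stripped down to plain StringScanner with nothing from Syntax: an instance variable set in one call to step is still there on the next call to the same object. (The class and names here are made up for illustration.)

```ruby
require 'strscan'

# Minimal check: instance state persisting across repeated calls to
# step on the same object -- what I expected the tokenizer to do.
class CountingScanner
  def initialize(text)
    @scanner = StringScanner.new(text)
    @words = 0 # instance state shared by every call to step
  end

  def step
    @scanner.skip(/\s+/)
    @words += 1 if @scanner.scan(/\w+/)
  end

  def run
    step until @scanner.eos?
    @words
  end
end
```

`CountingScanner.new("one two three").run` counts all three words, so @words clearly survives between calls -- which is why I'm puzzled that the same trick didn't work under Syntax and I fell back on globals below.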

Following is my terrible, broken attempt at the basics of what I'm  
after. Am I totally misunderstanding how to use Syntax?


require 'rubygems'
require_gem 'syntax'

class OWLScribble < Syntax::Tokenizer
  def step
    if heading = scan( /^={1,6}/ )
      start_region "heading level #{heading.length}".intern
      # Globals because instance variables didn't seem to survive
      # between calls to #step (see question 3).
      # [ \t]* rather than \s* so the match can't swallow the newline.
      $heading_end = Regexp.new( Regexp.escape( heading ) + "[ \\t]*" )
    elsif $heading_end && ( heading = scan( $heading_end ) )
      # Strip trailing whitespace so the region name matches the one
      # passed to start_region above.
      end_region "heading level #{heading.strip.length}".intern
      $heading_end = nil
    elsif char = scan( /^[\r\n]/ )
      start_group :paragraph, char
    elsif scan( /\*\*/ )
      if $inbold
        end_region :bold
        $inbold = nil
      else
        start_region :bold
        $inbold = true
      end
    elsif char = scan( /./ )
      start_group :normal, char
    else
      scan( /[\r\n]/ ) # swallow newlines not at the start of a line
    end
  end
end

Syntax::SYNTAX[ 'owlscribble' ] = OWLScribble

str = <<END
Intro paragraph

= Heading 1 =
First **paragraph** under the heading.

== Second **Heading** = very yes ==
Another paragraph.
END

tokenizer = Syntax.load( "owlscribble" )
tokenizer.tokenize( str ) do |token|
  puts "#{token.group} (#{token.instruction}) #{token}"
end



--
(-, /\ \/ / /\/