On Mar 24, 2005, at 7:44 AM, Sam Roberts wrote:

> Quoting jamis / 37signals.com, on Thu, Mar 24, 2005 at 02:54:20PM +0900:
>> Syntax is a pure-Ruby framework for doing lexical analysis (and, in
>> particular, syntax highlighting) of text. It currently sports lexers
>> for Ruby, XML, and YAML, and an HTML convertor (for colorizing texts in
>> those languages to HTML).
>
> Would this be an appropriate tool for parsing ruby to generate ctags?
>

Hmmm, maybe. Not in its current incarnation, though. One thing the 
lexer doesn't give you right now is the location of each token in the 
file. That would be a good addition, though. I'll see about adding that 
to the next version.
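
For what it's worth, here is a rough sketch of what tracking token 
locations could look like with strscan. It is not Syntax code, and the 
method name each_token_with_location is made up for illustration. It 
only splits on words and single characters, and columns are counted in 
bytes, so treat it as a starting point rather than a real lexer.

```ruby
require 'strscan'

# Hypothetical helper (not part of Syntax): returns [text, line, column]
# triples for a crude word/punctuation tokenization of +source+.
def each_token_with_location(source)
  scanner    = StringScanner.new(source)
  line       = 1
  line_start = 0   # byte offset at which the current line begins
  tokens     = []

  until scanner.eos?
    if scanner.scan(/\n/)
      line      += 1
      line_start = scanner.pos
    elsif scanner.scan(/[ \t\r]+/)
      # skip horizontal whitespace; it only affects the column
    else
      start = scanner.pos
      text  = scanner.scan(/\w+|./m)   # a word, or any single character
      tokens << [text, line, start - line_start + 1]
    end
  end

  tokens
end
```

Both line and column are 1-based here, which is what most editors 
expect.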

> To write a tags file I need to know where I am in ruby's terms (in what
> class, module), what was found (method, attribute, constant, class,
> ...), AND I need to generate a regex that will find this place in the
> file. For repeated names this can mean knowing what the entire line
> looks like, so that I can put leading whitespace into the regex.
>

The lexers that come with Syntax are optimized for syntax highlighting. 
You could conceivably write a different lexer module that was optimized 
for tag extraction, using the Syntax framework. You'd probably do just 
as well to use strscan directly, though.
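
To give an idea of the strscan route, here is a minimal sketch. 
Everything in it (the Tag struct, extract_tags, tag_pattern) is a 
hypothetical name of mine, not something Syntax or ctags provides. It 
assumes one definition per line and consistent indentation, so it will 
get confused by things like "class Foo; end" on a single line or 
operator methods such as "def ==". The search pattern is built from the 
whole line, leading whitespace included, so repeated names can still be 
told apart.

```ruby
require 'strscan'

# Hypothetical record for one tag: name, kind (c/m/f), enclosing scope,
# 1-based line number, and an ex-style search pattern for the whole line.
Tag = Struct.new(:name, :kind, :scope, :line_no, :pattern)

# Build a /^...$/ search pattern from the full line, escaping the two
# characters that would otherwise end or alter the pattern.
def tag_pattern(text)
  "/^" + text.gsub(/[\\\/]/) { |c| "\\#{c}" } + "$/"
end

# Heuristic scan: relies on an "end" sitting at the same indentation as
# the class/module line that opened the scope.
def extract_tags(source)
  tags  = []
  scope = []   # stack of [name, indent] for open class/module bodies

  source.each_line.with_index(1) do |line, line_no|
    text    = line.chomp
    scanner = StringScanner.new(text)
    indent  = (scanner.scan(/[ \t]*/) || '').length
    current = scope.map(&:first).join('::')

    if scanner.scan(/(class|module)\s+([A-Z]\w*(?:::[A-Z]\w*)*)/)
      kind = scanner[1] == 'class' ? 'c' : 'm'
      tags << Tag.new(scanner[2], kind, current, line_no, tag_pattern(text))
      scope.push([scanner[2], indent])
    elsif scanner.scan(/def\s+((?:self\.)?\w+[?!=]?)/)
      tags << Tag.new(scanner[1], 'f', current, line_no, tag_pattern(text))
    elsif scanner.scan(/end\b/) && !scope.empty? && scope.last[1] == indent
      scope.pop
    end
  end

  tags
end
```

Calling extract_tags(File.read("some_file.rb")) gives you an array of 
Tag structs; turning those into tags-file lines is then just a matter 
of formatting and sorting.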

- Jamis

> Is Syntax something I should be looking at?  It seems there are some
> similarities... if you know enough to highlight, maybe you know enough to
> generate a ctag?
>
> I'm using rdoc right now, but it is a very large tool, and I would like
> something smaller and more malleable, if possible.
>
> Thanks,
> Sam
>
>> Links:
>>
>>   Download: http://rubyforge.org/frs/?group_id=505
>>   User Manual: http://docs.jamisbuck.org/read/book/4
>>
>> This release is much improved in accuracy and robustness (at least, for
>> the Ruby lexer--the XML and YAML lexers were not changed). The Ruby
>> lexer now deals better with many ambiguous cases, and even supports
>> multiple heredocs on a single line. It accurately colorizes cgi.rb and
>> mkmf.rb from the standard lib, if that means anything at all to you.
>>
>> The Syntax framework also supports "regions" now (thanks to flgr for
>> the suggestion) and sports many bug fixes (thanks to Carl Drinkwater
>> for discovering most of them). Syntax regions just allow one group to
>> span (and include) multiple groups--like a string that includes
>> interpolated expressions and escape sequences.
>>
>> For a pretty example (mkmf.rb fully syntax highlighted) see
>> http://ruby.jamisbuck.org/mkmf.html.
>>
>> The next release will include robustness fixes for the XML and YAML
>> lexers, as well as a lexer for C. Lexers for Perl, Python, Java, HTML,
>> and RHTML would be nice as well, if I can get to them. Community
>> submissions will be gladly accepted, as long as you are okay with your
>> contributed code being distributed under the BSD license.
>>
>> Enjoy!
>>
>> - Jamis
>>
>>
>