Quoting jamis / 37signals.com, on Fri, Mar 25, 2005 at 01:27:37AM +0900:
> On Mar 24, 2005, at 7:44 AM, Sam Roberts wrote:
> 
> >Quoting jamis / 37signals.com, on Thu, Mar 24, 2005 at 02:54:20PM +0900:
> >>Syntax is a pure-Ruby framework for doing lexical analysis (and, in
> >>particular, syntax highlighting) of text. It currently sports lexers
> >>for Ruby, XML, and YAML, and an HTML convertor (for colorizing texts
> >>in those languages to HTML).
> >
> >Would this be an appropriate tool for parsing ruby to generate ctags?
> >
> 
> Hmmm, maybe. Not in its current incarnation, though. One thing the 
> lexer doesn't give you right now is the location of each token in the 
> file. That would be a good addition, though. I'll see about adding that 
> to the next version.

I don't need the location in the file; I just need the text of the line:

  module Foo
    class Bar
      class Bar
      end
    end
  end

The tags would be:
  Bar -> regex /  class Bar/
  Bar -> regex /    class Bar/
  Foo.Bar -> regex /  class Bar/
  Foo.Bar.Bar -> regex /    class Bar/

I don't need the line number.
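
As a minimal sketch of the output described above (not a real Ruby
parser): the hypothetical `naive_tags` helper below assumes one
definition per line and consistent indentation, and it uses indentation
alone to decide nesting. It pairs each tag name with the untouched text
of its line, which is all a search pattern needs.

```ruby
# Naive sketch: track module/class nesting by indentation and return
# [tag name, original line text] pairs.
# Hypothetical helper; assumes one well-indented definition per line.
def naive_tags(source)
  stack = []  # [indent, name] for each enclosing module/class
  tags  = []
  source.each_line do |line|
    text = line.chomp
    m = /\A(\s*)(?:module|class)\s+([A-Z]\w*)/.match(text)
    next unless m
    indent = m[1].length
    name   = m[2]
    # Leave every scope that is not shallower than this line.
    stack.pop while !stack.empty? && stack.last[0] >= indent
    qualified = (stack.map { |_, n| n } + [name]).join(".")
    tags << [name, text]
    tags << [qualified, text] unless qualified == name
    stack << [indent, name]
  end
  tags
end
```

Run on the nested example, this gives Bar and Foo.Bar for the two-space
line and Bar and Foo.Bar.Bar for the four-space line, plus a tag for Foo
itself. It would not cope with a compact `class Foo::Bar` definition or
with a definition split across lines.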

For this
  module Foo
  end
  class Foo::Bar
  end

The tags would be different:
  Bar -> /class Foo::Bar/
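
A rough sketch of handling this compact form, assuming the whole
definition sits on one line. `compact_class_tags` is a hypothetical
helper; the extra qualified tag (Foo.Bar) is an assumption that mirrors
the nested example, not something the tags format requires.

```ruby
# Hypothetical helper: split a "class Foo::Bar" style name into a short
# tag and a dotted qualified tag, both pointing at the original line.
def compact_class_tags(line)
  text = line.chomp
  m = /\A\s*(?:class|module)\s+((?:[A-Z]\w*::)*[A-Z]\w*)/.match(text)
  return [] unless m
  parts = m[1].split("::")
  # uniq collapses the two entries when the name has no "::" in it.
  [[parts.last, text], [parts.join("."), text]].uniq
end
```

The text paired with each tag is the full `class Foo::Bar` line, so the
search pattern differs from the nested case even though the class is the
same.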

And for
  class
    Foo
  end

Different again.

Quoting surrender_it / remove-yahoo.it, on Fri, Mar 25, 2005 at 01:49:52AM +0900:
> Sam Roberts ha scritto:
> 
> >I'm using rdoc right now, but it is a very large tool, and I would like
> >something smaller and more malleable, if possible.
> >
> 
> why not ParseTree or ripper ?

I have no idea what ripper does, but ParseTree just gives symbols; it
doesn't have enough information for me to build a regex, as above, does
it?

Making tags is an odd problem. It involves semantic analysis: when you
see class Foo, you need to know whether it is in module Bar or inside
class Joe. But to generate the tag you need access to the original text
so that you can build a regex, which is sensitive to HOW you wrote the
code, not just what the code means. Most tokenizers' goal in life is to
abstract you away from the text, so you just see a stream of syntactic
elements.
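
To make the dependence on the original text concrete, here is a small
sketch of writing one entry in the usual ctags layout (tag name, file
name, and a /^...$/ search pattern, separated by tabs). `tag_line` and
its `kind` argument are hypothetical, and the escaping covers only
backslashes and slashes, which is an assumption about what the editor
needs.

```ruby
# Hypothetical helper: build one tags-file line from a tag name, a file
# name, and the ORIGINAL source line (indentation included).
def tag_line(name, file, source_line, kind = "c")
  # Escape backslashes and the / delimiter so the text is matched as-is.
  pattern = source_line.chomp.gsub(/[\\\/]/) { |c| "\\" + c }
  "#{name}\t#{file}\t/^#{pattern}$/;\"\t#{kind}"
end
```

The same class can produce a different pattern depending on how its line
was indented or written, which is why a bare token stream is not enough.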

Rdoc is useful because it does the analysis, but it also keeps the
original text in a form that can (in some cases) be regenerated to
build regexes.

I think it's not a bad place to put tag generation, since tags as
another output format are a reasonable extension of its model.

But... it's really slow (I think it's because of how much data it keeps
in memory). It also doesn't quite give me access to everything I want. I
can hack it, but I'm balking at the chore. Adding an output formatter
was easy and standalone. Hacking its internals... that's another story.

I'm totally open to suggestions. I NEED tags to read code effectively.

I'm faster writing in Ruby than in C, but I read C code way, way, way
faster due to the tool support I have (vim+tags). (I debug C faster,
too, because I have a great debugger: gdb.) I'm not happy about this
situation.

Maybe I should suggest this as one of those weekly Ruby challenges...
Document the tags format and the goals, and let people choose. The rules
are that there are no rules: you can use any tool/library you want, even
non-Ruby, and let the best code win. If it's non-Ruby, well, that would
point out an area where Ruby could use some work.


Btw, syntax highlighting with rdoc should be easy, since it tokenizes
the input.


Cheers,
Sam