On Jan 14, 2006, at 5:14 PM, Jeff Cohen wrote:

> I'm trying to parse ruby files to find all the class definitions in the
> file. For each line in the file, I thought I could use the following to
> pull out the class name:
>
> \bclass\b(\w+)\b
>
> so then $1 would give me the class name.
>
> But it doesn't work:
>
> irb(main):001:0> line = "class Article < MyBaseClass"
> => "class Article < MyBaseClass"
> irb(main):002:0> line =~ /\bclass\b(\w+)\b/
> => nil
>
> I think I narrowed down the problem to my use of \w, but I can't
> understand why.
>
> For extra credit, anybody know how I can make sure I can ignore comments
> and quoted strings?  I want to make sure I ignore these things:
>
> if option_exists # handle class options
>
> as well as
>
> puts "Your are in a class by yourself"
>
> But those are advanced... if I can just get the first one working I'll
> be grateful!
>
> Thanks,
> Jeff
>
> -- 
> Posted via http://www.ruby-forum.com/.
>

Your \w is right. \b doesn't work the way you think it does, though.
It is a zero-width assertion: it matches a position between characters
and doesn't consume anything, i.e.:

"<-- \b is just before the 'c'
c
l
a
s
s__ \b is in between the 's' and the space
    <- space doesn't match \w, so (\w+) can't match right after the \b
A
r
t
i
c
l
e
.
.
.

So what you really want is to consume the whitespace with \s+:
line =~ /\bclass\s+(\w+)/
irb(main):007:0> line =~ /\bclass\s+(\w+)/
=> 0
irb(main):008:0> $1
=> "Article"
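
To apply that to every line of a file, something like the following
might do. This is only a rough sketch: CLASS_RE and class_names are
names I made up for illustration, and it assumes you already have the
file's lines in an array (e.g. from File.readlines).

```ruby
# Hypothetical constant and helper, not part of any library.
CLASS_RE = /\bclass\s+(\w+)/

# Return the first capture for each line that matches, skipping the rest.
def class_names(lines)
  lines.map { |l| l[CLASS_RE, 1] }.compact
end

line = "class Article < MyBaseClass"

p line =~ /\bclass\b(\w+)\b/  # original pattern: nil
p line =~ CLASS_RE            # fixed pattern: 0
p class_names([line, "  class Comment", "def foo"])
```

Note that it still gets fooled by your comment example: "if
option_exists # handle class options" yields "options". Also, \w+
stops at the first "::", so a namespaced name like Foo::Bar would come
back as just "Foo".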


As for the other questions, comments aren't SO hard:

/#.*$/ matches from a # to the end of the line. That works unless, of
course, you want to handle strings; then you have to worry about #
inside of strings. I'm not even going to begin to try to create a regex
to match quoted strings. That's all sorts of difficult, especially with
heredocs and such. I would take a look at rdoc and see if you can
manipulate it to get a list of classes for you.
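
To show what I mean about # inside of strings, here is a naive
sketch. strip_comment is a made-up helper name, and this is
deliberately the simple version, not something that understands Ruby
syntax.

```ruby
# Hypothetical helper: drop everything from the first '#' to end of line.
def strip_comment(line)
  line.sub(/#.*$/, "")
end

# Fine for a plain trailing comment:
p strip_comment("if option_exists # handle class options")
# prints "if option_exists "

# Wrong when the '#' is inside a string literal:
p strip_comment('puts "issue #42"')
# prints "puts \"issue "
```

The second call chops the string in half, and the same thing happens
with "#{...}" interpolation, which is why a per-line regex alone won't
get you all the way there.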