Tom Sawyer wrote:
> 
> i find myself spending a lot of time writing routines to parse complex
> strings. here's a mock example string of my current problem:
> 
> a<b>[c]{d}"e"f {{g}} [[h]]*i**j*"k"
> 
> and i want to parse this into an array like so:
> 
> ['a','<b>','[c]','{d}','"e"','f','{{g}}','[[h]]','*i*','*j*','"k"']
> 
> for clarity here's another example:
> 
> "a"{{b}}c{d}<e>  ->  ['"a"','{{b}}','c','{d}','<e>']
> 
> i have found such strings difficult to parse for a number of reasons.
> first, the text in double-quotations could contain bracket symbols but
> must be treated as plain text. second, [ and { brackets can be nested as
> in g and h of the first example. and finally, because whitespace can
> optionally occur between the parts.

I see that you have <, [, {, " and *. Are those the only symbols?
If so, I would use a regexp along these lines:

(([<\[{"*]*[a-z][\]>}"*]*)\s*)+

Then you could check which pieces consist only of whitespace (the ones
you want to ignore) and which are elements that you want to add to the
list, with something like:

if not re.match(r"\s", element_to_test):
    tokens.append(element_to_test)
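
A minimal sketch of that idea in Python (the name PIECE is made up for
illustration): re.findall() with the single-piece pattern returns the
pieces directly and skips the whitespace between them, so the whitespace
test is not strictly needed. It handles your second example, but because
" and * appear in both the opening and the closing class, it gets confused
when a closing symbol is directly followed by an opening " or *, as with
{d}"e" in your first example.

```python
import re

# One piece: optional opening symbols, a single lowercase letter,
# then optional closing symbols (an assumption taken from the examples).
PIECE = re.compile(r'[<\[{"*]*[a-z][\]>}"*]*')

text = '"a"{{b}}c{d}<e>'
pieces = PIECE.findall(text)   # whitespace between pieces is simply skipped
print(pieces)                  # ['"a"', '{{b}}', 'c', '{d}', '<e>']
```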

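Note that a repeated group such as (...)+ only keeps its last match, so
to get every piece you should apply the inner pattern with re.finditer().
Each match object reports its position through start() and end(), so you
can loop over all the found patterns.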
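
Here is a rough sketch of such a loop. It swaps the two character classes
for one alternative per kind of piece, which is my own assumption about
your format rather than anything you specified; TOKEN and tokenize are
made-up names. It does not check that nested brackets are balanced, so
deeper or mixed nesting would probably need a real parser.

```python
import re

# Hypothetical token pattern: one alternative per kind of piece.
TOKEN = re.compile(r'''
    "[^"]*"             # quoted text; brackets inside are left alone
  | \*[^*]*\*           # starred text
  | <[^<>]*>            # angle brackets
  | \[+[^\[\]]*\]+      # square brackets, possibly doubled
  | \{+[^{}]*\}+        # curly brackets, possibly doubled
  | [^\s<\[{"*]+        # bare text up to whitespace or an opening symbol
''', re.VERBOSE)

def tokenize(text):
    """Return every piece found in text, in order."""
    return [m.group(0) for m in TOKEN.finditer(text)]

text = 'a<b>[c]{d}"e"f {{g}} [[h]]*i**j*"k"'
tokens = tokenize(text)
print(tokens)

# Each match also knows where it starts and ends in the string.
for m in TOKEN.finditer(text):
    print(m.start(), m.end(), m.group(0))
```

On your first example this gives the list you asked for, and a quoted
piece such as "a[b" stays in one piece.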

That's how I would do it, but I'm only a newbie...

Regards,

Guille