On 12.04.2007 18:31, MenTaLguY wrote:
> On Thu, 12 Apr 2007 16:10:06 +0900, Robert Klemme <shortcutter / googlemail.com> wrote:
>>> Every NFA can be rewritten as a DFA.  Each state in the resulting DFA
>>> corresponds to a superposition of states in the original NFA
>> While this is true in theory there is a number of modern RX features
>> that are extremely difficult or impossible to do with a DFA.  So as it
>> stands now for pratical purposes most modern RX engines are NFA or
>> hybrids,
> 
> It's important not to confuse NFAs with backtracking.  For instance, Thompson's algorithm uses NFAs, but it evaluates by rewriting them on-the-fly as DFAs, not by performing a backtracking search.  Again, NFAs and DFAs are equivalent, even if the naive algorithm (backtracking) for evaluating an NFA is less efficient than the naive algorithm for evaluating a DFA.
> 
> On the other hand, those modern features which really do require backtracking (surprisingly few) also exceed the capabilities of finite automata (i.e. those features correspond to non-regular languages), so at that point the regular expression engines cannot be said to be using FAs at all, but a more general type of automaton.

Strinctly speaking yes.  But AFAIK the basis is usually a NFA.

Btw, because of the nature of the engine there are some more differences 
between DFA and NFA, for example typically order of expressions in an 
alternative matters with a NFA engine.  I prefer that often, because it 
allows to control the way the RX engine matches to a certain extent, 
i.e. you can prioritize alternatives which can be pretty handy at times.

> Again, though we must obviously keep backtracking in reserve for those patterns that need it, it is very wasteful not to use a pure FA approach for those patterns where non-regular features aren't being used.
> 
>> sometimes (Perl) with heavy optimizations to prevent certain
>> ugly effects (see "Mastering Regular Expressions" for details).
> 
> "Mastering Regular Expressions" is very good for learning the pragmatics of working with contemporary regular expression implementations, but it is not a good source for theory.

Since I had the theory at university and at the moment I'm more 
interested in using them than writing them this is good enough for me. :-)

Kind regards

	robert