benjohn / fysh.org wrote:

>> benjohn / fysh.org wrote:
>>
>> / ...
>>
>>> thanks for the reply. I know I can do this, but it means that the
>>> substitution ("\\1\\2\\3") has to be aware of the composition of the
>>> regular expression.
>>
>> Yes, that is true for all regular expressions.
>>
>>> The Regexp is no longer a neat little machine that
>>> only grabs things to replace. It's now grabbing the packaging around
>>> the thing to replace too, so you've got to be aware of this in writing
>>> the substitution.
>>
>> Yes, but this cannot be avoided. You have two choices for examined text
>> that surrounds the area to be modified -- you can capture it while
>> examining it, and use the captured text in the replacement, or you can
>> use a zero-width lookahead, which examines text without consuming it:
>>
>> (?=non-captured text)
> 
> I think this may be what I should use. Also, the suggestion of using word
> edge tokens works for the specific case.
> 
>>
>> But the two alternatives work much the same way -- they examine text
>> that is preserved as part of the overall regular expression. All that
>> changes is /how/ the text is preserved.
>>
>> So, to move ahead, please post a specific example of what you need. Post
>> an example of the original string and the desired replacement.
> 
> :) Well, I have a solution for the specific case. That's not what I'm
> getting at though. I'm trying to find out if regexps allow me to do
> something more general. I want to do this (sorry, I don't have a ruby to
> hand):
> 
> class CodeFragment
>   attr_accessor :code_fragment
> 
>   def variables_regexp
>     /\b[xyz]\b/
>   end
> 
>   def utilised_variables
>     code_fragment.scan(variables_regexp).uniq.sort
>   end
> 
>   def output_substitution(substitutes)
>     code_fragment.gsub(variables_regexp) do |v|
>       substitutes[v].to_s  # v is the whole matched string
>     end
>   end
> end
> 
> cf = CodeFragment.new
> cf.code_fragment = "sin(x+y)"
> puts cf.output_substitution({'x'=>1, 'y'=>2})
> 
> should give "sin(1+2)"
> 
> What I want is for the thing that provides the regular expression to not
> need to know about the function that is using it; and for the function
> that uses the regular expression to not know about the expression
> provided.
> 
>> regular
>> expression. It /is/ possible to take a first step by posting an example
>> of original text, and replacement text. Maybe we should try that.
> 
> Thank you for your help here.
> 
> I'm not trying to solve a single problem though, I'm trying to
> understand what kinds of problems I can solve.
> 
> I want something that acts as an abstract machine for finding things in
> a string (in this case variables, but the rules could be more complex).
> One should be able to use this machine without knowing what it finds, or
> how it finds. All I should need to know is that it finds things. I'm
> trying to understand if regexps are able to do this - to provide this
> separation. Perhaps they don't, which is fine. I'd just like to know if
> they do or not, or if they do a bit, how much.

Again, your prose description is not precise enough for a reader to know
exactly what you want, which is why we have such things as computer
languages and mathematics. But one can offer educated guesses.

Here is a function that doesn't know in advance what will be sought; it
simply and blindly carries out a certain kind of filtering based on
caller-provided strings:

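# Return the text found between each <tag> ... </tag> pair in data.
def get_text_between_tags(data, tag)
  t = Regexp.escape(tag)  # treat the tag as literal text, not as a pattern
  # scan yields one array per match when the pattern has a group; flatten it
  data.scan(%r{<#{t}>(.*?)</#{t}>}m).flatten
end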

If I call this function with a set of HTML data in "data" (containing an
HTML page) and a tag string like "td", this function will return an array
containing the text between each pair of <td> ... </td> tags in the data
string.
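To make that call concrete, here is a minimal sketch. The name
text_between_tags and the sample HTML are made up for illustration, and
flatten is used so that the result is a flat array of strings:

```ruby
# Hypothetical variant for illustration: flatten turns scan's one-element
# arrays (one per match, because of the capture group) into plain strings.
def text_between_tags(data, tag)
  data.scan(%r{<#{tag}>(.*?)</#{tag}>}).flatten
end

# Made-up sample data.
html = "<tr><td>alpha</td><td>beta</td></tr>"

p text_between_tags(html, "td")   # prints ["alpha", "beta"]
p text_between_tags(html, "tr")   # prints ["<td>alpha</td><td>beta</td>"]
```

The same function body serves both calls; only the caller-provided tag
changes.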

Note that this function will accept any data string whatsoever, and it will
also accept any search tag whatsoever.

Is this what you mean? Can you extrapolate this way of approaching the
problem to solve your own?
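
As one possible extrapolation, here is a sketch of the separation you
describe, under an assumption: the supplied regexp must match exactly the
thing to be replaced, and any surrounding "packaging" is examined only with
zero-width assertions (\b, lookahead, lookbehind). The names
substitute_matches, BARE_VARIABLES and BRACED_VARIABLES are hypothetical, and
lookbehind needs Ruby 1.9 or later:

```ruby
# Hypothetical consumer: it knows only that `finder` matches "things to
# replace", not what they look like or what surrounds them.
def substitute_matches(text, finder, substitutes)
  # The block receives the whole match; names without a substitute are kept.
  text.gsub(finder) { |name| substitutes.fetch(name, name).to_s }
end

# Two interchangeable finders (both are illustrative assumptions).
BARE_VARIABLES   = /\b[xyz]\b/          # word boundaries are zero-width
BRACED_VARIABLES = /(?<=\{)\w+(?=\})/   # braces are examined, not matched

puts substitute_matches("sin(x+y)", BARE_VARIABLES, "x" => 1, "y" => 2)
# prints sin(1+2)
puts substitute_matches("sin({a}+{b})", BRACED_VARIABLES, "a" => 1, "b" => 2)
# prints sin({1}+{2})
```

The consumer never mentions groups or back-references, so either finder can
be swapped in. The limit of this approach is that the packaging is always
preserved; if a rule must also delete or rewrite the surrounding text, the
substitution has to know about the groups again.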

-- 
Paul Lutus
http://www.arachnoid.com