George George wrote:
> Hello,
>
> Given a string,
> "gcgggcgggccgcggaaattacta" it may so happen that the gcgg substring is
> very important and it is repeated in strings of this nature.
>
> I would like to replace all occurrences of "gcgg" with a new symbol, call
> it B, such that I can reduce the initial text to BggcgggccBaaattacta.
>
> While I know it is possible to accomplish that using regular expressions
> and substitution, my wider aim is to develop a grammar for such a
> string, using a finite alphabet, for example {A,G,C,T} in this case, where
> A,G,C,T are terminal symbols and B is a nonterminal symbol. Ideally I
> would like to have a set of production rules for generating such
> strings.
>
> How easy or relevant is it to use Ruby for deriving stochastic grammars
> and linguistics? Are there existing libraries for doing such things in
> Ruby?
>
> I am new to the world of grammar and stuff, so please bear with me.
>
> Thank you.
>
> George

Well, I don't know a whole lot about stochastic grammars in Ruby, but
assuming your choice of terminal symbols was not wholly coincidental,
you may find BioRuby [1] helpful.

If nucleotides really were just an arbitrary example and you're actually
looking for something broader, then someone else here may have a
suggestion.

[1] BioRuby: http://bioruby.org/

-- 
Posted via http://www.ruby-forum.com/.
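
For the plain substitution step, here is a minimal sketch in core Ruby (no BioRuby needed). The names RULES, compress and expand are made up for illustration and are not part of any library. It treats B => gcgg as a single production rule, rewrites the sequence with it, and then expands it back. Note that for the string exactly as quoted, a left-to-right replacement finds three non-overlapping occurrences of "gcgg", so the result differs from the reduced form given in the question.

```ruby
# Hypothetical rule table: nonterminal symbol => the terminal string it produces.
RULES = { "B" => "gcgg" }.freeze

# Replace every non-overlapping occurrence of each expansion with its symbol.
def compress(sequence, rules = RULES)
  rules.reduce(sequence) do |text, (symbol, expansion)|
    text.gsub(expansion, symbol)
  end
end

# Apply the production rules in the generating direction: symbol -> expansion.
def expand(text, rules = RULES)
  text.gsub(Regexp.union(rules.keys)) { |symbol| rules[symbol] }
end

sequence = "gcgggcgggccgcggaaattacta"
reduced  = compress(sequence)

puts reduced                      # prints "BBgccBaaattacta"
puts expand(reduced) == sequence  # prints "true"
```

This only covers a fixed, hand-written rule. Inferring the rules from data, or attaching probabilities to them, is a separate problem that this sketch does not attempt.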