Hello,

This question is probably best suited to ruby-talk (to which I am forwarding this email).

Anyway, one way to deduplicate is to use Regexp interpolation:

  NUMBER = /(?:\d+\.\d+)/
  LONG_REGEXP = /^#{NUMBER}: I:\d+\s+\(\s+#{NUMBER}.../

On Thu, Jun 21, 2018 at 2:45 PM, Peter Booth <peter_booth / me.com> wrote:
> Hello,
>
> I have some data that I would like to parse with a regex. The raw data
> looks like so:
>
> 9.028: I:4551 ( 0.095 0.096 0.136 ) T:4551 ( 0.095 0.096 0.098 0.117 0.136 0.136 )
> 14.066: I:4601 ( 0.095 0.096 5.344 ) T:9152 ( 0.095 0.096 0.098 0.119 4.352 5.344 )
> 19.099: I:4609 ( 0.094 0.096 0.132 ) T:13761 ( 0.094 0.096 0.098 0.123 4.352 5.344 )
> 24.033: I:4528 ( 0.093 0.095 0.130 ) T:18289 ( 0.094 0.096 0.098 0.124 3.344 5.344 )
>
> I can extract the data that I want with the following:
>
> ^(\d+\.\d+): I:\d+\s+\(\s+(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s+\).+\(\s+(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s+\)$
>
> I can even name the fields with the longer:
>
> ^(?<offset_secs>\d+\.\d+): I:\d+\s+\(\s+(?<median_stall>\d+\.\d+)\s+(?<p90_stall>\d+\.\d+)\s+(?<max_stall>\d+\.\d+)\s+\).+\(\s+(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s+\)$
>
> But that's a long, ugly regex that repeats the {capture+whitespace} element
> *(\d+\.\d+)\s+* three times, then six times.
>
> Is there a Ruby regex way to simplify/clarify the regex so that it
> explicitly shows that the capture+whitespace is repeated three times, then
> six?
>
> thanks,
>
> Peter
>
> Unsubscribe: <mailto:ruby-core-request / ruby-lang.org?subject=unsubscribe>
> <http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>
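
To make the interpolation idea concrete, here is a minimal sketch, not a definitive implementation. It spells the repetition out as {3} and {6} and then splits each run of numbers, rather than capturing every number separately. The names FIELD, LINE, parse_stall_line, and the group names stalls and totals are my own assumptions for illustration; offset_secs, median_stall, p90_stall, and max_stall come from the original message.

```ruby
# Building blocks, composed through Regexp interpolation.
NUMBER = /\d+\.\d+/        # a decimal such as 0.095
FIELD  = /#{NUMBER}\s+/    # one number plus its trailing whitespace (hypothetical name)

# The repetition counts are explicit: three fields, then six.
# Group names "stalls" and "totals" are assumptions, not from the original post.
LINE = /^(?<offset_secs>#{NUMBER}):\s+I:\d+\s+\(\s+(?<stalls>#{FIELD}{3})\).+?\(\s+(?<totals>#{FIELD}{6})\)$/

# Hypothetical helper: returns a Hash for a matching line, or nil otherwise.
def parse_stall_line(line)
  m = LINE.match(line)
  return nil unless m

  # Each group holds a whitespace-separated run of numbers; split it apart.
  stalls = m[:stalls].split.map(&:to_f)
  totals = m[:totals].split.map(&:to_f)

  {
    offset_secs: m[:offset_secs].to_f,
    stalls: %i[median_stall p90_stall max_stall].zip(stalls).to_h,
    totals: totals
  }
end

sample = "9.028: I:4551 ( 0.095 0.096 0.136 ) T:4551 ( 0.095 0.096 0.098 0.117 0.136 0.136 )"
p parse_stall_line(sample)
```

A quantified capture group only keeps its last repetition, which is why the sketch captures each whole run and splits it afterwards. Also note that in Ruby, once a pattern contains named groups, plain parentheses no longer capture, so the six unnamed groups in the longer regex quoted above would not appear in the match data.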