Hello,

This question is probably better suited to ruby-talk, so I am forwarding
this email there.

Anyway, one way to deduplicate is to use Regexp interpolation:

NUMBER = /(?:\d+\.\d+)/
LONG_REGEXP = /^#{NUMBER}: I:\d+\s+\(\s+#{NUMBER}.../
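
To also make the "three times, then six times" repetition explicit, the
repeated element can be built as a string and multiplied before it is
interpolated. This is only a sketch: the names REPEATED and LINE are made up
for illustration, and the pattern is my reading of the regex in the quoted
message below.

```ruby
NUMBER = /\d+\.\d+/

# One repeated element: leading whitespace followed by a captured number.
# Built as a String so it can be repeated with String#* before interpolation.
REPEATED = "\\s+(#{NUMBER.source})"

# Offset, then three captured numbers in the I:(...) group,
# then six captured numbers in the T:(...) group.
LINE = /^(#{NUMBER}): I:\d+\s+\(#{REPEATED * 3}\s+\).+\(#{REPEATED * 6}\s+\)$/

line = "9.028: I:4551 (   0.095   0.096   0.136 ) " \
       "T:4551 (   0.095   0.096   0.098   0.117   0.136   0.136 )"

m = LINE.match(line)
p m.captures.first(4)  # => ["9.028", "0.095", "0.096", "0.136"]
```

Interpolating a Regexp object (as with NUMBER) wraps it in a non-capturing
group, while interpolating a String inserts its text as-is, which is why the
capturing parentheses live in the String here.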

On Thu, Jun 21, 2018 at 2:45 PM, Peter Booth <peter_booth / me.com> wrote:

> Hello,
>
> I have some data that I would like to parse with a regex. The raw data
> looks like so:
>
> 9.028: I:4551 (   0.095   0.096   0.136 ) T:4551 (   0.095   0.096   0.098   0.117   0.136   0.136 )
> 14.066: I:4601 (   0.095   0.096   5.344 ) T:9152 (   0.095   0.096   0.098   0.119   4.352   5.344 )
> 19.099: I:4609 (   0.094   0.096   0.132 ) T:13761 (   0.094   0.096   0.098   0.123   4.352   5.344 )
> 24.033: I:4528 (   0.093   0.095   0.130 ) T:18289 (   0.094   0.096   0.098   0.124   3.344   5.344 )
>
> I can extract the data that I want with the following:
>
> ^(\d+\.\d+): I:\d+\s+\(\s+(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s+\).+\(\s+(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s+\)$
>
> I can even name the fields with the longer:
>
> ^(?<offset_secs>\d+\.\d+): I:\d+\s+\(\s+(?<median_stall>\d+\.\d+)\s+(?<p90_stall>\d+\.\d+)\s+(?<max_stall>\d+\.\d+)\s+\).+\(\s+(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s+\)$
>
> But that's a long, ugly regex that repeats the {capture+whitespace} element
> (\d+\.\d+)\s+ three times, then six times.
>
> Is there a Ruby regex way to simplify/clarify the regex so that it
> explicitly shows that the capture+whitespace is repeated three times, then
> six?
>
>
>
>
> thanks,
>
> Peter
>
>
>
>
> Unsubscribe: <mailto:ruby-core-request / ruby-lang.org?subject=unsubscribe>
> <http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>
>
>