On Dec 27, 7:17 am, Esmail <ebonak_de... / hotmail.com> wrote:
> Hi Jordan,
>
> I didn't know about each_with_index until after I posted my last
> message and read more on Ruby .. clearly I have to do more reading,
> but I have found one of the best ways to learn is to do :-)
>
> > There's no built-in way that I'm aware of. You have to iterate over
> > the array yourself. If you want all the indices you could do something
> > like...
>
> > indices = []
> > ['aaaa', '>bbbb', '>cccc'].each_with_index { | e, i |
> >   indices << i if e =~ /^>/
> > }
> > p indices # => [1, 2]
>
> > But given the description of what you're trying to do in the other
> > thread, you probably just want to use Array#reject...
>
> > a = ['aaaa', '>bbbb', 'cccc'].reject { | e | e =~ /^>/ }
> > p a # => ["aaaa", "cccc"]
>
> This would delete only the one element, but I am trying to delete a range
> of data (a record). I may have duplicate records, so I am trying to get
> rid of them. They have different identifiers, each starting with a '>'.
> Here's a test file that mimics this:
>
>  >88888/Bla08/the/rest8
> 888888888888888
> 888888888888888
> 888888888888888
> 888888888888888
> 888888888888888
> 88888 -- last line --
>  >77777/Bla07/the/rest7
> 777777777777777
> 777777777777777
> 777777777777777
> 777777777777777
> 777777777777777
> 77777 -- last line --
>  >66666/Bla06/the/rest6
> 666666666666666
> 666666666666666
> 666666666666666
> 666666666666666
> 666666666666666
> 66666 -- last line --
>  >77777/Bla07/the/rest7
> 777777777777777
> 777777777777777
> 777777777777777
> 777777777777777
> 777777777777777
> 77777 -- last line --
>  >
>
> (I add the last > and later remove it)
>
> So, this is what I came up with (with suggestions from you):
>
> ######################################
> # delete duplicate records
> ######################################
> def deleteDuplicates(data, dups)
>
>    dups.each do |name|
>      puts "\n****deleting duplicate \"#{name}\"...\n"
>      s = data.index(name)
>      e = 0
>      data[s+1..-1].each_with_index{ |v, i|
>        if v =~ /^>/
>          e = i
>          break
>        end
>      }
>
>      puts "deleting ... ", data[s..s+e], "..done"
>      data.slice!(s..s+e)
>    end
>
>    data
> end
> ######################################
>
> What do you think? It seems to work, but I'm always interested in
> learning to do things better.
>
> Thanks again!
>
> Esmail

Hi Esmail,

A couple of points:

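- It's not very efficient: for every duplicate you scan the array with
#index, copy its tail with data[s+1..-1], and then #slice! has to shift
the remaining elements.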

- The regexp /^>/ won't match: #each and #each_with_index hand you whole
lines, and the header lines in your test file start with a space
(v == " >..."), so /^ >/ would be needed.

- Array#index returns nil when nothing matches, so s+1 raises a
NoMethodError in that case.
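
To make the last two points concrete, here is a minimal sketch. The sample array and the missing header name are made up for illustration, and /^\s*>/ is just one possible way to tolerate the leading space:

```ruby
# Hypothetical sample data; header lines start with " >" as in the test file.
data = [
  " >77777/Bla07/the/rest7",
  "777777777777777",
  " >66666/Bla06/the/rest6",
  "666666666666666"
]

# /^>/ cannot match a line that begins with a space;
# /^\s*>/ allows optional leading whitespace before the '>'.
strict  = data.select { |line| line =~ /^>/ }
lenient = data.select { |line| line =~ /^\s*>/ }

# Array#index returns nil when no element matches, so check the result
# before doing arithmetic such as s + 1 on it.
s = data.index(" >99999/not/in/the/data")
missing_handled = s.nil?

p strict.size, lenient.size, missing_handled
```

With a guard like this, a name that isn't in the data can simply be skipped instead of raising.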

How about using Array#uniq, as in:

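# Split the file into records on the " >" header marker, drop exact
# duplicate records, and put the marker back between the survivors.
def no_dups(path)
  IO.read(path).split(" >").uniq.join(" >")
end
fixed = no_dups("testfile")
puts fixed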

# =>
 >88888/Bla08/the/rest8
888888888888888
888888888888888
888888888888888
888888888888888
888888888888888
88888 -- last line --
 >77777/Bla07/the/rest7
777777777777777
777777777777777
777777777777777
777777777777777
777777777777777
77777 -- last line --
 >66666/Bla06/the/rest6
666666666666666
666666666666666
666666666666666
666666666666666
666666666666666
66666 -- last line --
 >
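
If it helps to see what each step of that chain does, here is the same idea on a small in-memory string. The sample text is invented and much shorter than your file; the intermediate variable names are only for illustration:

```ruby
# Hypothetical stand-in for the contents of "testfile":
# three records, the first and third identical, plus the trailing " >".
text = " >a/1\nAAA\n >b/2\nBBB\n >a/1\nAAA\n >\n"

# Cut the text at every " >" marker. The piece before the first marker
# is an empty string, and the piece after the last marker is "\n".
records = text.split(" >")
# => ["", "a/1\nAAA\n", "b/2\nBBB\n", "a/1\nAAA\n", "\n"]

# Keep only the first occurrence of each record (header and body together).
unique = records.uniq
# => ["", "a/1\nAAA\n", "b/2\nBBB\n", "\n"]

# Put the markers back between the pieces.
result = unique.join(" >")
# => " >a/1\nAAA\n >b/2\nBBB\n >\n"

puts result
```

Two things to keep in mind: #uniq compares whole records, so two records with the same header but different bodies would both be kept, and a " >" occurring inside a record body would be treated as a record boundary.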

Regards,
Jordan