On Thu, Jul 29, 2010 at 1:43 PM, Junhui Liao <junhui.liao / uclouvain.be> wrote:
> Dear all,
>
> My script tried to read from one original tsv file and distribute into
> new multiple tsv files.
>
> Each line of the original file is like this: time_1, signal_1, time_2,
> signal_2... time_4096, signal_4096.
> I would like to write them into file_1, file_2, ... file_4096
> accordingly, and these files contain time_1, signal_1; time_2, signal_2;
> ... time_4096, signal_4096 separately.
>
> My script did well only if the original file contains ONE line.
> If the original file has two or more lines, the error message like
> following,
>
> new_split.rb:17: undefined method `+' for nil:NilClass (NoMethodError)

> And my script is like this:
>
>  @a = []

You don't need to declare this, because you assign directly to @a
again later.

>  @itemnum = 4096
>  @counter = 0
>  @linenum = 10

By the way, you probably don't need instance variables; local
variables should suffice. itemnum looks like a constant and linenum
is never used, so:

ITEM_NUM = 4096
counter = 0

> #File.open("../original_data/test_2lines.tsv").each_line do |record|  "^M"
> File.open("../original_data/one_line.tsv").each_line do
> |record|
>  @a = record.chomp.split("\t")

a = record.chomp.split("\t") # although fields or line_fields
                             # might be better names than a

> @itemnum.times do |n|
>
>  File.open("#{n}_debug_split"+".tsv" , "w") do |f|
>  puts @counter
>  puts @a[@counter].inspect + "\n"
>  puts @a[@counter+1].inspect + "\n"
>  f << @a[@counter] + "\t" + @a[@counter+1] + "\n"
>  @counter += 2
>end
>  end
>         end

You add 2 to the counter on every iteration, but you never reset it
after each line. So when the second line starts, counter is still 8192
(4096 iterations of += 2), and @a[@counter] indexes past the end of the
array. That returns nil, and calling the + method on nil raises the
NoMethodError. I think you are complicating the issue with the manual
counting; Ruby's iterators are usually a cleaner way to traverse lists
of things. You can drop itemnum, counter and so on like this
(untested):

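# "a" appends; "w" would truncate each file on every input line, so
# only the last line's pair would survive. Remove old output files
# before re-running.
File.foreach("../original_data/test_2lines.tsv") do |record|
  a = record.chomp.split("\t")
  a.each_slice(2).with_index do |(time, signal), index|
    File.open("#{index}_debug_split.tsv", "a") do |f|
      f << "#{time}\t#{signal}\n"
    end
  end
end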
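
To see both the failure and the iterator approach in miniature, here
is a small sketch that uses two (time, signal) pairs per line instead
of 4096. The sample data and the names lines, rows and pairs are made
up for illustration; nothing is written to disk.

```ruby
# Two input lines, each holding 2 (time, signal) pairs (4 fields).
lines = ["t1\ts1\tt2\ts2", "t3\ts3\tt4\ts4"]

# Original pattern: the counter is never reset between lines.
counter = 0
rows = []
error = nil
begin
  lines.each do |record|
    a = record.split("\t")
    2.times do
      # On the second line counter is already 4, so a[4] is nil
      # and nil + "\t" raises NoMethodError.
      rows << a[counter] + "\t" + a[counter + 1]
      counter += 2
    end
  end
rescue NoMethodError => e
  error = e
end
puts error.class

# Iterator pattern: each_slice(2) hands out the pairs and with_index
# numbers them from 0 again on every line, so there is no counter.
pairs = lines.map do |record|
  record.split("\t").each_slice(2).with_index.map do |(time, signal), index|
    [index, time, signal]
  end
end
p pairs
```

The first loop stops on the second line exactly as your script does;
the second one never touches an index outside the current line.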

The each_slice version above does open and close all 4096 files for
every input line, though. Are there many lines? If not, you could read
the whole file, build a structure in memory (a hash of arrays) holding
the rows that belong to each output file, and then write each file in
one go.
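
A minimal sketch of that in-memory idea, assuming the whole input fits
comfortably in memory. The method name split_columns, its output_dir
parameter and the temporary-directory usage are my own placeholders,
and I have not run it against your real data.

```ruby
require "tmpdir"

# Hypothetical helper: group the (time, signal) pairs by column index
# in a hash of arrays, then open each output file exactly once.
def split_columns(input_path, output_dir)
  rows_by_index = Hash.new { |hash, key| hash[key] = [] }

  File.foreach(input_path) do |record|
    record.chomp.split("\t").each_slice(2).with_index do |(time, signal), index|
      rows_by_index[index] << "#{time}\t#{signal}\n"
    end
  end

  rows_by_index.each do |index, rows|
    File.open(File.join(output_dir, "#{index}_debug_split.tsv"), "w") do |f|
      f << rows.join
    end
  end
  rows_by_index.size # number of output files written
end

# Usage with a tiny two-line input in a throwaway directory.
Dir.mktmpdir do |dir|
  input = File.join(dir, "input.tsv")
  File.write(input, "t1\ts1\tt2\ts2\nt3\ts3\tt4\ts4\n")
  split_columns(input, dir)
  puts File.read(File.join(dir, "0_debug_split.tsv"))
end
```

Since every file is written once with all of its rows, mode "w" is
safe here and re-running the script simply replaces the old output.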

Jesus.