On 8/24/06, Peter Bailey <pbailey / bna.com> wrote:
> Thanks, Jan. My data originally came from a mainframe ASCII export, so,
> it looks like a 2D table, delineated with spacebands. I've tweaked it
> now so it's just using tabs, and I've even imported it into a
> spreadsheet. I guess it's much more like a CSV format now than a YAML
> format.
>
> I have one question to your response. You say that I could store the
> data in a hash keyed by a filename, which I have done in the past, but,
> then you say that the rest of the columns, of which there are many,
> could be the key's value. How can you have multiple entries for a key
> value? A hash is only a "2-column" entity, isn't it, one key, one value?

Right.

> Do I just make all the cells in a row one value, with a comma or
> something between them as a way to distinguish each cell? In other
> words, use the hash to match the incoming filename, then, make the row
> that it's in an array and take it from there? (I'm thinking out loud
> here.)

Exactly. I was thinking of transforming

filename1, data11, data12, data13
filename2, data21, data22, data23
filename3, data31, data32, data33

into
{
      filename1 => [ data11, data12, data13 ],
      filename2 => [ data21, data22, data23 ],
      filename3 => [ data31, data32, data33 ]
}
(quotes omitted)

which in YAML looks like:

filename1:
  - data11
  - data12
  - data13
filename2:
  - data21

Otherwise your YAML would look like this (an array of arrays):
-
  - filename1
  - data11
  - data12
  - data13
-
  - filename2
  - data21

etc...
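
Here is a minimal sketch of both layouts in Ruby. It assumes the data is tab-separated with the filename in the first column; the sample text and the variable names (text, rows, table) are made up for illustration.

```ruby
require 'yaml'

# Hypothetical tab-separated input standing in for the exported table.
text = "filename1\tdata11\tdata12\tdata13\n" \
       "filename2\tdata21\tdata22\tdata23\n" \
       "filename3\tdata31\tdata32\tdata33\n"

# Array of arrays: one inner array per row, filename included.
rows = text.lines.map { |line| line.chomp.split("\t") }

# Hash keyed by filename: the first cell is the key,
# the remaining cells become the value (an array).
table = {}
rows.each do |filename, *cells|
  table[filename] = cells
end

puts table.to_yaml   # the "filename1:" mapping layout
puts rows.to_yaml    # the array-of-arrays layout

p table["filename2"] # => ["data21", "data22", "data23"]
```

With the hash you go straight from a filename to its row; with the array of arrays you would have to search for the row first.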

The problem here is when you need to look up by some dataXY value: that's slower, because every row has to be searched.
NB: Those filenames have to be unique, obviously.
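
To make those two points concrete, here is a rough sketch. The sample data is invented, and the reverse index is only one possible way to speed up lookups by value, not something you necessarily need.

```ruby
# Hypothetical sample data; "data13" appears in two rows on purpose.
table = {
  "filename1" => ["data11", "data12", "data13"],
  "filename2" => ["data21", "data22", "data23"],
  "filename3" => ["data31", "data32", "data13"]
}

# Lookup by filename is a single hash access.
row = table["filename2"]

# Lookup by a cell value has to scan every row.
matches = table.keys.select { |name| table[name].include?("data13") }

# If you do that often, build a reverse index once: cell value => filenames.
index = Hash.new { |hash, key| hash[key] = [] }
table.each do |name, cells|
  cells.each { |cell| index[cell] << name }
end

# Uniqueness: check the filename column before building the hash,
# because a repeated key would silently overwrite the earlier row.
names = ["filename1", "filename2", "filename1"]
dupes = names.select { |name| names.count(name) > 1 }.uniq

p matches          # => ["filename1", "filename3"]
p index["data13"]  # => ["filename1", "filename3"]
p dupes            # => ["filename1"]
```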