Igal Koshevoy wrote:
> Janus Bor wrote:
>> No, I didn't try it and it might actually work: Every sequence has a
>> size of ~1kb, so 50 000 sequences would probably be around 50mb. But
>> getting all this data will take hours, so I need to implement a system
>> that will not lose all data if the program is terminated abnormally.
>>
> Here are some simple alternatives for persisting and retrieving your
> data in the order I'd recommend them based on what you've described so far:
>
> 1. PStore standard library: Put your objects into a magical hash, that's
> automatically persisted to a file. Probably the quickest and easiest
> solution. See
> http://www.ruby-doc.org/stdlib/libdoc/pstore/rdoc/classes/PStore.html

PStore rewrites the whole file on every commit rather than appending
incrementally. Not really what the OP is looking for, IMO.

> 2. Lightweight SQL database: Maybe store sequences in SQLite as BLOBs.
> Probably the best long-term solution, but will require you to work
> harder to transform data to and from storage. See
> http://sqlite-ruby.rubyforge.org/

It's not clear that this would be better than plain files. Maybe so, if
the individual strings are short. It would be interesting to get some
benchmarks on this question.

> 3. Marshall core class: Dump objects to and from strings, and then
> files. Useful if you need something more than PStore, but still want to
> persist objects directly. See http://ruby-doc.org/core/classes/Marshal.html

PStore uses Marshal, so it's odd to say that Marshal is more than
PStore.

If you're looking for a way to manage marshalled (or string or yaml...)
data in multiple files, using file paths as db keys, look no further than:

http://raa.ruby-lang.org/project/fsdb/

I think the Set/Hash + many files option is best here, though.

--
vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407
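
For illustration, here is a rough sketch of what the "many files" option could look like: one file per sequence, with the sequence id as the file name, so an abnormal exit loses at most the sequence being written. The class name SequenceStore and its methods are made up for this example, and it assumes the ids are safe to use as file names. Nothing here is FSDB's API.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper: stores each sequence in its own file, keyed by id.
# Assumes ids contain only characters that are valid in a file name.
class SequenceStore
  def initialize(dir)
    @dir = dir
    FileUtils.mkdir_p(@dir)
  end

  # True if an earlier (possibly interrupted) run already saved this id.
  def saved?(id)
    File.exist?(path_for(id))
  end

  # Write to a temp file first, then rename it into place, so a crash
  # mid-write never leaves a truncated file under the final name.
  def save(id, data)
    tmp = path_for(id) + ".tmp"
    File.open(tmp, "wb") { |f| f.write(data) }
    File.rename(tmp, path_for(id))
  end

  def load(id)
    File.binread(path_for(id))
  end

  # Ids of all completely written sequences (leftover temp files are skipped).
  def ids
    Dir.children(@dir).reject { |name| name.end_with?(".tmp") }.sort
  end

  private

  def path_for(id)
    File.join(@dir, id.to_s)
  end
end

# Usage: skip anything already on disk, so a restarted run resumes
# where the previous one stopped.
store = SequenceStore.new(Dir.mktmpdir("sequences"))
%w[seq1 seq2 seq3].each do |id|
  next if store.saved?(id)
  store.save(id, "ACGT" * 250) # stand-in for the slow fetch of ~1kb of data
end
```

The directory listing plays the role of the Set of known ids, so no separate index file has to be kept in sync. With 50 000 files in one directory, some filesystems slow down; splitting into subdirectories by id prefix would be one way around that, though whether it is needed depends on the filesystem.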