On Mon, 23 Oct 2006, Francis Cianfrocca wrote:
> On 10/22/06, snacktime <snacktime / gmail.com> wrote:
>> 
>> I've tested out a couple of ways of storing a queue structure and
>> wondered if others had some better ideas.  I tried a berkeleydb based
>> queue using it's native queue structure, which gives the best
>> performance so far but it's for unix only.  I also have a file based
>> queue where every message is just stored in a sequentially numbered
>> file.  Almost as fast as berkeleydb and works anywhere.  Just for
>
> Kirk Haines was involved in a thread here a couple of months back in which
> he made the case that it's perfectly fast and perfectly scalable to just use
> the filesystem for moderate requirements. His scheme used filenames that
> were actually hashes of the contents, and as I recall he had a mechanism for
> automatically generating a directory hierarchy based on the content hashes
> to keep the inode sizes under control.

Francis remembers right.  I did this for, essentially, a replacement for 
pstore that didn't need to read an index into memory for each transaction, 
so it'd have flat memory usage and speed regarless of the size of number 
of elements in the store.  It works pretty well.

In my case I create a hash of the storage key and use that as part of the 
filename, and then I had the code make subdirectories based on the 
starting letters of the hash in order to avoid having too many files in 
any one directory.  My default was to take a chunk of two characters from 
the beginning of the hash as subdirectory name, twice.

So a hash starting with 'abd45cc' would have it's file in ab/d4/

The speed of the approach is tied very directly to the speed of the IO 
system, but it's relatively fast and very portable so long as your file 
naming scheme is itself portable.


Kirk Haines