On 10/26/07, Ruby Quiz <james / grayproductions.net> wrote: > The task is to implement data structures for efficiently editing text. One > common data structure is a gap buffer: > > http://en.wikipedia.org/wiki/Gap_buffer > > Other options to consider include: ropes (see quiz #137), linked lists, simple > strings, or a multi-level combination of data structures (i.e. for lines vs. > characters in a line). There are many data structures that may work efficiently > with simple editing operations, but not all of those data structures will work > well for more complex functionality. Since I haven't seen any solutions yet, I'd thought I'd elaborate on some possible solutions I was thinking of: * conventional gap buffer implemented with a string. The string would have 3 parts (a couple indices might be used for the markers): before the cursor, the "gap" (containing garbage), and after the cursor. When you have no more space in the gap, copy the data to a new string using a larger gap. Think about how the gap should be sized. I believe most text editors use a gap buffer. * gap buffer implemented with two strings. Represent the text before the cursor with one string and the text after the cursor with another string that is reversed. All operations around the cursor correspond to operations at the ends of these strings - O(1). Unlike the conventional gap buffer, you don't have to deal with reallocation (gap filing up). Instead let ruby do its string reallocation when a string's capacity fills up. The memory efficiency might not be as good as the conventional gap buffer, though. * gap buffer with deferred gap movement. The "gap" is really only useful when making edits (insertions/deletions). For simple cursor movements, you don't have to "move" the gap immediately. The cursor position could be changed independently of the gap. But, you'll have to synchronize the gap to the cursor position once an edit is needed. Random access cursor movement can also be done in O(1) time, but you'll still have to pay for it if you immediately make an edit at that new position. * use a rope implementation (quiz #137). You could start with the inefficient string solution I gave earlier and operate on a rope instead (making new ones if using an immutable implementation). It may be difficult to make the solution as fast as the gap buffers for simple stuff (likely slower by O(log(n))), but ropes will probably be more powerful for complex functionality. I believe ropes were invented for text editing (something called Coral I believe). * linked lists. The text could be represented as a linked list of characters. The cursor position might correspond to one of the linked list nodes or links. All operations around the cursor should be O(1). An advantage over gap buffers is that you can efficiently implement multiple cursors (possibly from different editing panes or even different users). The downside is the memory usage is higher and random access might be higher. * use a two level data structure. One data structure could be used for lines and another for the characters within each line. All of the above data structures apply to both. For managing lines, you'd need to use arrays instead of strings in the data structure since the element is a line instead of a character. Using a two-level data structure will give better performance for line operations (i.e. up/down). Enjoy! Eric