Quoting deliriousREMOVEUPPERCASETEXTTOREPLY / atchoo.be, on Thu, May 01, 2003 at 01:55:07AM +0900:
> I hope you will indulge this slightly (I hope) off-topic design question
> since it is not specific to ruby design.

I've run into this issue a bunch, in C and again in Ruby, when building
decoders for things like X.509 certificates, and vCards. It's
surprisingly hard to think through, I found. What I've done (and I'm not
claiming it's the best way!) is massage my assumptions about the
requirements so that I solve just the problem I actually have, not all
the problems I can imagine might be possible to solve.

I hope this will be pertinent, so I'll describe what I did in various
circumstances. It's a little rambly, but I have lots of thoughts, and
no time to edit.

** X.509 certs - I made them immutable, essentially. When you decode,
you do a DecodeBegin(), get a decode context, and it is read-only. This
makes sense, because:

 - you can't change anything in a cert without invalidating the
   signature!
 - it's rare to want to make a new cert from an old one, i.e., to change
   it a little and re-sign it

Creating a cert is a different process: you do an EncodeBegin(), and
start adding the things you want in the cert, and when you're done,
call End(), and it gives you the binary DER encoding for the cert.

We began the design process with the idea that we would just have
a Certificate object, and it would be mutable, and you could create
one from the binary encoding, then change it, then encode it again.
This turned out to be hard, in lots of ways, so we rethought it,
and realized the model was not right. It sounded nice and OOy, and
academic, but really, certificate encoding and decoding are very separate
processes, not intermixed (usually) in the code. CAs encode. Everybody
else just decodes! The objects should be different (Dave's comment
about it being the same object only if the data AND the operations
are identical applies here).
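
A rough sketch of that split in Ruby, in case it helps. Everything in
it is hypothetical: the CertDecoder and CertEncoder names, the two
fields, and the "name=value;name=value" text format are stand-ins for
illustration, not real DER and not the API of the C code described
above. The point is only that the read-only decode object and the
write-only encode object share no interface.

```ruby
# Hypothetical sketch: names and the toy text format are assumptions,
# not real X.509/DER handling.
class CertDecoder
  attr_reader :subject, :issuer

  def initialize(encoded)
    fields = encoded.split(";").map { |pair| pair.split("=", 2) }.to_h
    @subject = fields.fetch("subject").freeze
    @issuer  = fields.fetch("issuer").freeze
    freeze # the decode context is read-only from here on
  end
end

class CertEncoder
  def initialize
    @fields = {}
  end

  # Add one field; returns self so calls can be chained.
  def add(name, value)
    @fields[name] = value
    self
  end

  # Produce the finished encoding from everything added so far.
  def finish
    @fields.map { |name, value| "#{name}=#{value}" }.join(";")
  end
end

encoded = CertEncoder.new
                     .add("subject", "CN=example")
                     .add("issuer", "CN=Some CA")
                     .finish
cert = CertDecoder.new(encoded)
puts cert.subject
```

There are no setters on the decoder at all, so "change it a little and
re-sign it" has to go through a fresh encoder, which is the rare case
anyway.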

** XML - I didn't write this, but look at REXML (I don't know it well,
so hopefully I'm not wrong about this!). What REXML appears to do is
decode XML into an object hierarchy. You can then change anything (?) in
the hierarchy (add elements, remove, change them). Then you reencode
from the root element, and it traverses the hierarchy, reencoding
everything.
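
For example, something along these lines should work (a small sketch
using REXML's Document, Element#add_element and Element#text=; the
contact/name/phone tags are made up for illustration):

```ruby
require "rexml/document"

doc = REXML::Document.new("<contact><name>Tom Jones</name></contact>")

# Change an existing element in place and add a new one. There is no
# separate "give me a mutable copy" step.
doc.root.elements["name"].text = "Thomas Jones"
doc.root.add_element("phone").text = "555-0100"

# Re-encoding walks the whole tree starting from the document.
puts doc.to_s
```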

Imagine if you had to do element.get_mutable_element() to get a mutable
element from the tree? It would be hugely painful! And you actually want
to modify some things in place, not get a mutable version of a whole
document, or a sub-tree of the document.

The approach of REXML is that you can't make a bad XML element, they are
all valid XML elements, so the containing document doesn't need any
control over the parts, the parts are all valid XML.

---> This is one approach for your Contact. Abandon your idea of having
the AddressBook enforce some kind of conditions. Any condition you name,
I think I could argue that it's too limiting! Your example of no
duplicate names certainly is. Looked at another way, why should one
Contact be coupled to ANY other contact? There may be things that a
Contact will not allow (such as setting the name to nil, for example),
but if it's a good contact, why would there be some state in the
AddressBook that causes my Contact to not be a valid member of it? Why
even disallow duplicates? What's a duplicate? There can be two Tom Jones
living in the same house, with the same phone number!

In REXML's case, imagine that a REXML document was validating, that it
had a DTD, and wouldn't allow you to add an element of a particular tag
in a place that the DTD didn't allow. Ouch! That's a hard problem, and
it doesn't try to solve it. I think it would be pretty hard. Every
modification to an element would have to check the DTD to make sure it's
a valid modification.

** vCard - I recently wrote a vCard decoder/encoder. I wanted a single
object (I didn't want to split encoding and decoding), because a vCard
is a contact, and people change contacts. I also wanted to, as much
as possible, preserve the original encoding. This is because reencoding
wire formats is a BAD idea, it usually leads to the telephone game,
where one person whispers in someone's ear, who whispers in another's,
and what comes out isn't what went in. In theory, if everybody
does it perfectly, it works, but assuming perfection is a fast route to
failure. The PKI and email worlds are full of bugs and interoperability
problems caused by reencoding. Anyhow, that means that I didn't want
to just decode to a hierarchy of objects, allow them to be changed,
and then during reencoding walk the tree and reencode everything. I only
want to reencode pieces that are new, or have been changed.

Background: a vCard is decoded internally into an array of objects,
where each object contains a hash of parameters, and each parameter has
an array of values (an email address can have a type=home,internet,pref,
for example).

card
  original String
  lines[]
     Line { params{ 'name' => [ value1, value2, ..] }, other things... }

I wanted people to be able to change a vCard. That meant changing
anything, and if they changed a Line, the Card needed to know, so that
it would know it had to reencode itself, otherwise it would leave the
original encoding unchanged. I implemented the params, for example, as a
hash mapping String to an Array. How would the Card know that a Line
that it had returned as the result of a search was changed? How would the
Line know that a piece of its params Hash had changed? And some of the
params affect the contents of a line - if you add a param that says the
encoding is base-64, the line's value has to be changed to base-64.

Approach 1:

  Don't return Ruby base types like a Hash of String=>Array, write my
  own types, that all could have Observers, and that would notify their
  containing objects when they were changed. I could (and would) have
  done this, and it would have worked fine. But it would have been a lot
  of work, and I'm using Ruby to do less work, not more! I don't want to
  build wrappers for Array and Hash!
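
To give a feel for what Approach 1 involves, here is a minimal sketch
of just one such wrapper. The ObservedParams and ObservedLine names
and the param_changed callback are hypothetical, invented for this
example; it covers only [] and []=, and a real version would need the
same treatment for every other mutating Hash and Array method.

```ruby
# Hypothetical wrapper around the params Hash: every assignment is
# reported to the owning line.
class ObservedParams
  def initialize(owner, params = {})
    @owner = owner
    @params = params
  end

  def [](name)
    # Hand out a frozen copy so the array itself can't be changed
    # behind the wrapper's back.
    @params.fetch(name, []).dup.freeze
  end

  def []=(name, values)
    @params[name] = values.dup
    @owner.param_changed(name)
  end
end

class ObservedLine
  attr_reader :params

  def initialize(params)
    @params = ObservedParams.new(self, params)
    @dirty = false
  end

  # True once any param has been assigned through the wrapper.
  def dirty?
    @dirty
  end

  def param_changed(_name)
    @dirty = true
  end
end

line = ObservedLine.new({ "type" => ["home"] })
line.params["type"] = ["home", "pref"]
puts line.dirty?
```

The line would then have to pass the notification up to its Card in
the same way, which is one more layer of the same plumbing.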

--> Another post mentioned a variant of this idea for a Contact, where
a Contact is some kind of Facade, and all requests made to it are
forwarded to the AddressBook, which has enough information to say
whether it is valid.

Approach 2:

  Change my assumptions. I made a Line immutable. You can add a Line,
  you can delete a Line, you can find a Line, you can create a new Line,
  but you can't modify a Line. If you want to change a Line, you have to
  make a new Line and add it to the Card, and delete the old Line from
  the card. Now creation happens in one way, and during the creation of a
  new Line I can apply all the self-consistency tests (encoding
  specified once, if the encoding is base-64, encode the value as
  base-64, etc, etc.). Adding also happens in one place, and I can check
  vCard consistency there (no adding of a Line saying BEGIN:vCard into
  the middle of a vCard!).

  So, my implementation is simpler, faster to write, and thus less buggy
  (I hope), and easier to maintain. Is it as amazing as it could have
  been? No, but I don't have 3 months to work on amazing... And it's
  pretty easy to do the things that need to be done.
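
A minimal sketch of Approach 2, under some loud assumptions: the class
and method names, the "encoding" => ["base64"] spelling, and the
original: keyword are all made up for this example and are not the
actual library's API. Decoding is left out; a decoded Line is assumed
to carry its original text so that only new lines get encoded.

```ruby
# Hypothetical sketch of an immutable Line and a Card that tracks changes.
class Line
  attr_reader :name, :params, :value

  def initialize(name, value, params = {}, original: nil)
    raise ArgumentError, "name required" if name.to_s.empty?
    encodings = params.fetch("encoding", [])
    raise ArgumentError, "encoding specified more than once" if encodings.size > 1

    @name = name.dup.freeze
    @params = params.transform_values { |vs| vs.map { |s| s.dup.freeze }.freeze }.freeze
    # Self-consistency is applied once, at creation: a base-64 param
    # means the stored value is base-64 encoded.
    @value = (encodings.first == "base64" ? [value].pack("m0") : value.dup).freeze
    param_text = @params.map { |k, vs| ";#{k}=#{vs.join(',')}" }.join
    # A decoded line keeps its original text; a new line is encoded here.
    @text = (original || "#{@name}#{param_text}:#{@value}").dup.freeze
    freeze
  end

  def encode
    @text
  end
end

class Card
  def initialize(original, lines)
    @original = original
    @lines = lines.dup
    @changed = false
  end

  def find(name)
    @lines.select { |line| line.name == name }
  end

  def add(line)
    # Card-level consistency is checked in this one place.
    if %w[BEGIN END].include?(line.name.upcase)
      raise ArgumentError, "cannot add #{line.name} inside a card"
    end
    @lines << line
    @changed = true
  end

  def delete(line)
    @changed = true if @lines.delete(line)
  end

  # The original encoding is returned untouched unless a line was
  # added or deleted.
  def encode
    return @original unless @changed
    (["BEGIN:VCARD"] + @lines.map(&:encode) + ["END:VCARD"]).join("\n")
  end
end

original = "BEGIN:VCARD\nEMAIL;type=home:tom@example.com\nEND:VCARD"
old = Line.new("EMAIL", "tom@example.com", { "type" => ["home"] },
               original: "EMAIL;type=home:tom@example.com")
card = Card.new(original, [old])

# "Changing" a line: build a replacement, add it, delete the old one.
card.add(Line.new("EMAIL", "tom@example.com", { "type" => ["home", "pref"] }))
card.delete(old)
puts card.encode
```

Because a Line is frozen at creation, handing one out from find is
safe: there is nothing the caller can do to it that the Card would
need to hear about.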


--> A variant of this approach for you: Have an AddressBook.find that returns
a Contact that is a duplicate of the Contact it has internally. Allow the
Contact to be changed, but since this is a stand-alone object, it
doesn't affect the AddressBook's contact entry. Nothing is saved until
you do an AddressBook.save(contact). The AddressBook could then do
any validation it wanted to in the save, rather than every single change
you make to a Contact needing to be validated.
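
Roughly like this, as a sketch only. The Contact fields, the integer
id used as the lookup key, and the "name must not be empty" rule are
assumptions made up for the example; the validation in save is a
placeholder for whatever rules the AddressBook actually wants.

```ruby
# Hypothetical Contact with three illustrative fields.
Contact = Struct.new(:id, :name, :phone)

class AddressBook
  def initialize
    @contacts = {}
  end

  # Returns a stand-alone copy; editing it does not touch the stored entry.
  def find(id)
    stored = @contacts[id]
    stored && copy(stored)
  end

  # All validation happens here, once, rather than on every attribute change.
  def save(contact)
    raise ArgumentError, "name must not be empty" if contact.name.to_s.strip.empty?
    @contacts[contact.id] = copy(contact)
    self
  end

  private

  def copy(contact)
    Contact.new(contact.id, contact.name.dup, contact.phone&.dup)
  end
end

book = AddressBook.new
book.save(Contact.new(1, "Tom Jones", "555-0100"))

tom = book.find(1)
tom.name = "Thomas Jones"
puts book.find(1).name # the stored entry is unchanged until save is called
book.save(tom)
puts book.find(1).name
```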




I think variants of this problem show up all over the place, and have a
lot of different solutions and approaches. It's really interesting to
hear people talking about it!

Sam