Quoting B.Candler / pobox.com, on Sat, Apr 09, 2005 at 05:28:12PM +0900:
> On Fri, Apr 08, 2005 at 05:24:41PM +0900, GOTOU Yuuzou wrote:
> > If this premise is right and to_der method is modified to
> > refer user defined default tag, inheriting primitive types
> > may make it easy. A conceptual usage is as follows:
> > 
> >   # define new class to overrides ISO64Strings's tag number
> >   # and tag class.
> >   class X690Date < OpenSSL::ASN1::ISO64String
> >     DEFAULT_TAG = 3  
> >     DEFAULT_TAG_CLASS = :APPLICATION
> >   end
> >   X690Date.new("19710917", 1, :EXPLICIT, :CONTEXT_SPECIFIC).to_der
> > 
> > any ideas?
> 
> This looks good. But the obvious next thing to do is to record that
> information in a parser table, so that ASN1.decode will create an instance
> of X690Date instead of an ASN1Data object.
> 
> If we go down that route, then what I'd really want is bindings between
> arbitary Ruby classes and ASN1 types (including set/sequence/choice), so
> that a tree of objects can be converted to and from der. Attached is one
> idea how this might look.


I'd suggest not doing that. It's a common desire, but it leads to
serious problems down the road.

Problems:
- String has multiple representations in ASN.1. To encode it, you need
  to choose the String type. Particularly for DER, this causes
  round-trip problems - you decode a TeletexString to ruby String, then
  reencode, it gets encoded as UTF8String, and now you have mangled the
  data. In particular, cryptographic signatures fail.

- Memory overhead goes through the roof, because ASN.1 is very verbose.
  This is a variation of what happened in the XML world. XML looks like
  a tree, so people write tree-based APIs. Fast and easy... then they
  get a large document, or try and figure out why their code is so slow,
  and end up having to change to SAX, or some other stream-based API.
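
To make the first problem concrete, here is a minimal sketch using Ruby's
OpenSSL::ASN1. The naive_round_trip helper is hypothetical; it stands in for
any binding that maps ASN.1 strings to a plain Ruby String and then picks its
own string type on the way back out.

```ruby
require "openssl"

# Hypothetical "naive binding": decode to a plain Ruby String, then encode
# that String with whichever ASN.1 string type the binding prefers.
def naive_round_trip(der)
  text = OpenSSL::ASN1.decode(der).value   # the ASN.1 string type is dropped here
  OpenSSL::ASN1::UTF8String.new(text).to_der
end

original  = OpenSSL::ASN1::T61String.new("Example CA").to_der
reencoded = naive_round_trip(original)

puts original.unpack1("H*")    # starts with 14 (TeletexString/T61String tag)
puts reencoded.unpack1("H*")   # starts with 0c (UTF8String tag)
puts original == reencoded     # false: the signed bytes have changed
```

The text survives, but the bytes do not, and signatures are computed over
the bytes.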

I've had direct experience maintaining and writing BER and DER codecs
for PKI/cryptographic protocols.  We had a Java one written by people
who believed that DER was actually "distinguished". They were wrong.
When you get a certificate signed by a major CA, and it has an extra
insignificant zero in an INTEGER, and you decode, then reencode
(correctly!) before verifying the signature, and the verification fails,
but MS IE6.0 verifies the signature fine, guess who fixes the problem?
Hint: not the CA, and not MS (and even if they do, you still have to
interop with legacy data floating around).
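
A toy version of that INTEGER problem, in plain Ruby. The decode_integer and
encode_integer helpers are made up for this sketch (short-form lengths only,
INTEGER only), and the input is a four-byte stand-in rather than a real
certificate field.

```ruby
require "digest"

# Hypothetical helper: read a short-form DER/BER INTEGER into a Ruby Integer.
def decode_integer(der)
  tag, length = der.getbyte(0), der.getbyte(1)
  unless tag == 0x02 && length && length.between?(1, 0x7f) && der.bytesize == 2 + length
    raise ArgumentError, "expected a short-form INTEGER"
  end
  content = der.byteslice(2, length)
  value = content.bytes.inject(0) { |acc, byte| (acc << 8) | byte }
  value -= 1 << (8 * length) if content.getbyte(0) >= 0x80   # two's complement
  value
end

# Hypothetical helper: encode with the minimal number of content octets,
# which is what DER asks for.
def encode_integer(value)
  bytes = []
  loop do
    bytes.unshift(value & 0xff)
    value >>= 8
    break if (value == 0 && bytes.first < 0x80) || (value == -1 && bytes.first >= 0x80)
  end
  raise ArgumentError, "too long for this sketch" if bytes.length > 0x7f
  [0x02, bytes.length, *bytes].pack("C*")
end

as_signed = "\x02\x02\x00\x05".b   # INTEGER 5 with an extra leading zero octet
reencoded = encode_integer(decode_integer(as_signed))

puts as_signed.unpack1("H*")   # 02020005
puts reencoded.unpack1("H*")   # 020105
puts Digest::SHA256.hexdigest(as_signed) == Digest::SHA256.hexdigest(reencoded)   # false
```

The re-encoding is the correct one, and it is still the one that breaks
verification, because the signer hashed the sloppy bytes.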

Anyhow, not to say that this won't work in special cases, just like it
does in XML. It is possible to set up mappings between classes and
ASN.1, but I'd suggest not requiring an entire in-memory tree of your
input to be built, and if you are using DER, you must be very careful
to preserve the original encoding, and there are multiple
representations of things like strings and dates/times, even in DER.
Your mapping has to take this into account.

Don't get off on the wrong foot by assuming ASN.1 works as advertised.

Anyhow, DER and BER are such simple formats (at the bit-level, not in
the way they are used in protocols), you might be better off just
writing your own codec in Ruby. It's basically just a TAG/LENGTH/VALUE
encoding, implementing it might actually be faster than figuring out how
to use OpenSSL, though admittedly I say that as somebody who has read 3
or 4 implementations, and wrote a few generations, so maybe it just
seems easy to me.
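
To give a feel for how little is involved, here is a rough sketch of a
TAG/LENGTH/VALUE reader. The names (TLV, read_tlv, children) are invented for
this example, and it assumes single-octet tags and definite lengths only, so
it is nowhere near a complete BER decoder. Note that it keeps the raw octets
of each element, so nothing ever has to be re-encoded just to verify a
signature.

```ruby
# One decoded element; :raw holds the exact octets as they appeared on input.
TLV = Struct.new(:tag_class, :constructed, :tag, :value, :raw)

TAG_CLASSES = %i[UNIVERSAL APPLICATION CONTEXT_SPECIFIC PRIVATE].freeze

# Reads one element from data starting at offset.
# Returns [TLV, offset of the next element].
def read_tlv(data, offset = 0)
  start = offset
  ident = data.getbyte(offset)
  length = data.getbyte(offset + 1)
  raise ArgumentError, "truncated header" if ident.nil? || length.nil?
  raise NotImplementedError, "multi-octet tags" if (ident & 0x1f) == 0x1f
  raise NotImplementedError, "indefinite length" if length == 0x80
  offset += 2

  if length > 0x80                      # long form: low 7 bits count the length octets
    count = length & 0x7f
    octets = data.byteslice(offset, count)
    raise ArgumentError, "truncated length" if octets.nil? || octets.bytesize < count
    length = octets.bytes.inject(0) { |acc, byte| (acc << 8) | byte }
    offset += count
  end

  value = data.byteslice(offset, length)
  raise ArgumentError, "truncated value" if value.nil? || value.bytesize < length

  finish = offset + length
  node = TLV.new(TAG_CLASSES[ident >> 6], (ident & 0x20) != 0, ident & 0x1f,
                 value, data.byteslice(start, finish - start))
  [node, finish]
end

# Splits the value of a constructed element into its child elements.
def children(node)
  kids = []
  offset = 0
  while offset < node.value.bytesize
    kid, offset = read_tlv(node.value, offset)
    kids << kid
  end
  kids
end

# SEQUENCE { INTEGER 5, UTF8String "hi" }
der = ["30070201050c026869"].pack("H*")
seq, = read_tlv(der)
children(seq).each do |kid|
  puts "#{kid.tag_class} #{kid.tag}: #{kid.raw.unpack1('H*')}"
end
```

Walking the children lazily like this, instead of building the whole tree
up front, is also one way to keep memory use under control.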

Btw, it's this very low-level simplicity that suckers folks into thinking
it's easy to wrap in high-level OO APIs. I think this is similar to XML -
just tagged data, with some params, how complicated can XML be?  :-) If
you do take this approach, I'd spend some time thinking about how you
would do it in XML, the problems that arise, and the API patterns
developed to work-around the problems.

And if this sounds like an incomprehensible rant, and is of no use to
you at all, sorry!

Cheers,
Sam

Btw, DJB's name is a bad word in the mail community, and I'm deeply
suspicious of anybody who suggests that somehow ASN.1 is "simpler" than
the IETF text-based protocols. I've implemented both, and it's not true.

On the plus side, binary protocols almost force the writing of proper
decoders, rather than letting the innocent think that using scanf() to
decode mail headers is a workable idea. On the other hand, I see just as
much brain-damaged protocol complexity in ASN.1 protocols as I do in
IETF mail protocols. And it is really nice to be able to see your data
on the wire without doing hex dumps. Thinking mail is easier with XML or
ASN.1 misses the point: mail isn't hard because of the bits on the
wire, it's hard because it is a globally distributed system used in lots
of ways, for many purposes.

DJB has a tendency to radical over-simplification, and then to abusing
people who want to do things his proposals don't allow. Example would be
his proposal to outlaw accented characters in "internationalized" domain
names. It may be more secure, but it's not that internationalized when
the Turks and French lose a few of their vowels from the allowed
characters in domain names!

Anyhow, have fun. Implementing protocols is usually lots of that, and
it's pretty cool to pull up the hood and see how things really work!


> My actual target is to encode and decode ASN1 protocol messages, such as:
> http://homepages.tesco.net./~J.deBoynePollard/Proposals/IM2000/Architecture/msoap.asn1
> 
> Regards,
> 
> Brian.