On Mon, Apr 11, 2005 at 06:21:39AM +0900, Sam Roberts wrote:
> You will also have to take into account invalidly encoded DER, though,
> unless you can really take the moral high-ground and refuse to interop
> with invalid DER. It's quite common for implementations to neglect the
> leading zero necessary to make INTEGER positive if the high bit is set,

But then, they are actually sending you a negative value, are they not?
What I mean is, there's no ambiguity. If somebody sends you b11111111 then
it's -1, not 255, and there's no question about it. It must be a contextual
thing to decide that -1 is an invalid value here, and that therefore the
sender 'must' have meant 255.

> for example. So, when you reencode (correctly) you don't have the same
> input. There's a whole set of common errors like this.

Ah. OK, I can see the case where you receive b11111111 b11111111 - you
could either reject this as an invalid encoding (the BER rules say that it
is), or you could decode it as -1, in which case you'll generate a
different encoding when you re-encode.

I'd prefer to take the view that the encoding is invalid: the standard is
absolutely unambiguous. However, if I really needed to interoperate with
something so broken, I'd probably define an UNSIGNEDINTEGER type
internally. It would encode using the same universal tag as INTEGER, but
the value would be treated as unsigned. Hence b11111111 would be 255 and
b11111111 b11111111 would be 65535. Propagating invalid encodings in this
way should be something of a last resort.

I'd be interested to know what the other common errors are that you
mention. This is the sort of knowledge which only an experienced
implementor will have...

> > If there's a possibility that a single attribute will be one of multiple
> > types, then it should be wrapped in an ASN.1 'choice'
>
> ASN.1 choices aren't a "wrapping" in the sense that you see any wrapping
> in the BER or DER encoding, not unless you tag, anyhow.
> When an ASN.1 choice appears, you literally encode whichever one you
> want.

Yes indeed. What I meant was, if I have

    foo PrintableString,
    bar UTF8String,

then I can assign @foo = "xxx" and @bar = "yyy", i.e. using native Ruby
strings, since when it comes to re-encoding them I'll know what ASN.1 type
to use from the ASN.1 definition for each attribute.

However if foo were a CHOICE between PrintableString and UTF8String, then
this information would be lost. One solution would be to decode as

    @foo = PrintableString.new("xxx")
or
    @foo = UTF8String.new("xxx")

in which case the class of foo carries forward that information. But that
makes a new object with an instance variable (say @value) holding the
string.

Alternatively that information could be recorded in the singleton class of
the object:

    @foo = "xxx"
    @foo.extend PrintableString

That may be cleaner, although this metadata is easily lost:

    @foo.downcase!           # keeps singleton class
    @foo = @foo.downcase     # loses it

> This is the common case for strings, for example. ASN.1 to BER/DER is
> one-way, there are numbers of places where you cannot infer the ASN.1
> from the encoding. Not necessarily a criticism, just an observation.

Yes, I gathered that. That's why you'd need to carry metadata about the
required ASN.1 encodings with the class, or (in some cases, as outlined
above) individual values.

> > Incidentally, Ruby's ASN.1 library does appear to have a 'traverse' method
> > which acts as a stream parser. You still need to build a suitable state
> > machine for it to 'yield' each element to, of course.
>
> Probably built on top of openssl's tree-based routines, so you pay the
> memory cost, and the complexity cost.

ossl_asn1_decode0 is basically a loop on ASN1_get_object, and as far as I
can tell that just walks along a DER stream in memory, updating a start
pointer as it goes.
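
To make "walks along the stream, updating a start pointer" concrete, here
is a minimal pure-Ruby sketch of that kind of loop. It is not openssl's
code: walk_der and the order of the values it yields are made up for
illustration, and it assumes single-octet identifiers (tag < 31) and
definite lengths only.

```ruby
# Walk a BER/DER byte string and yield one header per element, without
# building a tree. Hypothetical helper, not part of any library.
# Yields: depth, start offset, header length, data length,
#         constructed?, tag class, tag number.
def walk_der(bytes, offset = 0, stop = bytes.bytesize, depth = 0, &block)
  classes = [:UNIVERSAL, :APPLICATION, :CONTEXT_SPECIFIC, :PRIVATE]
  while offset < stop
    start = offset
    id = bytes.getbyte(offset)
    offset += 1
    tag = id & 0x1f
    raise ArgumentError, "high tag numbers not handled" if tag == 0x1f

    len = bytes.getbyte(offset)
    offset += 1
    raise ArgumentError, "indefinite length not handled" if len == 0x80
    if len > 0x80                  # long form: low 7 bits count length octets
      count = len & 0x7f
      len = 0
      count.times do
        len = (len << 8) | bytes.getbyte(offset)
        offset += 1
      end
    end
    raise ArgumentError, "element overruns its container" if offset + len > stop

    constructed = (id & 0x20) != 0
    yield depth, start, offset - start, len, constructed, classes[id >> 6], tag
    # A constructed element's content is itself a run of elements.
    walk_der(bytes, offset, offset + len, depth + 1, &block) if constructed
    offset += len                  # advance the start pointer past the content
  end
end

# SEQUENCE { INTEGER 5 }
walk_der("\x30\x03\x02\x01\x05".b) { |*fields| p fields }
# prints [0, 0, 2, 3, true, :UNIVERSAL, 16]
# then   [1, 2, 2, 1, false, :UNIVERSAL, 2]
```

The only state carried between elements is the current offset and the end
of the enclosing container, which is why no tree is needed.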
So ossl_asn1_decode0 should work along an object in its linear form, not
having expanded it to a tree; and with mmap() I guess it could work
directly from a file too. It calls itself recursively when it meets a
constructed item.

    $ cat traverse.rb
    require 'openssl'
    a = "\xA1\x0A\x43\x08..test.."
    OpenSSL::ASN1.traverse(a) { |y| p y }
    $ ruby traverse.rb
    [0, 0, 2, 10, true, :CONTEXT_SPECIFIC, 1]
    [1, 2, 2, 8, false, :APPLICATION, 3]
    $

The parameters to the block seem to be (looking at
ext/openssl/ossl_asn1.c):

    depth
    start offset
    header length
    data length
    constructed=true (so primitive=false)
    tag class
    tag

A more friendly API could be a stream of tag_start / data / tag_end method
calls on an object, like an REXML stream parser.

I don't think the reverse exists, i.e. for taking a stream of these tags
and turning them into DER/CER.

Shame that none of this appears to be documented! Somebody has taken a lot
of time to wrap openssl's ASN.1 parsing for Ruby, but anyone who wants to
use it (like me) has to do quite a bit of work to reverse-engineer the API.

> Anyhow, mostly I just wanted to say writing a stream-based BER/DER
> decoder in ruby would be easy. Writing stream-based DER encoders is
> impossible, unfortunately (the output size is encoded at the beginning,
> they should have used CER more often, but it's too late now), but
> stream-based BER encoders are also easy.

Understood. Once upon a time I wrote a one-pass machine-code assembler that
used to rewind to previous points and insert branch offsets once it was
able to resolve a label :-)

DER makes this a bit more difficult with the variable sized encoding of the
length octets, but I think it could be made a two-pass operation. Or you
could write out as CER, and then have a two-pass CER to DER converter (pass
one reads in the CER and writes out some auxiliary data about lengths seen;
pass two reads the CER again and merges in the length data to create DER).

Regards,

Brian.
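
P.S. The two-pass idea can be sketched in a few lines of Ruby. This is only
an illustration under assumptions: the nested [identifier_octet, content]
arrays and the names der_length, measure, emit and encode_der are all
invented here, identifiers are a single octet, and the input is held in
memory rather than read from CER.

```ruby
# Encode a DER length: short form below 128, long form otherwise.
def der_length(n)
  return [n].pack("C") if n < 0x80
  octets = []
  while n > 0
    octets.unshift(n & 0xff)
    n >>= 8
  end
  [0x80 | octets.size, *octets].pack("C*")
end

# Pass one: record every node's content length in pre-order.
# A node is [identifier_octet, String] or [identifier_octet, [child, ...]].
# Returns the node's total encoded size (identifier + length + content).
def measure(node, lengths)
  _id, body = node
  slot = lengths.size
  lengths << 0                    # reserve this node's pre-order slot
  content = if body.is_a?(Array)
              body.sum { |child| measure(child, lengths) }
            else
              body.bytesize
            end
  lengths[slot] = content
  1 + der_length(content).bytesize + content
end

# Pass two: write identifier, length and content, consuming the recorded
# lengths in the same pre-order, so nothing is back-patched.
def emit(node, lengths, out)
  id, body = node
  out << [id].pack("C") << der_length(lengths.shift)
  if body.is_a?(Array)
    body.each { |child| emit(child, lengths, out) }
  else
    out << body.b
  end
  out
end

def encode_der(node)
  lengths = []
  measure(node, lengths)
  emit(node, lengths, "".b)
end

der = encode_der([0xA1, [[0x43, "..test.."]]])
```

Here der comes out as the same 12 bytes used as input in traverse.rb. The
list of lengths plays the part of the auxiliary data: pass one fills it in,
pass two merges it back into the output.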