Harold Hausman wrote:

> On 10/21/06, Paul Lutus <nospam / nosite.zzz> wrote:
>>
>> Show us the problem. There are many kinds of character sequences that are
>> not allowed in XML data fields, and there are a number of ways to escape
>> the data fields, but they have to be applied in order to work. Arbitrary
>> data can't simply be dropped between XML delimiters, without certain
>> precautions being taken.
>>
>>
>> --
>> Paul Lutus
>> http://www.arachnoid.com
>>
> 
> Hi Paul,
> 
> It sounds like you might have some experience in this area. Not to
> hijack the OP, but could you possibly describe the process you would
> go through if you had a completely random pile of binary barf that you
> wanted to store as an XML attribute?

Okay, you need to know I am famously lazy. In fact, I think Larry Wall was
describing me when he made his well-known remark about programmer laziness
and hubris. Being lazy, the first simple approach I would take is to
enclose the binary data like this:

<enclosing XML tag><![CDATA[(binary data here)]]></enclosing XML tag>

The next step would be to make sure the ending CDATA delimiter ("]]>") never
appears in the enclosed binary data, otherwise the parser will end the
section early and this strategy will fail.

The next step after that is to escape (and later unescape) the binary data
if needed, to ensure the delimiters cannot occur inside it.
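
One common way to do that escaping, sketched here in Python purely as an
illustration, is to split any embedded "]]>" across two adjacent CDATA
sections. The helper name wrap_cdata and the <script> element are made up
for this example. It also assumes the payload is already text built from
characters XML permits; CDATA does not lift that restriction, so raw bytes
such as NUL still need an encoding step first.

```python
import xml.etree.ElementTree as ET

CDATA_OPEN = "<![CDATA["
CDATA_CLOSE = "]]>"

def wrap_cdata(text: str) -> str:
    """Wrap text in CDATA, splitting any embedded ']]>' across two sections.

    Hypothetical helper, not part of any library.
    """
    # "]]>" becomes "]]" + close + open + ">", so the terminator never
    # appears intact inside a single section.
    safe = text.replace(CDATA_CLOSE, "]]" + CDATA_CLOSE + CDATA_OPEN + ">")
    return CDATA_OPEN + safe + CDATA_CLOSE

# A payload that happens to contain the terminator.
payload = "if (a[b[0]]>c) { run(); }"
doc = "<script>" + wrap_cdata(payload) + "</script>"

# The parser joins the adjacent sections back into the original text.
recovered = ET.fromstring(doc).text
```

No separate unescape step is needed with this particular trick, because the
parser itself rejoins the pieces.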

You need to understand that, with a sufficiently large and varied binary
data set, every imaginable character string will eventually appear in the
data, including the delimiters.

This, in turn, means that escaping the data is eventually a requirement, and
escaping the data means it will be larger than if this step were not
needed.
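
To put a number on that size cost, here is a minimal sketch that swaps in
Base64 as the escaping scheme (a substitution for illustration, not
necessarily the method meant above). Base64 output is plain ASCII, so it is
also safe inside an attribute value, where CDATA sections are not allowed.
The element name "blob", the attribute name "data", and both helper
functions are assumptions made up for this example.

```python
import base64
import os
import xml.etree.ElementTree as ET

def blob_to_element(data: bytes) -> ET.Element:
    """Store arbitrary bytes as a Base64 attribute (hypothetical layout)."""
    elem = ET.Element("blob")
    elem.set("data", base64.b64encode(data).decode("ascii"))
    return elem

def element_to_blob(elem: ET.Element) -> bytes:
    """Reverse of blob_to_element: decode the attribute back to bytes."""
    return base64.b64decode(elem.get("data"))

raw = os.urandom(3000)                      # stand-in for the binary data
encoded_len = len(base64.b64encode(raw))    # 4000: every 3 bytes become 4 chars

# Round trip through serialized XML text.
xml_text = ET.tostring(blob_to_element(raw), encoding="unicode")
restored = element_to_blob(ET.fromstring(xml_text))
```

The encoded form is about one third larger than the raw bytes, which is the
price of never having to worry about delimiters or illegal characters.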

You should realize that another, possibly better, approach for truly large
binary globs is to store them as files, and store links to the files in the
XML data set, rather than the raw data itself.
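
A rough sketch of that file-plus-link arrangement follows. Everything named
here is an assumption for illustration: the "blob" element, the "href" and
"sha256" attributes, the ".bin" suffix, and the two helper functions. The
checksum is an extra not mentioned above; it only guards against the file
and the XML drifting apart.

```python
import hashlib
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

def store_blob(data: bytes, directory: Path) -> ET.Element:
    """Write data to a file and return an element that links to it."""
    digest = hashlib.sha256(data).hexdigest()
    path = directory / (digest + ".bin")    # file named after its checksum
    path.write_bytes(data)
    elem = ET.Element("blob")
    elem.set("href", path.name)             # relative link, not the data
    elem.set("sha256", digest)
    return elem

def load_blob(elem: ET.Element, directory: Path) -> bytes:
    """Follow the link and verify the file still matches the checksum."""
    data = (directory / elem.get("href")).read_bytes()
    if hashlib.sha256(data).hexdigest() != elem.get("sha256"):
        raise ValueError("blob file does not match recorded checksum")
    return data

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    original = bytes(range(256)) * 4        # every byte value, no escaping
    link = store_blob(original, folder)
    link_xml = ET.tostring(link, encoding="unicode")
    round_trip = load_blob(ET.fromstring(link_xml), folder)
```

The XML stays small and purely textual, and the binary data is stored
byte for byte with no size penalty.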

-- 
Paul Lutus
http://www.arachnoid.com