I've got a bibliographic "database" with about 3200 entries that I
recently moved from my legacy Mac to Linux, converting it to a homebrew
XML format in the process. The XML needed some tweaking, part of which
I did by hand and part with XSLT (which I was learning at the same
time). Somewhere along the line I made a mistake and lost the <author>s
and <editor>s of a lot of entries. I noticed the mistake only after
making further changes to the converted "database" and after deleting
the intermediate files. I do still have the first XML version of the
file, which contains all of the missing information.

Enter Ruby and XPath. I'd like to fill in the missing authors and
editors in the current version from the original one.

  Read in both versions of the file
  For each "dubious" entry in the current file
    Find a corresponding (based on title and year) entry
    in the original file
    Fix the current entry
  Dump a fixed file

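
The steps above might be sketched in Ruby with REXML roughly as follows.
This is only a sketch under assumptions: the element layout
(<bibliography> containing <entry> elements with <title>, <year>,
<author> and <editor> children), the reading of "dubious" as "has
neither an author nor an editor", and the names child_text, entry_key
and restore_people are all hypothetical and not taken from the real
file.

```ruby
require 'rexml/document'

# Assumed (hypothetical) layout:
#   <bibliography><entry><title/><year/><author/><editor/></entry>...</bibliography>

# Text of the named child element, or '' if the child is missing or empty.
def child_text(entry, name)
  child = entry.elements[name]
  child ? child.text.to_s.strip : ''
end

# Key used to pair entries across the two files: [title, year].
def entry_key(entry)
  [child_text(entry, 'title'), child_text(entry, 'year')]
end

# Copies <author>/<editor> elements from the original document into
# "dubious" entries of the current document.
# Returns the number of entries that were repaired.
def restore_people(current, original)
  # Index the original entries once; the first entry wins on duplicate keys.
  index = {}
  REXML::XPath.each(original, '//entry') { |e| index[entry_key(e)] ||= e }

  fixed = 0
  # Assumption: "dubious" means no <author> and no <editor> child at all.
  REXML::XPath.each(current, '//entry[not(author) and not(editor)]') do |entry|
    source = index[entry_key(entry)]
    next unless source

    people = source.elements.to_a.select { |c| %w[author editor].include?(c.name) }
    next if people.empty?

    # deep_clone so the original document is left untouched.
    people.each { |person| entry.add_element(person.deep_clone) }
    fixed += 1
  end
  fixed
end

# Possible usage (file names are placeholders):
#   current  = REXML::Document.new(File.read('current.xml'))
#   original = REXML::Document.new(File.read('original.xml'))
#   restore_people(current, original)
#   File.open('fixed.xml', 'w') { |f| current.write(f) }
```

Looking entries up through a hash keyed on title and year, rather than
running one XPath query per dubious entry, avoids having to quote titles
inside XPath string literals and keeps the pass over 3200 entries cheap.
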
I think identifying dubious entries, as well as finding the
corresponding entries, could best be done with XPath expressions. I've
seen that REXML supports XPath, but the docs didn't give me much of an
idea of how to use this particular functionality.

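
For illustration, the two kinds of query mentioned here could perhaps be
written with REXML::XPath.match (all matches, as an array) and
REXML::XPath.first (first match, or nil). The element names and the
helper names dubious_entries and find_entry are assumptions made up for
this sketch, and it further assumes that titles contain no single quote
characters.

```ruby
require 'rexml/document'

# All entries that have neither an <author> nor an <editor> child.
# (Assumed definition of "dubious"; element names are hypothetical.)
def dubious_entries(doc)
  REXML::XPath.match(doc, '//entry[not(author) and not(editor)]')
end

# First entry whose <title> and <year> text match exactly, or nil.
# Assumes title and year contain no single quotes, since they are
# interpolated into an XPath string literal.
def find_entry(doc, title, year)
  REXML::XPath.first(doc, "//entry[title='#{title}' and year='#{year}']")
end
```
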
Michael
--
Michael Schuerig
mailto:schuerig / acm.org
http://www.schuerig.de/michael/
GPG Fingerprint: DA28 7DEB 5856 3365 BED9 8365 0A30 545A 82D2 05D7