--------------010001050705000106020009
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Florian Gross wrote:
> Here's my solution. It builds a tree of the Gedcom nodes.
And here's mine. It attempts to minimize memory usage by writing the XML
for each node as soon as possible, so only the currently open elements
are held in memory. (This was necessary because in one of my tests, with
a 7 MB GEDCOM file, REXML rapidly exhausted almost all of my 1 GB of RAM.)
Other than that, it is nothing special. I was swayed by Hans' argument
that attributes are most appropriate for metadata, so I only use them
for id and ref.
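
To illustrate the idea outside the attached script, here is a minimal
sketch of the same streaming approach. It is not the attachment itself:
the names stream_gedcom, xml_escape and ID_PATTERN are made up for this
example, and it skips the indentation bookkeeping. It only keeps a stack
of the currently open tags, writes each element the moment its line is
read, and uses attributes solely for id and ref.

```ruby
require "stringio"

# Hypothetical name: matches a GEDCOM cross-reference such as "@I1@".
ID_PATTERN = /\A@.*@\z/

# Escape the characters that are unsafe in XML text and attribute values.
def xml_escape(text)
  text.gsub("&", "&amp;").gsub("<", "&lt;").gsub(">", "&gt;").gsub('"', "&quot;")
end

# Stream GEDCOM lines from `input` to `out` as XML. Only the stack of
# currently open [level, tag] pairs is kept in memory.
def stream_gedcom(input, out)
  open_tags = []
  out.puts "<gedcom>"
  input.each_line do |line|
    next if line.strip.empty?
    level, tag, data = line.chomp.split(/\s+/, 3)
    level = level.to_i
    # In "0 @I1@ INDI" the id precedes the tag, so swap them.
    tag, refid, data = data, tag, nil if tag =~ ID_PATTERN

    # Close every element at the same or a deeper level first.
    while !open_tags.empty? && open_tags.last[0] >= level
      _, closed_tag = open_tags.pop
      out.print "</#{closed_tag}>\n"
    end

    tag = tag.downcase
    out.print "<#{tag}"
    out.print " id=\"#{refid}\"" if refid
    if data && data =~ ID_PATTERN
      out.print " ref=\"#{data}\">"      # pointer to another record
    elsif data
      out.print ">#{xml_escape(data)}"   # plain text content
    else
      out.print ">"
    end
    open_tags << [level, tag]
  end

  # Close whatever is still open at end of input.
  until open_tags.empty?
    _, closed_tag = open_tags.pop
    out.print "</#{closed_tag}>\n"
  end
  out.puts "</gedcom>"
end

if __FILE__ == $0
  sample = "0 @I1@ INDI\n1 NAME John /Doe/\n1 FAMS @F1@\n"
  stream_gedcom(StringIO.new(sample), $stdout)
end
```

Because nothing but the open-tag stack is retained, memory use in this
sketch depends on the nesting depth of the file rather than its size.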
- Jamis
--
Jamis Buck
jgb3 / email.byu.edu
http://www.jamisbuck.org/jamis
--------------010001050705000106020009
Content-Type: text/plain;
 name="ged2xml.rb"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="ged2xml.rb"

#!/usr/bin/env ruby
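class GED2XML

  # A GEDCOM cross-reference id such as "@I1@".
  IS_ID = /^@.*@$/

  # One parsed GEDCOM line: level, tag, optional data, optional id.
  class Node < Struct.new( :level, :tag, :data, :refid )
    def initialize( line )
      level, tag, data = line.chomp.split( /\s+/, 3 )
      level = level.to_i
      # In "0 @I1@ INDI" the id precedes the tag, so swap them.
      tag, refid, data = data, tag, nil if tag =~ IS_ID
      super level, tag.downcase, data, refid
    end
  end
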
  def indent( level )
    print " " * ( level + 1 )
  end

  # Escape the characters that are not safe in XML text.
  def safe( text )
    text.
      gsub( /&/, "&amp;" ).
      gsub( /</, "&lt;" ).
      gsub( />/, "&gt;" ).
      gsub( /"/, "&quot;" )
  end

  def process( io )
    node_stack = []

    puts "<gedcom>"
    wrote_newline = true

    io.each_line do |line|
      next if line =~ /^\s*$/o
      node = Node.new( line )

      # Close every open element at the same or a deeper level.
      while !node_stack.empty? && node_stack.last.level >= node.level
        prev = node_stack.pop
        indent prev.level if wrote_newline
        print "</#{prev.tag}>\n"
        wrote_newline = true
      end

      indent node.level if wrote_newline
      print "<#{node.tag}"
      print " id=\"#{node.refid}\"" if node.refid

      if node.data
        if node.data =~ IS_ID
          print " ref=\"#{node.data}\">"
        else
          print ">#{safe(node.data)}"
        end
        wrote_newline = false
      else
        puts ">"
        wrote_newline = true
      end

      node_stack << node
    end

    # Close whatever is still open at end of input.
    until node_stack.empty?
      prev = node_stack.pop
      indent prev.level if wrote_newline
      print "</#{prev.tag}>\n"
      wrote_newline = true
    end

    puts "</gedcom>"
  end
end
GED2XML.new.process ARGF
--------------010001050705000106020009--