I think you will be bitten pretty quickly by the current approach of
doing only text substitutions when translating HTML into Textile.
For example, HTML is largely insensitive to whitespace between elements:
<h1>main title</h1> <h2>subtitle</h2>
is just as good as
<h1>main title</h1>
<h2>subtitle</h2>
But in Textile,
h1. main title

h2. subtitle
is not the same as
h1. main title h2. subtitle
or even
h1. main title
h2. subtitle
The current implementation of ClothRed behaves like this:
$ irb
irb(main):001:0> require 'rubygems'
=> true
irb(main):002:0> require 'clothred'
=> true
irb(main):003:0> t = ClothRed.new("<h1>Foo</h1> <h2>Bar</h2>")
=> "<h1>Foo</h1> <h2>Bar</h2>"
irb(main):004:0> t.to_textile
=> "h1. Foo h2. Bar"
but that is easy to work around with the simple patch attached.
Instead of replacing the HTML closing tags </h1>, </h2>, ... with "",
the patch replaces them with "\n\n", which produces the paragraph
breaks Textile needs. I had to fix "test/test_headings.rb" accordingly,
and I also wrote "test/test_misc.rb" as a test script for HTML with
more than one tag in it.
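The idea behind the patch can be sketched roughly as follows. This is not ClothRed's actual source: the HEADING_RULES table and the headings_to_textile method are hypothetical names used only for illustration, and swallowing the whitespace after a closing tag is an extra assumption of this sketch.

```ruby
# Hypothetical substitution table, NOT ClothRed's real implementation.
HEADING_RULES = (1..6).flat_map do |n|
  [
    [%r{<h#{n}>}i, "h#{n}. "],   # opening tag becomes the Textile prefix
    [%r{</h#{n}>\s*}i, "\n\n"],  # closing tag becomes a paragraph break
  ]
end

# Apply every rule in turn, then drop the trailing break left by the
# last closing tag.
def headings_to_textile(html)
  text = HEADING_RULES.inject(html) do |acc, (pattern, replacement)|
    acc.gsub(pattern, replacement)
  end
  text.strip
end

puts headings_to_textile("<h1>Foo</h1> <h2>Bar</h2>")
```

With the closing tags mapped to "\n\n", the two headings end up in separate Textile blocks whether the HTML put them on one line or on two.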
The substitution approach will not work quite right for HTML where
closing tags are missing: the algorithm has no way to tell where an
element ends. So for now it is effectively limited to XHTML, which
requires closing tags.
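A small sketch of that limitation, using <p>, whose end tag HTML 4 allows you to omit. The naive_paragraphs method is a hypothetical stand-in for the substitution approach, not code taken from ClothRed.

```ruby
# Hypothetical substitution rules for paragraphs, for illustration only.
def naive_paragraphs(html)
  html.gsub(/<p>/i, "")             # opening tag: nothing to emit
      .gsub(%r{</p>\s*}i, "\n\n")   # closing tag: paragraph break
      .strip
end

p naive_paragraphs("<p>one</p><p>two</p>")  # closed tags: two paragraphs
p naive_paragraphs("<p>one<p>two")          # omitted </p>: the break is lost
```

In the second call nothing matches the closing-tag rule, so the two paragraphs are glued together.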
I think the suggestion of using an HTML parser (like Hpricot) for
this conversion will become hard to avoid pretty soon.
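To give a rough feel for what a structure-aware conversion buys, here is a minimal sketch. It does not use Hpricot; it is a hand-rolled tokenizer standing in for a real parser, and BLOCK_PREFIX and tokens_to_textile are hypothetical names. It assumes every opening block tag starts a new Textile block, whether or not the previous element was closed.

```ruby
# Stand-in for a real parser: NOT Hpricot, just a regex tokenizer.
# Maps block-level tag names to the Textile prefix they start with.
BLOCK_PREFIX = { "p" => "" }
(1..6).each { |n| BLOCK_PREFIX["h#{n}"] = "h#{n}. " }

def tokens_to_textile(html)
  blocks = []
  # Each match is either a tag (closing marker + name) or a run of text.
  html.scan(%r{<(/?)(\w+)[^>]*>|([^<]+)}) do |closing, name, text|
    if text
      blocks << String.new if blocks.empty?
      blocks.last << text
    elsif closing.empty? && BLOCK_PREFIX.key?(name.downcase)
      # An opening block tag always begins a new block.
      blocks << BLOCK_PREFIX[name.downcase].dup
    end
    # Closing tags and unknown tags are simply skipped in this sketch.
  end
  blocks.map { |b| b.gsub(/\s+/, " ").strip }
        .reject(&:empty?)
        .join("\n\n")
end

puts tokens_to_textile("<h1>Foo</h1> <h2>Bar</h2>")
puts tokens_to_textile("<p>one<p>two")
```

Because the block boundary is decided when a tag opens rather than when it closes, missing end tags no longer change the result. A real parser would of course handle nesting, attributes and entities that this sketch ignores.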
Thanks for the inspiring work.
Adriano Ferreira.
On 4/12/07, Phillip Gawlowski <cmdjackryan / googlemail.com> wrote:
> Or: TTD is fun.
>
> ClothRed HTML 2 Textile converter
>
>