Hi,

Trying to learn ruby, I am writing a script to migrate a blog from pyblosxom 
to wordpress. As you may know, pyblosxom stores all entries and comments 
in text files under a directory hierarchy. My idea is to read all those 
files (the subdirectories store the categories) and insert them into the 
mysql database wordpress uses.

So far, I have been able to read all the posts and comments, but I am 
having some problems inserting them into mysql (BTW, I am using the mysql 
module). The problem, I guess, is with the encoding of the text.
Basically I have two problems:

- Accented characters. For example, if I have an accented vowel like "í", 
it is not properly inserted into the mysql table and I get weird 
characters instead. I guess that a function that substitutes every one 
of these characters with its html entity (i.e. &iacute;) would work, but I 
guess there must be a more appropriate way to do it, right? Anything 
to do with the encoding?
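
To make the workaround concrete, here is a minimal sketch of the entity 
substitution I have in mind. The names ENTITIES and to_entities are made 
up for illustration, the table only covers a handful of characters, and 
it assumes the text has already been read as valid UTF-8:

```ruby
# Hypothetical lookup table: a few accented characters and their HTML entities.
# Written with \u escapes so the source file's own encoding does not matter.
ENTITIES = {
  "\u00E1" => "&aacute;", # a with acute accent
  "\u00E9" => "&eacute;", # e with acute accent
  "\u00ED" => "&iacute;", # i with acute accent
  "\u00F3" => "&oacute;", # o with acute accent
  "\u00FA" => "&uacute;", # u with acute accent
  "\u00F1" => "&ntilde;"  # n with tilde
}.freeze

# Matches any single character that appears as a key in the table.
ENTITY_PATTERN = Regexp.union(ENTITIES.keys)

# Hypothetical helper: replace each listed character with its entity.
# Assumes `text` is a valid UTF-8 string; other characters pass through.
def to_entities(text)
  text.gsub(ENTITY_PATTERN, ENTITIES)
end

puts to_entities("aqu\u00ED")
```

This only hides the problem, of course: any character missing from the 
table would still end up garbled in the database.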

- Also, I have this problem that wordpress interprets \n characters (I 
guess). For example, if I have a post like the following:
*****************
This is an example of an <img
src="image.jpg"> image.
*****************

would turn into:
*****************
This is an example of an <img <br />
src="image.jpg"> image.
*****************

The \n character right after <img is interpreted and a br tag is 
inserted, which breaks the HTML. I thought that deleting all the \n 
characters would fix it, but some posts contain pre tags where the \n 
characters are required.
Any ideas on this?
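
The best I can think of is something like the sketch below, which 
replaces single newlines with a space everywhere except inside 
<pre>...</pre> blocks. The name join_lines_outside_pre is made up, and 
it rests on two assumptions I have not verified: that pre blocks are 
never nested, and that blank lines (two newlines in a row) should stay 
as they are:

```ruby
# Hypothetical helper: join lines outside <pre> blocks so that a stray
# newline inside a tag such as <img ...> can no longer be turned into <br />.
def join_lines_outside_pre(html)
  # The capture group makes split keep the <pre>...</pre> blocks in the result.
  chunks = html.split(%r{(<pre\b.*?</pre>)}mi)

  chunks.map do |chunk|
    if chunk.match?(/\A<pre\b/i)
      chunk # leave preformatted text untouched
    else
      # Replace only single newlines; blank lines (\n\n) are kept as they are.
      chunk.gsub(/(?<!\n)\n(?!\n)/, " ")
    end
  end.join
end

post = "This is an example of an <img\nsrc=\"image.jpg\"> image."
puts join_lines_outside_pre(post)
```

I am not sure this is the right approach, though; it feels like there 
should be a cleaner way to handle it.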

Anyway, thanks in advance! :-)

-- 
Jesus Roncero
Cheers from England
http://blog.notreally.org