> have fun putting that together. to do it you need to render, not
> just parse, html!

It looks pretty easy to me. You've conveniently put all the noise
characters in a different colour. Here's my two-minute solution:

$ cat reader.rb
src = File.read("test.html")
# Blank out the noise characters (the #ccc spans), keeping the column width
src.gsub!(/<span [^>]*#ccc[^>]*>([^<]*)<\/span>/i) { " " * $1.size }
# Turn the remaining markup back into plain text
src.gsub!(/&nbsp;/, ' ')
src.gsub!(/<br>/i, "\n")
src.gsub!(/<\/?pre[^>]*>/, '')
puts src
$ ruby reader.rb
  __                   _
 / _|                 | |
| |_    ___     ___     |__     __ _   _ __
|  _|  / _ \   / _ \  | '_ \   / _  | | '__|
| |   | (_) | | (_) | | |_) | | (_| | | |
|_|     ___/   \___   |_._ /   \__,_| |_|

Of course you can keep changing your code, and I can keep changing mine.
But someone who took more than two minutes over this could come up with
a much more robust solution (e.g. dynamically working out the contrast
between foreground and background).

Anyway, once your code is deployed on a real live site, by someone other
than you, it becomes much harder to change. And the source is going to
be available to the attacker too.

> now, where i'm heading now, is using css and javascript so to
> position the image and characters within the image.

Hmm - this risks making the captcha visible in fewer and fewer browsers.
OK, so lynx wouldn't be able to view a PNG captcha either; but you risk
locking out a lot of mobile devices, set-top boxes and other embedded
web browsers (which could otherwise display a PNG quite happily).

However, perhaps ASCII-art generation (as a form of unusual and
disjointed character set) combined with server-side rendering to a PNG
would get around that issue, save you a lot of work in obfuscating the
HTML itself, and also be harder to parse.

> two other factors in favour of ascii art
>
> 1) there are tons of ocr programs out there available for free.
> there are no ascii art recognition programs that i am aware of.

That's not because it's hard - it's because it's been totally pointless,
until now that is.
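
To give a feel for how little is involved, here's a rough sketch of such
a reader. It assumes the art sits on a fixed grid and that the font is
known in advance. The three-row FONT table, GLYPH_WIDTH and the method
names are all made up for illustration; a real reader would need the
actual glyph tables of whatever fonts the captcha uses, and would have
to cope with variable-width letters.

```ruby
# Hypothetical 3x3 font, invented for this sketch.
FONT = {
  "A" => [" _ ", "|_|", "| |"],
  "C" => [" _ ", "|  ", "|_ "],
  "L" => ["   ", "|  ", "|_ "],
  "O" => [" _ ", "| |", "|_|"],
}.freeze

GLYPH_WIDTH = 3

# Count the character positions at which two glyphs differ.
def glyph_distance(a, b)
  a.join.chars.zip(b.join.chars).count { |x, y| x != y }
end

# Cut the art into fixed-width cells and pick the nearest known letter
# for each cell, so a few missing characters are tolerated.
def read_ascii_art(art, font = FONT, width = GLYPH_WIDTH)
  rows = art.lines.map(&:chomp)
  return "" if rows.empty?
  cells = rows.map(&:length).max.fdiv(width).ceil
  (0...cells).map do |i|
    cell = rows.map { |row| (row[i * width, width] || "").ljust(width) }
    font.min_by { |_letter, glyph| glyph_distance(cell, glyph) }.first
  end.join
end

art = [" _  _     _ ",
       "|  | ||  |_|",
       "|_ |_||_ | |"].join("\n")
puts read_ascii_art(art)   # prints COLA
```

Picking the nearest glyph rather than demanding an exact match is what
lets it survive the odd character lost when the noise is stripped out.
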
If spammers start using ASCII art text, then there's an incentive to
make a reader. On the other hand, any E-mail which contains something
that looks like ASCII art could probably be classified as spam on that
basis alone.

ASCII art is, I believe, much more suited to machine reading than a
scanned printout. Most importantly, the characters will be on an exact
horizontal/vertical grid alignment, not rotated by a few degrees. And
also I suspect there will probably only be a handful of legible ASCII
art character sets to choose from.

Anyway, time will tell. If your captcha isn't widely used, then it may
remain strong enough for a reasonable time. (That's apart from the usual
attacks on captchas, such as redirecting them to other humans who are in
search of porn :-)

Regards,

Brian.