On Jul 20, 2004, at 10:07 AM, Greg Millam wrote:

> Robert Oschler wrote:
>
>> Because of the Google Page Rank land grab, there are web sites running
>> scripts to deface popular Wikis with links to their site. For a 
>> dramatic
>> example look at the Revision history page for the Ruby Garden Wiki:
>> http://www.rubygarden.org/ruby?RecentChanges
>> The problem is that even though we diligently delete the spam as it
>> shows up,
>> most Wikis archive the old revisions in a revision list. Google (you) 
>> crawls
>> these revision list pages and finds the deleted spam links. In fact, 
>> you
>> find a lot of them because the spammers keep coming back and we keep
>> deleting them, creating lots of revision history pages that you crawl.
>> Here's a VERY SIMPLE way for you to help out the thousands of Wikis 
>> out
>> there.
>
> robots.txt ?
>
> Google adheres to that very strongly, and I notice there's no
> http://www.rubygarden.org/robots.txt
>
> http://www.google.com/webmasters/faq.html#norobots

Glancing at the specs, it seems that the PageRank benefit someone gets 
from posting external links could be removed by a combination of wise 
robots.txt settings and a redirect page for external links. Or, one 
could use the meta tag that does the same thing:

<META NAME="ROBOTS" CONTENT="NOFOLLOW">

This should keep any compliant search engine (including Google) from 
following the links on a page, which should deny spammers the PageRank 
they're after.
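For the revision-history problem specifically, a robots.txt can also 
keep crawlers out of the old-revision pages entirely. A minimal sketch 
for a UseMod-style wiki like Ruby Garden follows; the URL patterns are 
guesses, so adjust them to whatever action/query strings your wiki 
engine actually generates (robots.txt rules are simple prefix matches, 
and most crawlers, including Google, apply them to the path plus query 
string):

```
User-agent: *
# Block history/diff/edit views (anything with an action= query),
# where deleted spam links live on -- patterns are assumptions
Disallow: /ruby?action=
Disallow: /ruby?RecentChanges
```

Plain page views like /ruby?SomePage stay crawlable, so legitimate 
content is still indexed.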

If, however, some external links should be respected, there's the 
redirect trick: external links go to a page which redirects to the 
target. That way, certain URLs (links to rubycentral, ruby-lang, etc.) 
can be followed, while links to unknown sites are filtered out by 
placing the meta tags correctly.
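The link-rendering side of that trick could be sketched like this. 
Everything here is an assumption for illustration: the trust list, the 
/redirect path, and the helper names are made up, and the /redirect 
page itself would carry the NOFOLLOW meta tag so crawlers never credit 
unknown destinations.

```ruby
require 'uri'
require 'cgi'

# Hosts whose outbound links crawlers may follow directly.
# (Example list only; fill in whatever sites you trust.)
ALLOWED_HOSTS = %w[www.ruby-lang.org www.rubycentral.com]

# True if the URL points at a host on the trust list.
def trusted?(url)
  host = URI.parse(url).host
  !host.nil? && ALLOWED_HOSTS.include?(host)
rescue URI::InvalidURIError
  false
end

# Render a wiki link: trusted sites get a direct, crawlable link;
# everything else goes through the hypothetical /redirect page,
# which carries <META NAME="ROBOTS" CONTENT="NOFOLLOW">.
def render_link(url, text)
  if trusted?(url)
    %(<a href="#{url}">#{text}</a>)
  else
    %(<a href="/redirect?url=#{CGI.escape(url)}">#{text}</a>)
  end
end
```

Spammed links to unknown sites then only ever appear behind the 
no-follow redirect page, so they earn no PageRank even before anyone 
deletes them.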

cheers,
Mark