Thursday, February 08, 2007

Google’s -950 Penalty

Tedster from Webmasterworld.com has posted a very interesting theory about the -950 penalty and its direct correlation with a seemingly coincidental patent that was recently released. I agree with his logic that this particular patent could have a unique effect on the rankings.

However, if you read through this patent, it seems to me that it was designed to detect and penalize content produced by a content generation system. The sole purpose of these content generators is to mix up content and make it seem themed and relevant to search engines.

The most interesting example of this is the following: “An example of the cluster bit vectors are as follows, using the above phrases:”

                 Bill Clinton  President  Monica Lewinsky  purse designer  Cluster ID
Bill Clinton     1             1          1                0               14
President        1             1          0                0               12
Monica Lewinsky  1             0          1                1               11
purse designer   0             0          1                1               3
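To make the quoted example a little more concrete: read as a table of four phrases (Bill Clinton, President, Monica Lewinsky, purse designer), each row is a string of 1s and 0s marking which of the other phrases it relates to, and the cluster IDs in the quote happen to match each row read as a binary number. The sketch below only checks that arithmetic. The names PHRASES, BIT_VECTORS and cluster_id are mine, and I am assuming the leftmost bit is the most significant; this is not code from the patent.

```python
# Phrases from the quoted patent example, in column order.
PHRASES = ["Bill Clinton", "President", "Monica Lewinsky", "purse designer"]

# Each row marks which phrases are related to the row's phrase (1 = related).
BIT_VECTORS = {
    "Bill Clinton":    [1, 1, 1, 0],
    "President":       [1, 1, 0, 0],
    "Monica Lewinsky": [1, 0, 1, 1],
    "purse designer":  [0, 0, 1, 1],
}

def cluster_id(bits):
    """Read a list of 0/1 flags as a binary number (assumed: leftmost bit first)."""
    value = 0
    for bit in bits:
        value = value * 2 + bit
    return value

# Map every phrase to the number its bit vector encodes.
CLUSTER_IDS = {phrase: cluster_id(bits) for phrase, bits in BIT_VECTORS.items()}

for phrase in PHRASES:
    print(phrase, BIT_VECTORS[phrase], CLUSTER_IDS[phrase])
```

The computed values line up with the 14, 12, 11 and 3 given in the quote, which suggests the ID is just a compact way of storing which phrases cluster together.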

The first thing I noticed reading this example is that the phrases look like they belong together, but in the order written they would obviously be generated rather than logical English. Google could take a sample like this and run it against other known examples of themed content. If a particular sample falls far beyond the normal threshold, it could easily trigger such a filter. The most interesting part is that spam usually targets the most competitive subjects, so the sample size is very large and the pattern is easy to target.
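Here is a rough sketch of what I mean by a threshold, purely as a guess at the idea and not anything taken from Google or the patent. It assumes you already have a count of related phrases per page; the function name is_outlier, the cutoff k and the baseline numbers are all made up for illustration.

```python
from statistics import mean, stdev

def is_outlier(sample_count, baseline_counts, k=3.0):
    """Flag a page whose related-phrase count is more than k standard
    deviations above the mean of the baseline pages (hypothetical rule)."""
    mu = mean(baseline_counts)
    sigma = stdev(baseline_counts)
    return sample_count > mu + k * sigma

# Made-up related-phrase counts for ordinary themed pages (illustrative only).
baseline = [8, 10, 12, 9, 11, 10, 13, 9]

print(is_outlier(12, baseline))  # a page close to the normal range
print(is_outlier(60, baseline))  # a page stuffed with related phrases
```

The more competitive the subject, the more baseline pages there are to measure, so the "normal" range would be easier to pin down.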

We know Google can pattern match snippets of content globally and deem them duplicates, so a filter of this nature could be fairly easy to build on top of that technology. It’s rather ingenious!
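For anyone wondering what snippet matching might look like in practice, here is a minimal sketch using word shingles and Jaccard similarity. This is a common textbook technique that I am swapping in for illustration; I am not claiming it is what Google actually runs. The function names and the shingle size of 3 are my own choices.

```python
def shingles(text, size=3):
    """Return the set of overlapping word sequences of the given size."""
    words = text.lower().split()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def jaccard(a, b, size=3):
    """Share of shingles two texts have in common (0.0 = none, 1.0 = identical sets)."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 0.0  # both texts are too short to produce any shingle
    return len(sa & sb) / len(sa | sb)

print(jaccard("the quick brown fox", "the quick brown dog"))
```

Two pages stitched together from the same source snippets would share many shingles and score high, even if the snippets were shuffled into a different order.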

1 Comments:

Blogger Richard said...

It occurred on one of my sites for 2 weeks, then left without me doing anything. Strange.

http://www.netwriting.co.uk/2007/07/13/overcoming-google-950-penalty/

12:18 PM  
