For the longest time, Google has expressed disdain for black-hat search engine optimization methods. These include keyword stuffing, deceptive redirects and all sorts of other underhanded tactics. Recently, though, the spotlight has turned toward so-called content farms: websites with large collections of articles that are supposedly of low quality and are designed specifically to earn revenue from on-page advertisements.

Crowdsourcing Quality Control

Google has traditionally been against search engine spam, and has regularly fine-tuned its algorithm to prevent spammers from ranking high in search engine results pages (SERPs). The mushrooming of content farms has probably caused headaches for Google engineers, given the community clamor against low-quality, mass-produced content littering the top search results. The massive amount of fresh content and traffic on content farms makes them a formidable force, though. For instance, Demand Studios supposedly generates about 5,700 new articles per day. AOL produces 1,700 daily, and Yahoo! Associated Content, 1,500.

"As 'pure webspam' has decreased over time, attention has shifted instead to 'content farms,' which are sites with shallow or low-quality content. In 2010, we launched two major algorithmic changes focused on low-quality sites. Nonetheless, we hear the feedback from the web loud and clear: people are asking for even stronger action on content farms and sites that consist primarily of spammy or low-quality content."

The search giant seems to have handed part of the responsibility to users by releasing a Chrome browser extension called Personal Blocklist. The extension lets a user identify domains to exclude from search results. This is essentially a crowdsourced initiative to curb content farming: the lists of blocked sites are also sent back to Google, which may use them to fine-tune its search algorithm based on the websites that users find spammy.

"We've been exploring different algorithms to detect content farms, which are sites with shallow or low-quality content. One of the signals we're exploring is explicit feedback from users. To that end, today we're launching an early, experimental Chrome extension so people can block sites from their web search results. If installed, the extension also sends blocked site information to Google, and we will study the resulting feedback and explore using it as a potential ranking signal for our search results."

What Constitutes Content Farming, Anyway?

Now, the questions that everyone seems to be asking are how to define a content farm, and why such sites are bad for the Web. Online communities do not seem to agree on either question. Proponents of Google's move to minimize content farms' role in search results argue that these sites are spammy because their articles are low in quality and lack originality.