Wednesday, 09 February 2011

Google's "Great Spam" Quest

How to Stop Spam

Google is working on ways to rid its search results of "content farms"—sites that create many pages of very cheap content crafted to appear high up in Google's results. Speaking this week at Farsight 2011, a one-day event in San Francisco on the future of search, the firm's principal search engineer, Matt Cutts, said that Google is considering tweaks to the algorithms that guide its search results. Cutts announced last week that Google's algorithms had been altered to penalize sites that copied content from other sites as a way of climbing higher in search rankings. 

Startup search company Blekko uses a different approach: yesterday it announced that it had excluded 20 "spam" sites from its index entirely, based on which pages its users had marked as spam when they appeared in search results. The 20 sites include many often described as content farms, including Demand Media's eHow site. 

Harry Shum, who leads development on Microsoft's search engine, Bing, also appeared at the event and agreed that search companies need new approaches. Google came into the search world with a "we can't be spammed" battle cry and introduced the search engine optimization (SEO) world to PageRank. 

PageRank (an eigenvector centrality measure, to be precise) gives web pages a high score if they receive links from many other pages, but does so in a way that the credit received for a link is higher if it comes from a page that is already highly ranked. 
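As a rough illustration of the idea described above, here is a minimal sketch of PageRank computed by power iteration. It is not Google's production code: the damping factor of 0.85, the iteration count, the handling of pages with no outgoing links, and the tiny example graph with its page names are all assumptions made for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy PageRank by power iteration.

    links maps each page to the list of pages it links to.
    damping and iterations are illustrative defaults, not known settings.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with the score spread evenly over all pages.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share on each round.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current score among the pages it links to,
                # so a link from a high-scoring page is worth more.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical graph: "hub" is linked from three pages, "d" from none.
# "target" and "other" each receive exactly one link.
links = {
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub"],
    "hub": ["target"],
    "d": ["other"],
    "target": [],
    "other": [],
}
ranks = pagerank(links)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page:7s} {score:.3f}")
```

In this toy graph, "target" and "other" each have a single inbound link, but "target" ends up with the higher score because its link comes from the well-linked "hub" page.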

PageRank is "keyword independent": it depends on the link structure of the web rather than on the words in a query, which allows Google to calculate that part of the ranking score offline. As a result, since day one of Google on the web, it hasn't been unusual for end users (and more so SEOs) to find a highly ranked page in the search results, even when it's obviously irrelevant to the search topic. 

These are pages that are important in some context, yet not in the context of the specific search query. Co-citation can also skew results. It's beyond the scope of this article to answer why, but lists can seemingly force Google to provide a combined on-topic-off-topic results page. 
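To make the point about query-independent scores concrete, here is a toy sketch of how a precomputed authority score might be blended with a simple keyword match at query time. This is emphatically not Google's ranking formula: the linear blend, the `authority_weight` parameter, the page fields, and the example URLs are all hypothetical and exist only to show how a strong offline score can outweigh relevance.

```python
def rank_results(query, pages, authority_weight=0.7):
    """Order pages by a blend of offline authority and query-time relevance.

    Hypothetical scoring: authority is a precomputed, query-independent
    number in [0, 1]; relevance is the fraction of query terms found
    in the page text.
    """
    terms = set(query.lower().split())
    scored = []
    for page in pages:
        words = set(page["text"].lower().split())
        relevance = len(terms & words) / len(terms) if terms else 0.0
        score = (authority_weight * page["authority"]
                 + (1.0 - authority_weight) * relevance)
        scored.append((score, page["url"]))
    # Highest combined score first.
    scored.sort(reverse=True)
    return [url for _, url in scored]


# Two made-up pages: one popular but off-topic, one obscure but on-topic.
pages = [
    {"url": "popular.example/home",
     "text": "news sports weather",
     "authority": 0.9},
    {"url": "small.example/spam-filter",
     "text": "how to stop email spam",
     "authority": 0.1},
]

results = rank_results("stop spam", pages)
print(results)
```

With the assumed weight of 0.7, the off-topic but heavily linked page comes first. Lowering `authority_weight` to 0.2 puts the on-topic page on top, which shows how much the outcome in this toy model depends on the weighting.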

Matt Cutts from Google, Harry Shum from Bing, and Rich Skrenta from Blekko spoke on a panel today at the Farsight Summit. Much of the conversation was around the Bing/Google results copying ordeal, but part of the conversation was about search quality in general, and the impact content farms are having on it. 

Blekko announced this morning that it has banned eHow and other content farms from its results.
Cutts said that when Google's manual team finds spam, it also ejects the site from AdSense. He added that people tend to put the blame on AdSense, but that even if AdSense disappeared, there would still be spam. 

When asked what incentive Google would have to remove content from AdSense-driven pages that drive billions of dollars for the company, Cutts said only that Google has always taken the philosophy that it cares more about the long-term loyalty of users. 
By How to Stop Spam
