25 billion web spam pages every day

Maintaining a search engine seems to be as much about keeping stuff out as putting stuff in order. In a recent post called ‘Why keeping spam out of Search is so important’, Google says: “If you’ve ever gone into your spam folder in Gmail, that’s akin to what Search results would be like without our spam detection capabilities.” It’s a great article, and well worth the read.

The company’s latest annual ‘Webspam report’ reveals that Google’s webspam team discovers 25 billion spam pages every day, all of which have to be filtered out of the index. Google also received nearly 230,000 reports of search spam in 2019, and took action on 82% of them. That’s impressive, but it can only be the tiniest fraction of what the automated detection must be doing.

The company says it has made progress in fighting sites with auto-generated and copied content. These sites typically also include elements such as fake buttons, large numbers of ads, suspicious redirects and malware. Google was able to reduce the impact of this type of spam by more than 60% in 2019.

There are almost certainly websites out there which copy or otherwise use your original content. Fortunately, if the search engines continue to keep up, few people should ever find them.