Encouraging Google to index our pages

It’d be a surprise if Google ever indexed every page on our websites, and it’s even less likely that every indexed page will appear in the search results. But we can take steps to increase our coverage.

To ensure the important pages are all crawled, make sure they have short paths from the home page and are featured prominently in the sitemap. To really get technical, some people even analyse their own server logs to see which pages Google visits.
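
For anyone curious about the server log approach, here is a minimal sketch in Python. It assumes the logs are in the common ‘combined’ access log format and simply counts requests whose user agent mentions Googlebot; the function name, the pattern and the sample lines are all made up for illustration. Bear in mind that a user agent string can be faked, so this gives a rough picture rather than proof of a genuine Google visit.

```python
import re
from collections import Counter

# Assumption: "combined" access log format. Other formats need a different pattern.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_hits(lines):
    """Count requests per path where the user agent mentions Googlebot."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and "Googlebot" in match.group("agent"):
            hits[match.group("path")] += 1
    return hits


# Invented sample lines, for illustration only.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
BROWSER_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
sample_lines = [
    f'66.249.66.1 - - [10/Oct/2023:13:55:36 +0000] "GET /products/ HTTP/1.1" 200 5120 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [10/Oct/2023:14:02:11 +0000] "GET /products/ HTTP/1.1" 200 5120 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [10/Oct/2023:14:05:40 +0000] "GET /about/ HTTP/1.1" 200 2048 "-" "{GOOGLEBOT_UA}"',
    f'203.0.113.7 - - [10/Oct/2023:14:06:02 +0000] "GET /products/ HTTP/1.1" 200 5120 "-" "{BROWSER_UA}"',
]

# List the most-visited paths first.
for path, count in googlebot_hits(sample_lines).most_common():
    print(count, path)
```

Comparing the paths that turn up here with the list of pages we consider important is a quick way to spot any that Google rarely or never visits.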

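Conversely, if there are pages that needn’t be indexed (perhaps because they’re low-content e-commerce or data pages), they should be kept off the sitemap. The ‘noindex’ tag can be helpful here, as can robots.txt, though the two work differently: ‘noindex’ keeps a page out of the index, whereas robots.txt stops it being crawled, and Google can only see a ‘noindex’ tag on a page it’s allowed to crawl. Don’t be tempted to have ‘orphan’ (unlinked) pages though, as these can lead to search engines developing a confused map of the site.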
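
As a rough illustration, a page asks to be left out of the index with a tag such as `<meta name="robots" content="noindex">` in its head, while robots.txt lists paths that crawlers shouldn’t fetch. One check that’s easy to automate is whether any URL in the sitemap is also blocked by robots.txt, since that sends mixed signals. The sketch below uses Python’s standard `urllib.robotparser`; the robots.txt rules, the URLs and the function name `blocked_sitemap_urls` are invented for the example.

```python
from urllib.robotparser import RobotFileParser

# Invented robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /data/
"""

# Invented sitemap URLs, for illustration only.
SITEMAP_URLS = [
    "https://example.com/",
    "https://example.com/products/",
    "https://example.com/data/table-1",
]


def blocked_sitemap_urls(robots_txt, urls, agent="Googlebot"):
    """Return the URLs that the given robots.txt rules stop this agent from crawling."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(agent, url)]


print(blocked_sitemap_urls(ROBOTS_TXT, SITEMAP_URLS))
```

Anything this returns is worth a second look: either the page belongs in the sitemap and shouldn’t be blocked, or it should come off the sitemap.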

If possible, run a site crawl that reports each page’s canonical tag (the address we’re telling Google is the master copy of a page). Some content management systems can make a mess of these.
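
A full crawler isn’t needed to spot-check a handful of pages. The sketch below, using only Python’s standard `html.parser`, pulls the canonical address out of a page’s HTML so it can be compared with the address we expected. The class and function names and the sample HTML are hypothetical, and fetching the page itself is left out.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> tag in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel can hold several space-separated values, so split before checking.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values and attributes.get("href"):
            self.canonicals.append(attributes["href"])


def find_canonicals(html):
    """Return all canonical URLs declared in the given HTML."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


# Invented page source, for illustration only.
SAMPLE_HTML = """
<html><head>
<title>Red shoes</title>
<link rel="stylesheet" href="/style.css">
<link rel="canonical" href="https://example.com/shoes/red/">
</head><body></body></html>
"""

print(find_canonicals(SAMPLE_HTML))
```

A page that returns no canonical, more than one, or one pointing somewhere unexpected is a good candidate for closer inspection.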

And finally, an odd (but seemingly genuine) suggestion that I’ve read is to avoid hinting that any page might be in ‘soft 404’ status; apparently, this can even include having the digits 404 in the page URL! Search Console will list soft 404 errors in the site’s Index Coverage report.