How to help the search engines to help you

How many pages have you got on your website, and is there a straightforward path to all of them? You probably don’t know, and if you don’t, how do you expect Google to find out? I’ve studied web “crawlers” or “spiders” like Google’s, which try to map out everything on a site just by following links, and it can be a much harder process than you’d think. Even compiling a list of distinct pages can be challenging. For one thing, it’s quite likely that your site refers to the same page by several different page addresses (“URLs”). It’s not hard to see how pages on your site can get missed, or found only infrequently. It’s up to you to help the search engines, and the way to do this is with a “sitemap” behind the scenes.

A sitemap is nothing more than a list of the pages on your website. It should update itself automatically. If your website has been built responsibly, it will already have one, and every time you add or remove pages from the site, the sitemap should be invisibly amended accordingly. However, not all website developers do their jobs properly, and there are millions of websites out there without sitemaps.
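To make “a list of pages” concrete, here is a minimal sketch in Python of what a sitemap file contains. It assumes the standard sitemaps.org XML format; the function name `build_sitemap` and the example.com addresses are made up for illustration, and a real site would normally let its content management system or a plugin generate this file rather than a hand-written script.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML (as a string) listing each address once."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in sorted(set(urls)):  # drop duplicate addresses
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical page addresses, for illustration only.
pages = [
    "https://www.example.com/",
    "https://www.example.com/about",
    "https://www.example.com/",  # listed twice on purpose
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

The “self-updating” part simply means that something like this runs again whenever a page is added or removed, so the list never goes stale.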

Do you have one? Take a look. If your site has a sitemap, it will probably be called “sitemap.xml” and will be at the top level, e.g. www.[whatever].com/sitemap.xml. This file will be either the sitemap itself or an index of sitemaps, but either way, you should be able to click through and see whether the sitemap looks complete and up to date.
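If you would rather check this with a script than by eye, the following Python sketch tells the two cases apart. It assumes the file follows the sitemaps.org format, where an index has a `sitemapindex` root element and a plain sitemap has a `urlset` root; the function name `read_sitemap` and the sample addresses are hypothetical, and the sample text stands in for whatever your own sitemap.xml contains.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def read_sitemap(xml_text):
    """Return ("index" or "urlset", list of addresses found in <loc> tags)."""
    root = ET.fromstring(xml_text)
    kind = "index" if root.tag.endswith("sitemapindex") else "urlset"
    # Each child (<sitemap> or <url>) holds one <loc> with an address.
    locs = [loc.text.strip() for loc in root.findall("*/sm:loc", NS) if loc.text]
    return kind, locs


# Hypothetical sitemap index, standing in for a downloaded sitemap.xml.
sample_index = """<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://www.example.com/post-sitemap.xml</loc></sitemap>
  <sitemap><loc>https://www.example.com/page-sitemap.xml</loc></sitemap>
</sitemapindex>"""

kind, locs = read_sitemap(sample_index)
print(kind, locs)
```

For an index, each address returned is another sitemap file to open in turn; for a plain sitemap, the addresses are the pages themselves, so the length of the list is a quick test of whether it looks complete.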

If there’s nothing at that address, don’t despair (yet!), because the sitemap might be at another location. When this is the case, the location is usually given in the site’s “robots.txt” file, e.g. www.[whatever].com/robots.txt. If you have neither a sitemap.xml file nor a robots.txt file, you really need to get someone to fix both of those right away. If you have a robots.txt file but it doesn’t mention a sitemap, and there’s no sitemap.xml file either, then you probably don’t have a sitemap, and again, you really should get this fixed.
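The robots.txt check can be sketched in a few lines of Python as well. A robots.txt file is plain text, and a sitemap location appears on a line beginning with `Sitemap:`. The function name `sitemaps_from_robots` and the sample file below are hypothetical; paste in the contents of your own robots.txt to try it.

```python
def sitemaps_from_robots(robots_txt):
    """Return the addresses given on "Sitemap:" lines of a robots.txt file."""
    found = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # ignore comments
        # Split at the first colon only, so the "https://" in the value survives.
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "sitemap" and value.strip():
            found.append(value.strip())
    return found


# Hypothetical robots.txt contents, for illustration only.
sample_robots = """User-agent: *
Disallow: /wp-admin/
Sitemap: https://www.example.com/sitemap_index.xml
"""

print(sitemaps_from_robots(sample_robots))
```

An empty list means the file does not mention a sitemap at all, which is the situation described above.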

As an example, on our site, here’s the robots.txt file, pointing the search engines to the sitemap:

[Screenshot: the robots.txt file]

…and here’s the sitemap, which is an index…

[Screenshot: the sitemap index]

…which leads to the real sitemaps, like this:

[Screenshot: one of the sitemaps listed in the index]

If you can’t find an XML sitemap on your website, check with your website developer whether this is really the case. If there isn’t a sitemap, get them to install one (and make sure it updates itself automatically).
