What is a sitemap and do I need one?

Does your website have a sitemap? It certainly should do. And if it does, is the sitemap automatically updated?

A sitemap is a list of the pages on your site which search engines can use to discover what you’ve got on the site and how it’s organised. It’s one of the first things a search engine’s web crawler looks for when it arrives at your site. A sitemap can also offer metadata about each page it lists, such as when the page was last updated, how often it is likely to change, and how important it is relative to the other pages on your site. The sitemap can also give search engines helpful detail about videos and images.
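To make that concrete, here is a minimal Python sketch of what a single sitemap entry looks like once it’s written out as XML. The tag names (loc, lastmod, changefreq, priority) come from the standard sitemap format; the example.com address and the values are made up purely for illustration, and build_sitemap is our own hypothetical helper rather than part of any plugin.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the standard sitemap format.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML (as a string) for a list of entry dicts."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        # <loc> is the only required child; the others are optional hints.
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    return ET.tostring(urlset, encoding="unicode")


# Illustrative values only (not taken from a real site).
example = build_sitemap([
    {
        "loc": "https://www.example.com/about-us/",
        "lastmod": "2024-01-15",
        "changefreq": "monthly",
        "priority": 0.8,
    }
])
print(example)
```

Search engines treat the optional values as hints rather than instructions, so the page address is the part that really matters.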

If you’re thinking: “I’ve no idea if my site has one of these”, the chances are – if you use a content management system – that it’s all taken care of. The location of your sitemap is normally held in another file called ‘robots.txt’, which is another critical part of any website. The robots.txt file is found at the top level of the domain; for example, you’ll find ours at www.bmon.co.uk/robots.txt.

From here, just like the search engine crawlers, you’ll find the location of your sitemap. You can see that ours is at www.bmon.co.uk/sitemap_index.xml.
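If you’d rather not read the file by eye, the lookup can be sketched in a few lines of Python. This is a simplified illustration which assumes you’ve already fetched the text of robots.txt; the sample contents below are invented for the example and are not a copy of our own file, and find_sitemaps is a hypothetical helper name.

```python
def find_sitemaps(robots_txt):
    """Return the sitemap URLs declared in the text of a robots.txt file."""
    sitemaps = []
    for line in robots_txt.splitlines():
        # Drop any trailing comment, then split "Field: value" on the first
        # colon only, because the URL itself contains colons.
        line = line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "sitemap":
            sitemaps.append(value.strip())
    return sitemaps


# Made-up robots.txt contents for illustration.
sample = """\
User-agent: *
Disallow: /wp-admin/

Sitemap: https://www.example.com/sitemap_index.xml
"""
print(find_sitemaps(sample))
```

Python’s standard library can do the fetching and parsing for you as well: from version 3.8, urllib.robotparser.RobotFileParser has a site_maps() method which returns the same list.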

Take a look at our sitemap, and you’ll see it’s actually a list of smaller sitemaps, known as a sitemap index. This is not uncommon, and indeed, if your site has more than 50,000 pages, it’s essential: a single sitemap file can list no more than 50,000 URLs and must be no larger than 50MB uncompressed. The system we use – a WordPress plugin – breaks up the site’s ‘posts’ into sitemaps of 1,000 each. It also creates a separate sitemap for the site’s ‘pages’ (it’s a WordPress thing) and for its post ‘categories’, which helps the search engines understand how the site is set out.
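The splitting itself is simple enough to sketch. The Python below is not how our plugin does it internally (we haven’t looked); it just illustrates the idea of cutting a long list of addresses into groups of 1,000 and writing an index which points at each group. The example.com addresses and the post-sitemapN.xml file names are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the standard sitemap format.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def chunk(urls, size=1000):
    """Split a list of URLs into consecutive lists of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_index(base_url, chunk_count):
    """Return sitemap index XML pointing at one child sitemap per chunk."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for n in range(1, chunk_count + 1):
        sitemap = ET.SubElement(index, "sitemap")
        # Hypothetical file naming; a real plugin chooses its own.
        ET.SubElement(sitemap, "loc").text = f"{base_url}/post-sitemap{n}.xml"
    return ET.tostring(index, encoding="unicode")


# 2,500 made-up post addresses end up in three child sitemaps.
posts = [f"https://www.example.com/post-{i}/" for i in range(2500)]
chunks = chunk(posts)
print([len(c) for c in chunks])  # [1000, 1000, 500]

index_xml = build_index("https://www.example.com", len(chunks))
print(index_xml)
```

Each child sitemap would then be written out as an ordinary sitemap containing just its own group of addresses.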

This is not the only way to do things: a list of pages displayed as part of a normal page on the site (often called an HTML sitemap) is another approach, but it’s not the format which the search engines look for. The standard ‘XML’ sitemap that we use can also be created manually, but it’s a real chore, and is only necessary if your website is still a collection of manually created documents without any management system behind it.

If your sitemap has its location in the robots.txt file, you don’t need to do anything else. If you like, you can also submit its location to Google directly by using Google Search Console (look for ‘Sitemaps’ in the left-hand menu).

Sitemaps should use your preferred (and final) URL format. For example, a typical entry in our sitemap file will include ‘https’ (because we’ve changed to that); ‘www’ (because we prefer to include that); and a trailing slash (because our system adds one if it’s missing). Our site automatically sorts things out: if you visit http://bmon.co.uk/about-us, you’re seamlessly sent to https://www.bmon.co.uk/about-us/. However, the sitemap should only have the latter entry in it. If you use canonical tags, the addresses in the sitemap should match the canonical ones exactly.
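If you want to check a sitemap against that rule, something like the following Python sketch would do it. It assumes the same preferences as our own site (https, the www host, and a trailing slash); preferred_url and non_preferred are hypothetical helper names, and the trailing-slash rule is deliberately simplistic, so it would need adjusting for addresses which point at files such as PDFs.

```python
from urllib.parse import urlsplit, urlunsplit


def preferred_url(url, host="www.bmon.co.uk"):
    """Rewrite a URL into the preferred form: https, fixed host, trailing slash."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if not path.endswith("/"):
        path += "/"
    # Keep any query string; drop the fragment, which has no place in a sitemap.
    return urlunsplit(("https", host, path, parts.query, ""))


def non_preferred(urls):
    """Return the URLs which are not already in the preferred form."""
    return [u for u in urls if u != preferred_url(u)]


print(preferred_url("http://bmon.co.uk/about-us"))
# https://www.bmon.co.uk/about-us/

print(non_preferred([
    "https://www.bmon.co.uk/about-us/",
    "http://bmon.co.uk/about-us",
]))
# ['http://bmon.co.uk/about-us']
```

Anything which turns up in the second list is an address that should be replaced in the sitemap with its final form.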

If your site uses a content management system but has no sitemap, it’s probable that you just haven’t switched it on. Do so! If it doesn’t have one because the site is just a few documents you’ve uploaded, think about creating one. Alternatively, regular email readers can contact us and we’ll produce a starter one for you. There’s no charge, but you’ll need to maintain it to account for site changes in the future!