Several readers have contacted me in the last two days asking about a warning email they’ve received from Google, saying that “Googlebot cannot access CSS and JS files” on their website. The words which have made them sit up and take notice, understandably, are “blocking access to these assets can result in suboptimal rankings”. That sounds worrying, so here’s what to do.
Firstly, be aware that the message comes from Google Search Console (formerly Webmaster Tools). You won’t have received it if your site isn’t set up in Google Search Console. Your site may still have the problem; you just won’t know about it. The solution is to get your site set up there, as I’ve recommended on many occasions.
For those of you who do have Google Search Console set up, I’d check for the problem even if you didn’t get the message. Sign in to Google Search Console, and navigate to messages. If you see the message below, you need to read on! Otherwise, have a good weekend.
OK, what does this message mean?
Well, Google might only crawl a certain amount of your site each time it comes round. So if there are files on your site which you don’t need in the Google index (e.g. scripts, style files and the like), you don’t want Google wasting its time going through them. Fortunately, there’s always been a way of keeping Google where you want it, called the robots.txt file.
With this, you list any parts of your site, or types of file, that you don’t want Google to look at. Being well behaved, Google always looks at robots.txt as soon as it arrives on your site, and does as it’s told. However, there’s a bit of a problem. Nowadays Google is rather sophisticated, and doesn’t just want the text from your pages – it wants to know what the page looks like. It needs to see everything. If you’re blocking things like “CSS files” and “JS files”, it can’t do so. In fact, Google Search Console goes as far as letting you see what Google sees. What you want is for Googlebot to see the same as a real visitor. If you’re blocking certain files, it may not, as in this example:
In the past, steering Google to the real pages on your site (and away from all the script and style files) was a good idea. That’s why millions of people have a robots.txt file, set up with the best of intentions by a website manager, which does just that. But nowadays, this strategy is not what Google would call “optimal”. Google wants to see everything, and it’s best to just let it do so.
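To make that concrete, here’s a minimal sketch in Python, using the standard library’s urllib.robotparser, of how an old-style robots.txt keeps a crawler away from scripts. The robots.txt content and the example.com addresses are made up for illustration. Python’s parser does simple prefix matching and won’t reproduce everything Googlebot does, so treat it as a rough approximation only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt of the older "keep crawlers out of the scripts" kind.
OLD_STYLE_ROBOTS = """\
User-agent: *
Disallow: /wp-includes/
Disallow: /assets/js/
"""


def can_googlebot_fetch(robots_txt: str, url: str) -> bool:
    """Ask Python's parser whether a crawler named Googlebot may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", url)


# Made-up addresses on a made-up site.
page = "https://example.com/about/"
script = "https://example.com/wp-includes/js/jquery.js"

print(can_googlebot_fetch(OLD_STYLE_ROBOTS, page))    # True: the page is open
print(can_googlebot_fetch(OLD_STYLE_ROBOTS, script))  # False: the script is blocked
print(can_googlebot_fetch("", script))                # True: an empty file blocks nothing
```

The page itself is crawlable, but a script it depends on is not, which is exactly the situation the warning email is complaining about.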
Right, let’s see what your robots.txt file looks like.
This is easy enough. You can find it at [your domain]/robots.txt or in Google Search Console:
As you can see, in the robots.txt file for the site shown (which has had the warning message), there are indeed areas of the site which have been blocked off (or ‘disallowed’). And therein lies the problem.
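If you’d rather check from a script than by eye, something like the following Python sketch would work out where a site’s robots.txt lives and list what it disallows. The function names, the sample file and the example.com address are all my own inventions for illustration. The download helper is defined but not called here, so the example runs without touching the network.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt address for the site that page_url belongs to."""
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def disallowed_paths(robots_txt: str) -> list[str]:
    """List the path given on every non-empty Disallow line."""
    paths = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0]  # drop any trailing comment
        directive, _, value = line.partition(":")
        if directive.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


def fetch_robots_txt(page_url: str) -> str:
    """Download a site's robots.txt (needs network access; not called below)."""
    with urlopen(robots_txt_url(page_url), timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


# Hypothetical file contents, standing in for a real download.
SAMPLE = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
"""

print(robots_txt_url("https://www.example.com/blog/some-post/"))
# https://www.example.com/robots.txt
print(disallowed_paths(SAMPLE))
# ['/wp-admin/', '/wp-includes/']
```

To run it against a real site, you would pass the result of fetch_robots_txt to disallowed_paths instead of the sample text.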
Now, the site above has been created in WordPress, and the robots.txt file may have been created (or be managed) by a WordPress plugin. Your content management system may be in control of your robots.txt too. On most sites, however, it’s just a text file which is uploaded manually. So you may be able to edit the file through the content management system, but most likely you’ll have to do it by accessing it directly.
And what should you do, to solve the problem? The general consensus, from people who know a lot more than me, is that little if anything needs to be “disallowed” for Googlebot now. Unless you know of a good reason to keep them, just delete all those “disallow” lines. If nothing else, delete any which specifically refer to .js or .css files. This may leave you with an empty file, but don’t worry, that’s fine. As Google (and other crawlers) do look for a robots.txt file, I like to present one, even if it’s blank.
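For the “if nothing else” option, here’s a rough Python sketch of the edit, assuming your file uses wildcard lines like the ones in the made-up sample. It drops only the “disallow” lines which mention .js or .css and leaves everything else as it was. The function name and the sample contents are hypothetical, and it’s worth reading the result yourself before uploading anything.

```python
import re

# Matches ".css" or ".js" when not followed by another letter or digit,
# so ".json" is left alone.
ASSET_PATTERN = re.compile(r"\.(?:css|js)(?![a-z0-9])")


def drop_asset_disallows(robots_txt: str) -> str:
    """Remove Disallow lines whose path mentions .css or .js; keep the rest."""
    kept = []
    for line in robots_txt.splitlines():
        directive, _, value = line.partition(":")
        is_disallow = directive.strip().lower() == "disallow"
        path = value.split("#", 1)[0].strip().lower()  # ignore trailing comments
        if is_disallow and ASSET_PATTERN.search(path):
            continue  # this is one of the lines the warning is about
        kept.append(line)
    return "\n".join(kept) + "\n"


# Hypothetical robots.txt contents.
SAMPLE = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /*.css$
Disallow: /*.js$
"""

print(drop_asset_disallows(SAMPLE), end="")
# User-agent: *
# Disallow: /wp-admin/
```

Note that this won’t catch a line which blocks a whole folder of scripts or styles (a theme or includes directory, say), because such a line never mentions .js or .css by name. Those are the lines you’d have to judge for yourself.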
That’s it. Once your robots.txt file has been modified, check it with the robots.txt Tester tool in Google Search Console. Then go back to the “Fetch As Google” tool, click “Fetch and Render”, and once it’s done its stuff, click the arrow on the right, and you should see two identical screenshots, rather than the differing versions above. You can feel happy, and Googlebot will be happy too.