This one’s a bit technical

If any of you are really interested in today’s subject, perhaps you should go and get a job in internet consultancy rather than industrial marketing. But stick with it, because there’s a lesson to learn in fairly plain English at the end.

The hot discussion topic in the world of search engine optimisation at the moment is Latent Dirichlet Allocation and “topic modeling”. Stop glazing over already, some of us have to understand all this stuff. If you want to know more, the main article to read is “Latent Dirichlet Allocation (LDA) and Google’s Rankings are Remarkably Well Correlated” on SEOmoz. The work attempts to explain how search engines can come up with relevant pages even when the search term is ambiguous, or when the pages would be good results but don’t have the search term prominently within them. It’s terrific stuff.

As I understand it, if this concept does indeed influence how the search engines work, then we need to ensure that our pages not only contain the actual words of their key search terms, but also a lot of other words which the search engines understand as being related to those terms. Now, a fairly extensive single-topic page will contain those related words as a natural consequence of covering its subject properly, so it would make sense for the search engines to use this technique.

As the article says: “If we want to rank well for ‘the rolling stones’ it’s probably a really good idea to use words like ‘Mick Jagger,’ ‘Keith Richards,’ and ‘tour dates.’ It’s also probably not super smart to use words like ‘rubies,’ ‘emeralds,’ ‘gemstones,’ or the phrase ‘gathers no moss,’ as these might confuse search engines (and visitors) as to the topic we’re covering.”

That’s a lot to consider when fine-tuning a page, but I suspect it could well be worth the effort.
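
For anyone who wants to try this on their own pages, here is a rough Python sketch of the idea. To be clear, this is not LDA and not anything a search engine is known to run; it is just a simple check of which related phrases a page mentions and which off-topic ones have crept in. The function name `term_coverage`, the sample page text and the two word lists (borrowed from the quote above) are my own assumptions for illustration.

```python
import re


def term_coverage(page_text, related_terms, off_topic_terms):
    """Report which related and off-topic phrases appear in a page.

    A plain whole-phrase, case-insensitive match. This is a hand-made
    checklist, not a topic model.
    """
    text = page_text.lower()

    def present(term):
        # Match the phrase on word boundaries so "ruby" would not match "rubyists".
        pattern = r"\b" + re.escape(term.lower()) + r"\b"
        return re.search(pattern, text) is not None

    return {
        "found": [t for t in related_terms if present(t)],
        "missing": [t for t in related_terms if not present(t)],
        "distracting": [t for t in off_topic_terms if present(t)],
    }


# Hypothetical page copy, invented for this example.
page = (
    "The Rolling Stones have announced new tour dates, "
    "and Mick Jagger says the band is ready."
)

report = term_coverage(
    page,
    related_terms=["Mick Jagger", "Keith Richards", "tour dates"],
    off_topic_terms=["rubies", "emeralds", "gemstones", "gathers no moss"],
)
print(report)
```

The hard part in practice is deciding what goes in the related list. Here you would have to draw it up by hand from your own knowledge of the subject, whereas the topic-modeling work suggests the search engines may infer those relationships from large amounts of text.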


  1. Sue Malleson

    Fascinating! Thanks Chris for flagging this up. What I think is really interesting is that it would seem that, just as in the dark black-hat SEO days of the late 1990s, the best policy is honesty, not strategic manipulation. If content is rich, to the point, and full of relevant words and phrases, the chances are you’ll do OK because it will fit the LDA theory.

    I’ve always taken the view that the best policy is honesty. For every move to gain unfair advantage you run the risk of being identified as a cheat and penalised accordingly. Search Engines rule OK – and I don’t see that changing any time soon.

