Helping machines understand things in their own way

It was about 15 years ago, at a presentation by someone from Google on their fledgling ‘translate’ service, that I first realised how the company was going to change the world through data analysis. Google Translate, he said, was not going to work by just looking up words in a dictionary. It would work out what phrases meant, and how they might be translated, by comparing documents in different languages, and… at that point it started to go over my head. However, it gave me an inkling of how the almost incomprehensibly clever people there think. Everything is about patterns and connections, and if you’ve got the biggest database in the world of what people are saying, that’s the way to do it.

So it shouldn’t be surprising that Google can understand ‘concepts’ nearly as well as it understands words. If there are interchangeable terms in your field of activity, Google probably knows it. If a dozen suppliers sell only products A, B and C, and another sells only A, B and D, there’s a good chance that C and D are the same thing. Reference articles which mention both will confirm this.
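For the technically curious, the supplier example can be sketched in a few lines of Python. This is a toy illustration only, not a description of how Google actually works: the function name `likely_synonyms`, the 0.8 threshold and the catalogue data are all made up for the purpose. The idea is that two products which are never sold side by side, but which always keep the same company, are probably the same thing.

```python
from itertools import combinations


def likely_synonyms(catalogues, threshold=0.8):
    """Return (product, product, score) for pairs that look interchangeable.

    Hypothetical helper: 'catalogues' is a list of sets, one per supplier.
    """
    # A product's "context" is every other product listed alongside it.
    contexts = {}
    for products in catalogues:
        for product in products:
            contexts.setdefault(product, set()).update(products - {product})

    pairs = []
    for a, b in combinations(sorted(contexts), 2):
        if b in contexts[a]:
            continue  # sold side by side, so probably different things
        union = contexts[a] | contexts[b]
        if not union:
            continue  # neither product has any context to compare
        # Jaccard similarity: shared context divided by combined context.
        score = len(contexts[a] & contexts[b]) / len(union)
        if score >= threshold:
            pairs.append((a, b, score))
    return pairs


# A dozen suppliers sell only A, B and C; one more sells only A, B and D.
catalogues = [{"A", "B", "C"}] * 12 + [{"A", "B", "D"}]
print(likely_synonyms(catalogues))
```

With this made-up data, C and D are the only pair flagged: they never appear in the same catalogue, yet both are always found next to A and B.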

This all helps Google become more and more like a human reader, which is its overriding aim. To ‘do well’ in Google, your text needs to be written in a conventionally structured, easy-to-read style that humans will appreciate: think quality newspaper article, not James Joyce novel or PhD thesis. It’s also another reason why substantial content will always beat a couple of sentences: it gives the data-crunching algorithms more chance to match patterns and understand things in their own way.