
Putting Voice Recognition to Work

If 2019 was the year in which voice recognition systems went mainstream in the home, I wonder if 2020 will be the year when voice transcription really takes off in business. There are several services now jostling for position, some seeing themselves as offering a whole new way of working. The founder of one service I’ve been trying, Otter.ai, has described it as being “similar to Slack except that we’re focused on voice communication rather than text”, and that’s a lot more ambitious than just transcription.

If this is all new to you, I’d recommend giving straightforward transcription a go. Otter.ai has a generous free offering which takes just moments to set up; to test its basic speech-to-text capabilities for the first time, I simply played an MP3 file of a radio interview from my laptop and watched as it got to work. It’s really interesting to see it process words and then go back and revise them once an understanding of a complete phrase allows more accurate transcription. Other names to investigate, depending on what you want, include Reason8 and Trint, as well as various smartphone apps such as Google’s Recorder.

Of course it’s not perfect, but it’s definitely a massive time saver if you need to summarise a conversation or presentation, and just want something to get you started. If you need a perfectly accurate transcript, you may be able to correct the text as you go on a first listen back, which could save hours of work. As I’ve mentioned many times, all videos should be published with a transcript in the notes, and this technology is a real help in doing that.

But it’s the ‘whole new way of working’ that intrigues me. The idea is to record discussions and use the text transcriptions primarily as a searchable index. Imagine if the ‘recent calls’ screen on your phone were a list of recordings, so that you could type in a word or phrase and be taken immediately to the point in any of the conversations where that term was used. You may be able to work out where things could be going here.
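
For the technically curious, here is a minimal sketch in Python of how such a searchable index might work. It is an illustration under my own assumptions, not a description of how Otter.ai or any other service does it: the sample recordings, the timestamps and the function names (`tokenize`, `build_index`, `search`) are all hypothetical. Each recording is assumed to arrive as short transcribed segments with a start time, and the index simply maps every word to the segments that contain it.

```python
import re
from collections import defaultdict

# Hypothetical sample data: (recording name, start time in seconds, transcribed text).
SEGMENTS = [
    ("call-with-supplier", 0.0, "Thanks for joining, shall we start with the delivery dates?"),
    ("call-with-supplier", 42.5, "The new pricing takes effect in March."),
    ("team-meeting", 15.0, "We need to revisit the pricing before the launch."),
]


def tokenize(text):
    """Lower-case the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_index(segments):
    """Map each word to the positions of the segments in which it appears."""
    index = defaultdict(set)
    for position, (_, _, text) in enumerate(segments):
        for word in tokenize(text):
            index[word].add(position)
    return index


def search(query, segments, index):
    """Return (recording, start time) for every segment containing the query.

    A multi-word query must appear as a consecutive phrase within a segment.
    """
    words = tokenize(query)
    if not words:
        return []
    # Only segments containing every query word can possibly match.
    candidates = set.intersection(*(index.get(word, set()) for word in words))
    hits = []
    for position in sorted(candidates):
        recording, start, text = segments[position]
        tokens = tokenize(text)
        # Check that the words occur next to each other, in order.
        for i in range(len(tokens) - len(words) + 1):
            if tokens[i:i + len(words)] == words:
                hits.append((recording, start))
                break
    return hits


index = build_index(SEGMENTS)
for recording, start in search("pricing", SEGMENTS, index):
    print(f"{recording}: jump to {start} seconds")
```

Each result pairs a recording with a start time, which is exactly what a player would need in order to jump straight to the relevant moment. A phrase that runs across two segments would be missed by this sketch, so a real system would need something more careful.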