Amazon says that a cloud-connected speaker/microphone was at the top of the charts: “This holiday season was better than ever for the family of Echo products. The Echo Dot was the #1 selling Amazon Device this holiday season, and the best-selling product from any manufacturer in any category across all of Amazon, with millions sold.”

The Echo products are an ever-expanding family of inexpensive consumer electronics from Amazon, which connect to a cloud-based service called Alexa. The devices are always listening for spoken commands, and will respond through conversation, playing music, turning on/off lights and other connected gadgets, making phone calls, and even by showing videos.

While Amazon doesn’t release sales figures for its Echo products, it’s clear that consumers love them. In fact, Echo is about to hit the road, as BMW will integrate the Echo technology (and Alexa cloud service) into some cars beginning this year. Expect other automakers to follow.

Why the Echo – and Apple’s Siri and Google’s Home? Speech.

The traditional way of “talking” to computers has been through the keyboard, augmented with a mouse used to select commands or input areas. Computers initially responded only to typed instructions using a command-line interface (CLI); in the era of the Apple Macintosh and the first iterations of Microsoft Windows, this was replaced with windows, icons, menus, and pointing devices (WIMP). Some refer to the modern interface used on standard computers as a graphical user interface (GUI); embedded devices, such as network routers, might be controlled by either a GUI or a CLI.

Smartphones, tablets, and some computers (notably those running Windows) also include touchscreens. While touchscreens have been around for decades, it’s only in the past few years that they’ve gone mainstream. Even so, the primary way to input data is still a keyboard – even if it’s a “soft” keyboard implemented on a touchscreen, as on a smartphone.

Talk to me!

Enter speech. Sometimes it’s easier to talk, simply talk, to a device than to use a physical interface. Speech can be used for commands (“Alexa, turn up the thermostat” or “Hey Google, turn off the kitchen lights”) or for dictation.

Speech recognition is not easy for computers; in fact, it’s pretty difficult. However, improved microphones and powerful artificial-intelligence algorithms make speech recognition a lot easier. Helping the process: Cloud computing, which can throw nearly unlimited resources at speech recognition, including predictive analytics. Another helper: Constrained inputs, which means that when it comes to understanding commands, there are only so many words for the speech recognition system to decode. (Free-form dictation, like writing an essay using speech recognition, is a far harder problem.)
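
To see why constrained inputs help, here is a minimal Python sketch. It assumes the speech engine has already produced a (possibly garbled) text transcript, and it uses a tiny, made-up command list; the names COMMANDS and match_command are hypothetical and are not part of Alexa, Siri, or Google Home. With only a handful of valid commands, even a simple string-similarity check from Python's standard difflib module can snap a noisy transcript to the right one.

```python
import difflib

# Hypothetical command vocabulary for illustration only.
# A real assistant's grammar is far larger and is not a flat list.
COMMANDS = [
    "turn up the thermostat",
    "turn down the thermostat",
    "turn on the kitchen lights",
    "turn off the kitchen lights",
]


def match_command(transcript, cutoff=0.6):
    """Return the known command closest to a noisy transcript, or None.

    cutoff is the minimum similarity (0.0 to 1.0) a command must reach
    before we accept it; anything below that is treated as unrecognized.
    """
    candidates = difflib.get_close_matches(
        transcript.lower(), COMMANDS, n=1, cutoff=cutoff
    )
    return candidates[0] if candidates else None


if __name__ == "__main__":
    # A slightly misheard transcript still lands on a valid command.
    print(match_command("Turn up the thermostadt"))
    # Something outside the vocabulary is rejected rather than guessed.
    print(match_command("play some jazz"))
```

Free-form dictation has no such short list to fall back on, which is part of why it is the harder problem.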

It’s a big market

Speech recognition is only going to get better – and bigger. According to one report, “The speech and voice recognition market is expected to be valued at USD 6.19 billion in 2017 and is likely to reach USD 18.30 billion by 2023, at a CAGR of 19.80% between 2017 and 2023. The growing impact of artificial intelligence (AI) on the accuracy of speech and voice recognition and the increased demand for multifactor authentication are driving the market growth.” The report continues:

“The speech recognition technology is expected to hold the largest share of the market during the forecast period due to its growing use in multiple applications owing to the continuously decreasing word error rate (WER) of speech recognition algorithm with the developments in natural language processing and neural network technology. The speech recognition technology finds applications mainly across healthcare and consumer electronics sectors to produce health data records and develop intelligent virtual assistant devices, respectively.

“The market for the consumer vertical is expected to grow at the highest CAGR during the forecast period. The key factor contributing to this growth is the ability to integrate speech and voice recognition technologies into other consumer devices, such as refrigerators, ovens, mixers, and thermostats, with the growth of Internet of Things.”
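
For readers who want to check the report's arithmetic, the short Python sketch below recomputes the compound annual growth rate (CAGR) from the two market figures quoted above, and then projects the 2017 figure forward at the quoted rate. The function names cagr and project are my own, and the only inputs are the numbers in the quote: USD 6.19 billion in 2017, USD 18.30 billion in 2023, and 19.80%.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate implied by a start value, end value, and span."""
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value, rate, years):
    """Value after compounding start_value at a fixed annual rate for some years."""
    return start_value * (1 + rate) ** years


if __name__ == "__main__":
    years = 2023 - 2017  # six compounding periods

    # Rate implied by the report's two market-size figures (USD billions).
    implied = cagr(6.19, 18.30, years)
    print(f"Implied CAGR: {implied:.2%}")

    # 2017 figure grown at the report's stated 19.80% CAGR.
    projected = project(6.19, 0.198, years)
    print(f"Projected 2023 market: USD {projected:.2f} billion")
```

The two calculations agree with each other to within rounding, so the report's figures are internally consistent.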

Right now, many of us are talking to Alexa, talking to Siri, and talking to Google Home. Back in 2009, I owned a Ford car that had a primitive (and laughably inaccurate) infotainment system – today, a new car might do a lot better, perhaps due to embedded Alexa. Will we soon be talking to our ovens, to our laser printers and photocopiers, to our medical implants, to our assembly-line equipment, and to our network infrastructure? It wouldn’t surprise Alexa in the least.
