When voice technology began to emerge in 2011 with the introduction of Siri, no one could have predicted that this novelty would become a driver of tech innovation. Now, a decade later, it’s estimated that one in four U.S. adults owns a smart speaker (e.g., Google Home, Amazon Echo), and eMarketer forecasts that 92.3 percent of smartphone users will be using voice assistants by 2023.
Brands such as Amazon and Google continue to fuel this trend as they compete for market share. Voice interfaces are advancing rapidly across industries, with notable growth everywhere from healthcare to banking, as companies race to release their own voice technology integrations to keep pace with consumer demand.
The main driver for this shift towards voice user interfaces is changing user demands. There is greater overall awareness of voice technology and a higher level of comfort with it, particularly among millennial consumers, in an ever-evolving digital world where speed, efficiency, and convenience are constantly being optimized.
The mass adoption of artificial intelligence in users’ everyday lives is also fueling the shift towards voice applications. The growing number of IoT devices such as smart thermostats, appliances, and speakers is giving voice assistants more utility in a connected user’s life. Smart speakers are the number one way we are seeing voice being used, but that is only the start. Many industry experts even predict that nearly every application will integrate voice technology in some way in the next 5 years.
Applications of this technology are seen everywhere, so where will it take us in 2021 and beyond? We provide a high-level overview of the potential that voice has and 7 key predictions we think will take off in the coming years.
Integrating voice-tech into mobile apps is one of the hottest trends right now, and will remain so because voice is a natural user interface (NUI).
Voice-powered apps increase functionality and save users from complicated app navigation. Voice-activated apps make it easier for the end user to navigate an app, even if they don’t know the exact name of the item they’re looking for or where to find it in the app’s menu. While users may see voice integration as a nice-to-have at this stage, it will soon become a requirement that they expect.
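To make the navigation point concrete, here is a minimal sketch of how an app might map a transcribed voice request to a menu destination when the user does not say the exact item name. It assumes the speech has already been converted to text by some speech-to-text service, and it uses simple fuzzy string matching from Python's standard library rather than a real natural language understanding model. The menu labels, routes, and the function name `resolve_destination` are all hypothetical.

```python
from difflib import get_close_matches
from typing import Optional

# Hypothetical menu: display label -> in-app route (illustrative only).
MENU = {
    "Order history": "/orders",
    "Payment methods": "/payments",
    "Notification settings": "/settings/notifications",
    "Contact support": "/support",
}


def resolve_destination(transcript: str, cutoff: float = 0.6) -> Optional[str]:
    """Return the route whose label best matches the transcript, or None.

    The transcript is assumed to be text already produced by a
    speech-to-text step; this function only handles the matching.
    """
    # Compare in lowercase so capitalization does not affect the match.
    labels = {label.lower(): route for label, route in MENU.items()}
    matches = get_close_matches(
        transcript.strip().lower(), list(labels), n=1, cutoff=cutoff
    )
    # No label was similar enough: let the caller ask the user to rephrase.
    return labels[matches[0]] if matches else None


if __name__ == "__main__":
    print(resolve_destination("notification setting"))
    print(resolve_destination("xyzzy"))
```

A production voice app would more likely rely on intent recognition than on string similarity, but the sketch shows the basic idea: the user speaks an approximate phrase and the app resolves it to a destination without any menu browsing.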
In 2020, AI-powered chatbots and virtual assistants played a vital role in the fight against COVID-19. Chatbots helped screen and triage patients, and Apple’s Siri now walks users through CDC COVID-19 assessment questions and then recommends telehealth apps. Voice and conversational AI have made health services more accessible to people who were unable to leave their homes during COVID-19 restrictions. Now that patients have a taste of what is possible with voice in healthcare, behaviors are not likely to go back to pre-pandemic norms. Be prepared to see more investment in voice-tech integration in the healthcare industry in the years to come.
Voice search has been a hot topic of discussion. Visibility will undoubtedly be a challenge, because voice assistants lack a visual interface: users cannot see or touch a voice interface unless it is connected to the Alexa or Google Assistant app. Search behaviors, in turn, will see a big change. In fact, if tech research firm Juniper Research is correct, voice-based ad revenue could reach $19 billion by 2022, thanks in large part to the growth of voice search apps on mobile devices.
Brands are now experiencing a shift in which touchpoints are transforming into listening points, and organic search will be the main way in which brands gain visibility. As voice search grows in popularity, advertising agencies and marketers expect Google and Amazon to open their platforms to additional forms of paid messages.
Voice assistants will also continue to offer more individualized experiences as they get better at differentiating between voices. Google Home can support up to six user accounts and detect unique voices, which allows Google Home users to customize many features. Users can ask “What’s on my calendar today?” or say “Tell me about my day,” and the assistant will read out commute times, weather, and news information for that individual user. It also supports features such as nicknames, work locations, payment information, and linked accounts such as Google Play, Spotify, and Netflix. Similarly, Alexa users can simply say “learn my voice” to create separate voice profiles, so the technology can detect who is speaking and offer a more individualized experience.
Advances in machine learning and GPU power are commoditizing custom voice creation and making synthesized speech more emotional, to the point where a computer-generated voice can be nearly indistinguishable from a real one. You supply a recording of speech, and voice conversion technology transforms that voice into another. Voice cloning is becoming an indispensable tool for advertisers, filmmakers, game developers, and other content creators.
Last year, smart displays were on the rise as they expanded voice-tech’s functionality. Now, the demand for these devices is even higher, with consumers showing a preference for smart displays over regular smart speakers. In the third quarter of 2020, sales of smart displays rose 21 percent year-on-year to 9.5 million units, while sales of basic smart speakers fell by three percent. In 2021, we expect a lot of innovation in smart displays as they integrate more advanced technology and more customization. Smart displays such as the Russian Sber portal or the Chinese smart screen Xiaodu, for example, are already equipped with a suite of upgraded AI-powered functions, including far-field voice interaction, facial recognition, hand gesture control, and eye gesture detection.
It takes a lot of time and effort to record spoken dialogue for every character in a game. In the upcoming year, developers will be able to use sophisticated neural networks to mimic human voices. Looking a little further ahead, neural networks will even be able to generate appropriate NPC responses. Some game design studios and developers are working hard to build this kind of dialogue generation into their tools, so games with dynamic dialogue aren’t too far off.
Mobile phones are already personalized, more so than any website. Additionally, there is very little screen space on mobile, making it more difficult for users to search or navigate. As product directories grow and apps carry more information, voice applications let consumers use natural language to eliminate or reduce manual effort, making it a lot faster to accomplish tasks.
Rogers has introduced voice commands to its remotes, allowing customers to quickly browse and find their favorite shows or the latest movies using certain keywords, such as an actor’s name. Brands need to focus on better mobile experiences for their consumers, and voice is the way to do so. Users are searching for quicker and more efficient ways of accomplishing tasks, and voice is quickly becoming the ideal channel for this.
Whether that’s finding out information, making a purchase, or achieving a task, voice is the new mobile experience. With the voice and speech recognition market expected to grow at a 17.2 percent CAGR to reach $26.8 billion by 2025, it’s clear why brands are racing to figure out their voice strategy.
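For readers who want to sanity-check a growth figure like the one above, the sketch below shows the compound annual growth rate (CAGR) arithmetic in Python. The forecast quoted here does not state its base year, so the five-year horizon in the example is purely an assumption for illustration, as are the function names `project` and `implied_start`.

```python
def project(start_value: float, rate: float, years: int) -> float:
    """Grow start_value at a compound annual rate for the given number of years."""
    return start_value * (1 + rate) ** years


def implied_start(end_value: float, rate: float, years: int) -> float:
    """Back out the starting value that compounds to end_value."""
    return end_value / (1 + rate) ** years


if __name__ == "__main__":
    END_VALUE_BILLIONS = 26.8  # figure quoted in the article
    CAGR = 0.172               # 17.2 percent, as quoted
    ASSUMED_YEARS = 5          # assumption: the source gives no base year

    start = implied_start(END_VALUE_BILLIONS, CAGR, ASSUMED_YEARS)
    print(f"Implied starting market size: ${start:.1f} billion")
    print(f"Check: ${project(start, CAGR, ASSUMED_YEARS):.1f} billion")
```

Changing `ASSUMED_YEARS` shows how sensitive the implied starting size is to the base year, which is worth keeping in mind when comparing forecasts from different research firms.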
Even with just that handful of simple scenarios, it’s easy to see why voice assistants are shaping up to become the hubs of our connected homes and increasingly connected lives.
Voice technology is becoming increasingly accessible to developers. For example, Amazon offers Transcribe, an automatic speech recognition (ASR) service that enables developers to add speech-to-text capability to their applications. Once the capability is integrated into an application, users can submit audio files and receive a text file of the transcribed speech in return.
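As a rough illustration of what that integration can look like, the sketch below prepares a request for Amazon Transcribe's `start_transcription_job` operation using the AWS SDK for Python (boto3). It assumes boto3 is installed, AWS credentials are configured, and the audio has already been uploaded to S3. The job name, bucket, and file path are hypothetical placeholders, and the helper names `build_transcription_request` and `start_job` are ours, not part of the SDK.

```python
from typing import Any, Dict


def build_transcription_request(
    job_name: str,
    media_uri: str,
    media_format: str = "mp3",
    language_code: str = "en-US",
) -> Dict[str, Any]:
    """Assemble the keyword arguments for start_transcription_job."""
    return {
        "TranscriptionJobName": job_name,   # must be unique within the account
        "LanguageCode": language_code,
        "MediaFormat": media_format,
        "Media": {"MediaFileUri": media_uri},
    }


def start_job(request: Dict[str, Any]) -> Dict[str, Any]:
    """Submit the job. Requires boto3 and valid AWS credentials."""
    import boto3  # imported here so the request builder works without boto3

    client = boto3.client("transcribe")
    return client.start_transcription_job(**request)


if __name__ == "__main__":
    # Hypothetical job name and S3 location, for illustration only.
    request = build_transcription_request(
        job_name="example-voice-note",
        media_uri="s3://example-bucket/audio/voice-note.mp3",
    )
    print(request)
    # start_job(request)  # uncomment once credentials and the S3 object exist
```

Transcription jobs run asynchronously, so an application would typically poll the job status (for example with `get_transcription_job`) and fetch the transcript once the job completes.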
Google has moved to make Assistant more ubiquitous by opening its software development kit through Actions, which allows developers to build voice into their own products that support artificial intelligence. Another of Google’s speech recognition products is the AI-driven Cloud Speech-to-Text tool, which enables developers to convert audio to text using deep learning neural networks.
This is only the beginning for voice technology, and we will see major advancements in the user interface in the years to come. As VUI advances, companies need to start educating themselves on how they can best leverage voice to better interact with their customers. It’s important to ask what value adding voice will bring, as adoption doesn’t make sense for every brand. How can you provide value to your customers? How are you solving their pain points with voice? Will voice enhance the user experience or frustrate the user?
In 2021, voice-enabled apps will accurately understand not only what we are saying, but also how we are saying it and the context in which the inquiry is made.
However, there are still a number of barriers that need to be overcome before voice applications see mass adoption. Advances in AI, natural language processing (NLP), and machine learning are making voice assistants more capable. To deliver a robust speech recognition experience, though, the artificial intelligence behind it has to become better at handling challenges such as accents and background noise. As consumers become increasingly comfortable with, and reliant upon, using voice to talk to their phones, cars, smart home devices, and so on, voice technology will become a primary interface to the digital world, and with it, expertise in voice interface design and voice app development will be in greater demand.
Advancements in a number of industries are helping digital voice assistants become more sophisticated and useful for everyday use. Voice has now established itself as the ultimate mobile experience. A lack of skills and knowledge makes it particularly hard for companies to adopt a voice strategy. There is a lot of opportunity for much deeper and much more conversational experiences with customers. The question is, is your brand willing to jump on this opportunity?