It seems that everyone has a virtual assistant these days. Two popular VAs are Siri from Apple and Alexa from Amazon. While there is no doubt text-to-speech has come a long way in the last few years, Google is about to change the game with Tacotron 2.
Google's text-to-speech project is challenging both Siri and Alexa. Advances in synthesized speech technology led to the rise of voice assistants, giving them a personality that actually lets people befriend their devices. When voice assistants were introduced to the market, people were fascinated by what the assistants could do for them, from getting directions and sending messages to telling jokes.
The latest Tacotron 2 audio samples released by Google earned a mean opinion score (MOS) of 4.53. This is incredible, as the MOS for a real human voice is 4.58. In addition, Tacotron 2 can pick up on conversational context. It puts emphasis on capitalized words, and it can handle complex words. It also adds appropriate stress so that a sentence is interpreted correctly, such as when asking a question. Pauses and breaths can be heard as well, and the algorithm used by the engineers at Google is capable of learning a variety of voices.
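For readers curious about how a mean opinion score is calculated, here is a minimal sketch in Python. It assumes the common setup in which listeners rate each audio sample on a scale of 1 (bad) to 5 (excellent) and the MOS is simply the average of those ratings. The ratings and the function name `mean_opinion_score` are made up for illustration; they are not Google's data or Google's evaluation code.

```python
from statistics import mean

# Hypothetical listener ratings on a 1 (bad) to 5 (excellent) scale.
# These values are invented for illustration only.
ratings = [5, 4, 5, 4, 5, 5, 4, 5]


def mean_opinion_score(scores):
    """Return the arithmetic mean of a list of 1-5 listener ratings."""
    if not scores:
        raise ValueError("need at least one rating")
    if any(s < 1 or s > 5 for s in scores):
        raise ValueError("ratings must be between 1 and 5")
    return mean(scores)


mos = mean_opinion_score(ratings)
print(f"MOS: {mos:.2f}")
```

On this scale, the scores reported above for Tacotron 2 (4.53) and for a real human voice (4.58) are only 0.05 points apart.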
Projects like Tacotron 2 demonstrate how technology is influencing the way we live and work, and how much we can all benefit from these developments.
Cloudstaff fully embraces the use of technology to create the tools that are redefining the outsourcing industry. We are sure that our engineers are excited about what these new developments will bring to the products they create.
Take time to listen: https://google.github.io/tacotron/publications/tacotron2/index.html