
The prosody and continuity of the speech are dramatically improved. This is hard to do and very impressive (especially given that it is being done on-device).

Personally, I'm less pleased with the actual new voice itself, although that is more a subjective judgment. After listening to many hundreds of voice talent auditions for Alexa, it's hard to step back from that level of pickiness.



As I indicated in another comment, the visual that the voice (together with other tweaks in some of Siri's responses) suggests to me is a perky twenty-something.

I generally prefer some of the female British accents in several current TTS systems. (Amy is probably my favorite Polly voice.) Perhaps because I'm an American, the robotic-ness doesn't seem quite as obvious or grating to me.


I also prefer the female British accents, but that's exactly what excites me and is so awesome about this. These aren't just samples that are being stitched together anymore. This "learning" that is being done can be applied later to any of the voices in Apple's catalog. Once they get the data of the synthesis out there, they'll more than likely update all the languages and intonations to match. I would imagine that the biggest hurdle with this is that different languages and accents have different nuances. As with most things, they're just starting with English and then will move everything over to all the other options, including the British accents. I don't think we're too far off from a future where you'll be able to pick the age, gender, and voice of your assistant in the same way that character selection is done in most modern video games.


How'd you get to listen to many hundreds of voice talent auditions for Alexa?


I led the product management and design team.


Story time?



