⚡ Listen to this voice and tell me: would you have been able to tell that it's artificial?
VALL-E can imitate a person's voice based on only a 3-second speech sample.
I find the implications mind-boggling: any book could become an audiobook in seconds, fully automated. Any movie could be dubbed into any language just as fast.
What are some other use cases you're seeing for such perfect voice imitations?