Perception of prosodic variation for speech synthesis using an unsupervised discrete representation of F0

The following are 20 renditions generated by our proposed system, VAEVAMP, of the sentence:

"I'm sorry."