Learning interpretable control dimensions for speech synthesis by using external data

The following speech samples accompany a paper submitted for review at INTERSPEECH 2018.

Control of emotion

Here we set the control vector to a one-hot vector for each emotion.

Emotion
Angry
Happy
Sad
Neutral
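
As a rough illustration of what a one-hot control vector could look like for the four emotions listed above, here is a minimal sketch. The ordering of the dimensions, the four-dimensional layout, and the names `EMOTIONS` and `one_hot_control_vector` are assumptions made for this example; they are not taken from the paper.

```python
# Illustrative sketch only: the dimension order below is an assumption,
# chosen to match the order in which the emotions are listed on this page.
EMOTIONS = ["Angry", "Happy", "Sad", "Neutral"]


def one_hot_control_vector(emotion, emotions=EMOTIONS):
    """Return a list with 1.0 at the position of `emotion` and 0.0 elsewhere."""
    if emotion not in emotions:
        raise ValueError(f"unknown emotion: {emotion!r}")
    return [1.0 if name == emotion else 0.0 for name in emotions]


# Print one control vector per emotion.
for name in EMOTIONS:
    print(name, one_hot_control_vector(name))
```

Under these assumptions, each sample would be synthesised with exactly one dimension of the control vector active and the others set to zero.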

Paragraphs

We present three systems for pairwise paragraph comparison.

System
DNN-C
DNN-R
DNN-B