Accented Text-to-Speech Synthesis with a Conditional Variational Autoencoder

Code link

Naturalness test samples - no accent conversion vs. accent conversion

These samples were synthesized using the averaged representations of speakers and accents. The first set is without accent conversion; the second set (Conv) is with accent conversion.

Utterance 1: He will knock you off a few sticks in no time.
Utterance 2: I graduated last of my class.
Utterance 3: For the twentieth time that evening the two men shook hands.
Utterance 4: I will go over tomorrow afternoon.

Audio sample table (one audio clip per speaker and system; players not reproduced here).

Columns: Ground Truth, CVAE-NL, CVAE-L, GST, GMVAE, Conv CVAE-NL, Conv CVAE-L, Conv GST, Conv GMVAE

Rows:
Speaker: ABA (Arabic)
Speaker: HKK (Korean)
Speaker: NCC (Chinese)
Speaker: SVBI (Hindi)
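The "averaged representation of speakers and accents" mentioned above can be illustrated with a small sketch. It assumes that each training utterance yields a fixed-size embedding vector and that the averaged representation is the element-wise mean over all utterances sharing a speaker (or accent) label. The function name `average_embeddings` and the toy 2-dimensional vectors are hypothetical and are not taken from the authors' code.

```python
from collections import defaultdict


def average_embeddings(embeddings, labels):
    """Return the element-wise mean embedding for each distinct label.

    Hypothetical helper: assumes every embedding has the same length.
    """
    grouped = defaultdict(list)
    for vector, label in zip(embeddings, labels):
        grouped[label].append(vector)
    return {
        label: [sum(dim) / len(vectors) for dim in zip(*vectors)]
        for label, vectors in grouped.items()
    }


# Toy 2-dimensional embeddings (assumption); real ones would come from a trained encoder.
utterance_embeddings = [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0], [2.0, 6.0]]
speakers = ["ABA", "ABA", "HKK", "HKK"]
accents = ["Arabic", "Arabic", "Korean", "Korean"]

speaker_means = average_embeddings(utterance_embeddings, speakers)
accent_means = average_embeddings(utterance_embeddings, accents)

print(speaker_means["ABA"])    # [2.0, 3.0]
print(accent_means["Korean"])  # [1.0, 3.0]
```

Under these assumptions, a synthesis call would condition on one averaged speaker vector and one averaged accent vector instead of on an embedding extracted from a single reference utterance.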

Accent conversion task

These samples were converted to the target accent listed for each speaker.

Utterance 1: For the twentieth time that evening the two men shook hands.
Utterance 2: And you always want to see it in the superlative degree.
Utterance 3: I will go over tomorrow afternoon.

Audio sample table (one audio clip per source and system; players not reproduced here).

Columns: Source, Ground Truth, CVAE-NL, CVAE-L, GST, GMVAE

Rows (Source):
Speaker: THV (Vietnamese), Accent: Arabic
Speaker: THV (Vietnamese), Accent: Hindi
Speaker: NCC (Chinese), Accent: Hindi
Speaker: NCC (Chinese), Accent: Spanish
Speaker: EBVS (Spanish), Accent: Chinese
Speaker: EBVS (Spanish), Accent: Korean
Speaker: HKK (Korean), Accent: Arabic
Speaker: HKK (Korean), Accent: Spanish