Abstract
This paper describes an Icelandic pronunciation dictionary for speech applications and its processing for use in a text-to-speech system for Icelandic. Cleaning and correction procedures were implemented to create a consistent training set for grapheme-to-phoneme (g2p) conversion modeling, which is needed to extend the dictionary automatically. Experiments using the original and the cleaned versions of the dictionary as training sets for a joint-sequence g2p algorithm show a clear benefit of training on clean data, both in terms of phoneme error rate (PER) and in terms of the categories of errors made by the g2p algorithm. The results of the dictionary processing were also used to create an initial version of an open source database for Icelandic speech applications.
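The abstract reports g2p quality in terms of PER without defining it. The sketch below shows one common way to compute such a metric, assuming PER stands for phoneme error rate and is calculated as the total Levenshtein distance between reference and predicted phoneme sequences divided by the total number of reference phonemes. The function names and the toy symbols (`p1`, `x`, and so on) are hypothetical placeholders, not taken from the paper or its dictionary.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1      # phoneme in ref missing from hyp
            insertion = curr[j - 1] + 1  # extra phoneme in hyp
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def phoneme_error_rate(
    references: List[Sequence[str]], hypotheses: List[Sequence[str]]
) -> float:
    """Total edit distance divided by total reference length (assumed definition)."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_phonemes = sum(len(ref) for ref in references)
    if total_phonemes == 0:
        raise ValueError("references contain no phonemes")
    total_errors = sum(
        edit_distance(ref, hyp) for ref, hyp in zip(references, hypotheses)
    )
    return total_errors / total_phonemes


# Toy data with placeholder symbols (not real dictionary entries).
refs = [["p1", "p2", "p3", "p4", "p5", "p6"], ["p1", "p2", "p3", "p4", "p5"]]
hyps = [["p1", "p2", "p3", "p4", "p5", "p6"], ["p1", "p2", "x", "p4", "p5"]]

per = phoneme_error_rate(refs, hyps)
print(f"PER: {per:.3f}")  # one substitution over 11 reference phonemes
```

Comparing this value for models trained on the original and on the cleaned dictionary, evaluated against the same held-out entries, would be one way to reproduce the kind of comparison the abstract describes.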
Original language | English |
---|---|
Title of host publication | 2018 IEEE Spoken Language Technology Workshop, SLT 2018 - Proceedings |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 339-345 |
Number of pages | 7 |
ISBN (Electronic) | 9781538643341 |
Publication status | Published - 11 Feb 2019 |
Event | 2018 IEEE Spoken Language Technology Workshop, SLT 2018 - Athens, Greece. Duration: 18 Dec 2018 → 21 Dec 2018 |
Publication series
Name | 2018 IEEE Spoken Language Technology Workshop, SLT 2018 - Proceedings |
---|---|
Conference
Conference | 2018 IEEE Spoken Language Technology Workshop, SLT 2018 |
---|---|
Country/Territory | Greece |
City | Athens |
Period | 18/12/18 → 21/12/18 |
Bibliographical note
Publisher Copyright: © 2018 IEEE.
Other keywords
- g2p
- Icelandic
- Pronunciation dictionary
- TTS