Speech Recognition for Brazilian Portuguese using the Spoltech and OGI-22 Corpora
Carlos Silva, Nelson Neto, Aldebaro Klautau, Andre Adami, Isabel Trancoso
DOI: 10.14209/sbrt.2008.42724
Event: XXVI Simpósio Brasileiro de Telecomunicações (SBrT2008)
Keywords: Speech recognition, HMMs, pronunciation dictionary
Abstract
Speech processing is a data-driven technology that relies on public corpora and associated resources. In contrast to languages such as English, few resources are available for Brazilian Portuguese (BP). This work describes efforts toward narrowing this gap and presents speech recognition systems for BP built on two public corpora: Spoltech and OGI-22. The following resources are made available: ATK and HTK scripts, a pronunciation dictionary, and language and acoustic models. The work also discusses the baseline results obtained with these resources.