Title: Broadcast news transcription baseline system using the NEMLAR database
Authors: Bayeh, Rania
Mokbel, Chafic 
Chollet, Gérard
Affiliations: Department of Electrical Engineering 
Issue Date: 2008
Conference: International conference on Language Resources and Evaluation (6th : 28-30 May 2008 : Morocco) 
This paper describes one of the first uses of the NEMLAR Arabic Broadcast News Speech Corpus (BNSC) to build an automatic speech recognizer (ASR) for Arabic Broadcast News (BN). Different parameterization settings, types of acoustic models, language models, and testing schemes are presented for the creation of a baseline system for Modern Standard Arabic using the NEMLAR BNSC database. Porting this system to dialects requires a certain amount of dialectal data. Because such resources are lacking and dialectal speech makes use of other languages, techniques for creating cross-lingual models from the baseline system are investigated. Certain techniques that have given promising results in previous experiments are proposed. These techniques would be helpful in developing a cross-dialectal speech recognition system; because the Maghrebian/Levantine dialects make use of French, they have been tested in a cross-lingual Arabic-French framework. Although much work remains, the current results are very encouraging.
Type: Conference Paper
Appears in Collections: Department of Electrical Engineering
