Please use this identifier to cite or link to this item: https://scholarhub.balamand.edu.lb/handle/uob/6697
DC Field | Value | Language
dc.contributor.advisor | Mokbel, Chafic | en_US
dc.contributor.author | Adra, Mira | en_US
dc.date.accessioned | 2023-03-07T08:27:24Z | -
dc.date.available | 2023-03-07T08:27:24Z | -
dc.date.issued | 2023 | -
dc.identifier.uri | https://scholarhub.balamand.edu.lb/handle/uob/6697 | -
dc.description | Includes bibliographical references (p. 35-36) | en_US
dc.description.abstract | Speech synthesis is experiencing a breakthrough: advances in artificial intelligence have led to a shift from the standard robotic voice to a more human-like voice with emotional inflections, across multiple speakers and languages. Tacotron has been used intensively for such text-to-speech synthesis lately. Accordingly, in this thesis, I study the feasibility of multi-speaker text-to-speech (TTS) transfer learning with Tacotron 2 in French, to remove the need for multiple machines, one per speaker. This is achieved by fine-tuning the Tacotron 2 training process so that it learns the multiple speakers available in our dataset. For that, we use publicly available online French datasets that are already annotated. The main challenges that such models face are data efficiency, the quality of the speaker audio files, and speaker variability, since each speaker may have a different accent or speaking rate. Despite these challenges, our model produced adequate results when presented with only a few hours of speech from new speakers of different genders. | en_US
dc.description.statementofresponsibility | by Mira Adra | en_US
dc.format.extent | 1 online resource (36 pages) : ill., tables | en_US
dc.language.iso | eng | en_US
dc.rights | This object is protected by copyright, and is made available here for research and educational purposes. Permission to reuse, publish, or reproduce the object beyond the personal and educational use exceptions must be obtained from the copyright holder | en_US
dc.subject | Text to speech, Transfer learning, Multi-speaker, Tacotron 2, French | en_US
dc.subject.lcsh | Speech synthesis | en_US
dc.subject.lcsh | Artificial intelligence | en_US
dc.subject.lcsh | Automatic speech recognition | en_US
dc.subject.lcsh | Machine learning | en_US
dc.subject.lcsh | Dissertations, Academic | en_US
dc.subject.lcsh | University of Balamand--Dissertations | en_US
dc.title | Multi speaker text to speech transfer learning | en_US
dc.type | Thesis | en_US
dc.contributor.corporate | University of Balamand | en_US
dc.contributor.department | Department of Computer Engineering | en_US
dc.contributor.faculty | Faculty of Engineering | en_US
dc.contributor.institution | University of Balamand | en_US
dc.date.catalogued | 2023-03-07 | -
dc.description.degree | MS in Computer Engineering | en_US
dc.description.status | Published | en_US
dc.identifier.ezproxyURL | http://ezsecureaccess.balamand.edu.lb/login?url=http://olib.balamand.edu.lb/projects_and_theses/301370.pdf | en_US
dc.identifier.OlibID | 301370 | -
dc.provenance.recordsource | Olib | en_US
Appears in Collections: UOB Theses and Projects

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.