Speaker adapted dynamic lexicons containing phonetic deviations of words

Abstract: Speaker variability is an important source of speech variation that makes continuous speech recognition a difficult task. Adapting automatic speech recognition (ASR) models to speaker variations is a well-known strategy for coping with this challenge. Almost all such techniques focus on developing adaptation solutions within the acoustic models of ASR systems. Although variations of the acoustic features constitute an important portion of inter-speaker variations, they do not cover variations at the phonetic level. Phonetic variations are known to form an important part of these variations and are influenced by both micro-segmental and suprasegmental factors. Inter-speaker phonetic variations are influenced by the structure and anatomy of a speaker's articulatory system and by his/her speaking style, which is driven by many speaker background characteristics such as accent, gender, age, and socioeconomic and educational class. The effect of inter-speaker variations in the feature space may cause explicit phone recognition errors. These errors can be compensated for later by having appropriate pronunciation variants for the lexicon entries, which consider likely phone misclassifications besides pronunciation. In this paper, we introduce speaker-adaptive dynamic pronunciation models, which generate different lexicons for various speaker clusters and different ranges of speech rate. The models are hybrids of speaker-adapted contextual rules and dynamic generalized decision trees, which take into account word phonological structures, rate of speech, unigram probabilities and stress to generate pronunciation variants of words. Employing the set of speaker-adapted dynamic lexicons in a Farsi (Persian) continuous speech recognition task results in word error rate reductions of as much as 10.1% in a speaker-dependent scenario and 7.4% in a speaker-independent scenario.
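Affiliation: Not specified
Publication date: October 20, 2009 (the date of first online posting on the China Journal Network platform; not the paper's date of publication)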