On the use of higher frame rate in the training phase of ASR

Pekar, Darko; Jakovljević, Nikša; Janev, Marko; Mišković, Dragiša; Delić, Vlado

DC Field	Value	Language
dc.contributor.author	Pekar, Darko	en
dc.contributor.author	Jakovljević, Nikša	en
dc.contributor.author	Janev, Marko	en
dc.contributor.author	Mišković, Dragiša	en
dc.contributor.author	Delić, Vlado	en
dc.date.accessioned	2020-04-27T10:55:18Z	-
dc.date.available	2020-04-27T10:55:18Z	-
dc.date.issued	2010-12-01	en
dc.identifier.isbn	978-9-604-74201-1	en
dc.identifier.uri	http://researchrepository.mi.sanu.ac.rs/handle/123456789/907	-
dc.description.abstract	The number of observations available for parameter estimation plays an important role in the quality of acoustic models. HMM-based automatic speech recognition (ASR) systems generally have to cope with too few observations for a good estimate. One way of tackling this problem is the well-known procedure of state-tying, which is performed in order to gather sufficient information for a reasonable estimate for a large number of models. This procedure introduces an additional bias into the estimates, often leading to poor recognition results. In this paper a simple alternative to that solution is offered. It should be noted that most existing ASR systems use the same frame step size of 10 ms in the training of the acoustic models, justifying it with the fact that speech signals exhibit quasi-stationary behavior at shorter durations. We claim that it is fully acceptable to adopt a much smaller frame step size in acoustic training, thus providing estimators with a significantly higher number of observations than in the standard 10 ms case. This results in better parameter estimates and consequently better recognition results. Besides being justifiable from a phonetic point of view, this claim is also supported by the results of an experiment on a real ASR system.	en
dc.relation.ispartof	International Conference on Computers - Proceedings	en
dc.subject	ASR | GMM | Kernel smoothing | KL divergence | Parameter estimation | Variable frame rate	en
dc.title	On the use of higher frame rate in the training phase of ASR	en
dc.type	Conference Paper	en
dc.relation.conference	14th WSEAS International Conference on Computers, Part of the 14th WSEAS CSCC Multiconference; Corfu Island; Greece; 23 July 2010 through 25 July 2010	-
dc.identifier.scopus	2-s2.0-79958698458	en
dc.relation.firstpage	127	en
dc.relation.lastpage	130	en
dc.relation.volume	1	en
item.fulltext	No Fulltext	-
item.openairetype	Conference Paper	-
item.grantfulltext	none	-
item.cerifentitytype	Publications	-
item.openairecristype	http://purl.org/coar/resource_type/c_18cf	-
crisitem.author.orcid	0000-0003-3246-4988	-
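
The abstract's central point, that a smaller frame step yields more training observations from the same audio, can be illustrated with a little arithmetic. The following is a minimal sketch and not the authors' procedure: the function name `num_frames`, the 1 s utterance, the 25 ms analysis window, and the step sizes other than 10 ms are all assumptions chosen for illustration, since the abstract states only the standard 10 ms step.

```python
def num_frames(duration_ms: int, window_ms: int, step_ms: int) -> int:
    """Count the full analysis windows that fit in a signal.

    A window starts every `step_ms` milliseconds; only windows that lie
    entirely inside the signal are counted.
    """
    if step_ms <= 0:
        raise ValueError("step_ms must be positive")
    if duration_ms < window_ms:
        return 0
    return (duration_ms - window_ms) // step_ms + 1


# Hypothetical values: a 1 s utterance and a 25 ms window (assumed, not
# taken from the paper). Only the 10 ms step comes from the abstract.
DURATION_MS = 1000
WINDOW_MS = 25

for step_ms in (10, 5, 2, 1):
    count = num_frames(DURATION_MS, WINDOW_MS, step_ms)
    print(f"step {step_ms:>2} ms -> {count} observations")
```

Under these assumed values, halving the step from 10 ms to 5 ms roughly doubles the number of feature vectors available to the estimator, and the count keeps growing roughly in inverse proportion to the step size.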
