Rohit Prabhavalkar, Tara N. Sainath, et al.
ICASSP 2013
The acoustic-modeling problem in automatic speech recognition is examined with the specific goal of unifying discrete and continuous parameter approaches. To model a sequence of information-bearing acoustic feature vectors extracted from the speech waveform by appropriate front-end signal processing, a speech recognizer faces two basic alternatives: a) assign a multivariate probability distribution directly to the stream of vectors, or b) use a time-synchronous labeling acoustic processor to perform vector quantization on this stream, and assign a multinomial probability distribution to the output of the vector quantizer. With few exceptions, these two methods have traditionally been treated separately. Here we consider a class of very general hidden Markov models which can accommodate feature vector sequences lying in either a discrete or a continuous space; the new class allows one to represent the prototypes in an assumption-limited, yet convenient, way, as tied mixtures of simple multivariate densities. Speech recognition experiments, reported for two office correspondence tasks (5000- and 20 000-word vocabularies), demonstrate some of the benefits associated with this technique. © 1990 IEEE
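The contrast between the two alternatives and the tied-mixture view can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes diagonal-covariance Gaussians as the "simple multivariate densities", and every name and number in it (`CODEBOOK`, `STATE_A`, `STATE_B`, the helper functions) is hypothetical. All states share one codebook of prototype densities and differ only in their weights over those prototypes.

```python
import math

def diag_gaussian_pdf(x, mean, var):
    """Density of a diagonal-covariance Gaussian evaluated at vector x."""
    log_p = 0.0
    for xi, mi, vi in zip(x, mean, var):
        log_p += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
    return math.exp(log_p)

def tied_mixture_likelihood(x, codebook, state_weights):
    """Tied-mixture emission: sum_k c[k] * N(x; mean_k, var_k).

    The codebook is shared by all states; only the weights are state-specific.
    """
    return sum(c * diag_gaussian_pdf(x, mean, var)
               for c, (mean, var) in zip(state_weights, codebook))

def vq_likelihood(x, codebook, state_weights):
    """Discrete alternative: label x with its best prototype (hard decision),
    then look up the state's multinomial probability for that label."""
    label = max(range(len(codebook)),
                key=lambda k: diag_gaussian_pdf(x, *codebook[k]))
    return state_weights[label]

# Hypothetical shared codebook: two 2-D prototypes given as (mean, variance).
CODEBOOK = [([0.0, 0.0], [1.0, 1.0]), ([3.0, 3.0], [1.0, 1.0])]
# Hypothetical per-state weights over the shared prototypes (each sums to 1).
STATE_A = [0.9, 0.1]
STATE_B = [0.2, 0.8]

if __name__ == "__main__":
    frame = [0.2, -0.1]  # an illustrative feature vector
    for name, weights in (("A", STATE_A), ("B", STATE_B)):
        print(name,
              tied_mixture_likelihood(frame, CODEBOOK, weights),
              vq_likelihood(frame, CODEBOOK, weights))
```

In this sketch the vector-quantized score keeps only the weight of the single winning prototype, while the tied-mixture score lets every shared prototype contribute in proportion to its density at the frame. The two scores are on different scales (a probability versus a density), so they are comparable across states but not with each other.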
Arthur Nádas, David Nahamoo, et al.
IEEE Transactions on Acoustics, Speech, and Signal Processing
Eveline J. Ayme-Bellegarda, Tarek M. Habashy, et al.
SPIE Orlando 1991
Lalit R. Bahl, Peter F. Brown, et al.
INTERSPEECH - Eurospeech 1991