Tara N. Sainath, I-Hsin Chung, et al.
INTERSPEECH 2014
The eleven papers in this special section focus on end-to-end speech and language processing (SLP), which comprises a series of sequence-to-sequence learning problems. Conventional SLP systems map input sequences to output sequences through module-based architectures in which each module is trained independently. These systems have a number of limitations, including local optima, assumptions about intermediate models and features, and complex steps driven by expert knowledge. As a result, they can be difficult for non-experts to use and to extend to new applications. Integrated end-to-end (E2E) systems aim to simplify the solution to these problems by using a single network architecture that maps an input sequence directly to the desired output sequence, without the need for intermediate module representations. E2E systems rely on flexible and powerful machine learning models such as recurrent neural networks. The emergence of models for end-to-end speech processing has lowered the barriers to entry into serious speech research. This special issue showcases the power of novel machine learning methods in end-to-end speech and language processing.
Abdel-Rahman Mohamed, Tara N. Sainath, et al.
ICASSP 2011
Daniel Povey, Dimitri Kanevsky, et al.
ICASSP 2008
Yinghui Huang, Abhinav Sethy, et al.
ICASSP 2018