Samuel Thomas

contact information

Speech Recognition
Thomas J. Watson Research Center, Yorktown Heights, NY USA
  +1-914-945-1142


2021

Auxiliary Networks for Joint Speaker Adaptation and Speaker Change Detection
Leda Sari, Mark Hasegawa-Johnson, Samuel Thomas
IEEE Transactions on Audio, Speech, and Language Processing 29, 324-333, IEEE, 2021


2020

End-to-End Spoken Language Understanding Without Full Transcripts
Hong-Kwang J Kuo, Zoltan Tuske, Samuel Thomas, Yinghui Huang, Kartik Audhkhasi, Brian Kingsbury, Gakuto Kurata, Zvi Kons, Ron Hoory, Luis Lastras
Interspeech, 2020

Resource-adaptive Deep Learning for Visual Speech Recognition
Alexandros Koumparoulis, Gerasimos Potamianos, Samuel Thomas, Edmilson da Silva Morais
Interspeech, 2020

Transliteration Based Data Augmentation for Training Multilingual ASR Acoustic Models in Low Resource Settings
Samuel Thomas, Kartik Audhkhasi, Brian Kingsbury
Interspeech, 2020

Implicit Transfer of Privileged Acoustic Information in a Generalized Knowledge Distillation Framework
Takashi Fukuda, Samuel Thomas
Interspeech, 2020

Audio-Assisted Image Inpainting for Talking Faces
Alexandros Koumparoulis, Gerasimos Potamianos, Samuel Thomas, Edmilson da Silva Morais
ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Training Spoken Language Understanding Systems with Non-Parallel Speech and Text
Leda Sari, Samuel Thomas, Mark Hasegawa-Johnson
ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Leveraging Unpaired Text Data for Training End-To-End Speech-to-Intent Systems
Yinghui Huang, Hong-Kwang Kuo, Samuel Thomas, Zvi Kons, Kartik Audhkhasi, Brian Kingsbury, Ron Hoory, Michael Picheny
ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)


2019

Mixed Bandwidth Acoustic Modeling Leveraging Knowledge Distillation
Takashi Fukuda, Samuel Thomas
2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), pp. 509-515

Semi-Supervised Training and Data Augmentation for Adaptation of Automatic Broadcast News Captioning Systems
Yinghui Huang, Samuel Thomas, Masayuki Suzuki, Zoltan Tuske, Larry Sansone, Michael Picheny
2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), pp. 867-874

Simplified LSTMs for Speech Recognition
George Saon, Zoltan Tuske, Kartik Audhkhasi, Brian Kingsbury, Michael Picheny, Samuel Thomas
2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), pp. 547-553

Grounding Spoken Words in Unlabeled Video
Angie Boggust, Kartik Audhkhasi, Dhiraj Joshi, David Harwath, Samuel Thomas, Rogerio Feris, Dan Gutfreund, Yang Zhang, Antonio Torralba, Michael Picheny, others
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 29--32, 2019

Learning Speaker Aware Offsets for Speaker Adaptation of Neural Networks
Leda Sari, Samuel Thomas, Mark Hasegawa-Johnson
Proc. Interspeech 2019, 769--773

Detection and Recovery of OOVs for Improved English Broadcast News Captioning
Samuel Thomas, Kartik Audhkhasi, Zoltan Tuske, Yinghui Huang, Michael Picheny
Proc. Interspeech 2019, 2973--2977

English Broadcast News Speech Recognition by Humans and Machines
Samuel Thomas, Masayuki Suzuki, Yinghui Huang, Gakuto Kurata, Zoltan Tuske, George Saon, Brian Kingsbury, Michael Picheny, Tom Dibert, Alice Kaiser-Schatzlein, others
ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6455--6459

Pre-training of Speaker Embeddings for Low-latency Speaker Change Detection in Broadcast News
Leda Sari, Samuel Thomas, Mark Hasegawa-Johnson, Michael Picheny
ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6286--6290

Improvements to N-gram Language Model Using Text Generated from Neural Language Model
Masayuki Suzuki, Nobuyasu Itoh, Tohru Nagano, Gakuto Kurata, Samuel Thomas
ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 7245--7249


2018

Understanding unequal gender classification accuracy from face images
Vidya Muthukumar, Tejaswini Pedapati, Nalini Ratha, Prasanna Sattigeri, Chai-Wah Wu, Brian Kingsbury, Abhishek Kumar, Samuel Thomas, Aleksandra Mojsilovic, Kush R Varshney
arXiv preprint arXiv:1812.00099, 2018

SimplerVoice: A Key Message and Visual Description Generator System for Illiteracy
Minh NB Nguyen, Samuel Thomas, Anne E Gattiker, Sujatha Kashyap, Kush R Varshney
arXiv preprint arXiv:1811.01299, 2018

Joint Modeling of Accents and Acoustics for Multi-Accent Speech Recognition
Xuesong Yang, Kartik Audhkhasi, Andrew Rosenberg, Samuel Thomas, Bhuvana Ramabhadran, Mark Hasegawa-Johnson
2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5989-5993

Inference-Invariant Transformation of Batch Normalization for Domain Adaptation of Acoustic Models
Masayuki Suzuki, Tohru Nagano, Gakuto Kurata, Samuel Thomas
Interspeech 2018, pp. 2893-2897

Data Augmentation Improves Recognition of Foreign Accented Speech
Takashi Fukuda, Raul Fernandez, Andrew Rosenberg, Samuel Thomas, Bhuvana Ramabhadran, Alexander Sorin, Gakuto Kurata
Interspeech 2018, pp. 2409-2413

A Recorded Debating Dataset
Shachar Mirkin, Michal Jacovi, Tamar Lavee, Hong-Kwang Kuo, Samuel Thomas, Leslie Sager, Lili Kotlerman, Elad Venezian, Noam Slonim
Proceedings of LREC, 2018


2017

Efficient Knowledge Distillation from an Ensemble of Teachers
Takashi Fukuda, Masayuki Suzuki, Gakuto Kurata, Samuel Thomas, Jia Cui, Bhuvana Ramabhadran
Proc. Interspeech 2017, 3697--3701

Effective Joint Training of Denoising Feature Space Transforms and Neural Network Based Acoustic Models
Takashi Fukuda, Osamu Ichikawa, Gakuto Kurata, Ryuki Tachibana, Samuel Thomas, Bhuvana Ramabhadran
Proc. of ICASSP, pp. 5190-5194, 2017

English Conversational Telephone Speech Recognition by Humans and Machines
George Saon, Gakuto Kurata, Tom Sercu, Kartik Audhkhasi, Samuel Thomas, Dimitrios Dimitriadis, Xiaodong Cui, Bhuvana Ramabhadran, Michael Picheny, Lynn-Li Lim, others
arXiv preprint arXiv:1703.02136, 2017


2016

Domain Adaptation of CNN Based Acoustic Models Under Limited Resource Settings
Masayuki Suzuki, Ryuki Tachibana, Samuel Thomas, Bhuvana Ramabhadran, George Saon
INTERSPEECH, pp. 1588--1592, 2016

Multilingual Data Selection For Low Resource Speech Recognition
Samuel Thomas, Kartik Audhkhasi, Jia Cui, Brian Kingsbury, Bhuvana Ramabhadran
2016

An Investigation on the Use of i-vectors for Robust ASR
Dimitrios Dimitriadis, Samuel Thomas, Sriram Ganapathy
Interspeech 2016, 3828--3832

CNMF-based acoustic features for noise-robust ASR
Colin Vaz, Dimitrios Dimitriadis, Samuel Thomas, Shrikanth Narayanan
Acoustics, Speech and Signal Processing (ICASSP), 2016 IEEE International Conference on, pp. 5735--5739

Invariant Representations for Noisy Speech Recognition
Dmitriy Serdyuk, Kartik Audhkhasi, Philémon Brakel, Bhuvana Ramabhadran, Samuel Thomas, Yoshua Bengio
arXiv preprint arXiv:1612.01928, 2016

On the importance of event detection for ASR
David Haws, Dimitrios Dimitriadis, George Saon, Samuel Thomas, Michael Picheny
Acoustics, Speech and Signal Processing (ICASSP), 2016 IEEE International Conference on, pp. 5705--5709


2015

The IBM BOLT speech transcription system
Samuel Thomas, George Saon, Hong-Kwang Jeff Kuo, Lidia Mangu
INTERSPEECH, pp. 3150-3153, 2015

Investigating factor analysis features for deep neural networks in noisy speech recognition
Sriram Ganapathy, Samuel Thomas, Dimitrios Dimitriadis, Steven J. Rennie
INTERSPEECH, pp. 1898-1902, 2015

Improvements to the IBM speech activity detection system for the DARPA RATS program
Samuel Thomas, George Saon, Maarten Van Segbroeck, Shrikanth S Narayanan
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on, pp. 4500--4504


2014

Robust language identification using convolutional neural network features
Sriram Ganapathy, Kyu Jeong Han, Samuel Thomas, Mohamed Kamal Omar, Maarten Van Segbroeck, Shrikanth S Narayanan
INTERSPEECH, pp. 1846--1850, 2014

Deep order statistic networks
Steven Rennie, Vaibhava Goel, Samuel Thomas
Proc. of the IEEE Workshop on Spoken Language Technology (SLT), 2014

Annealed dropout training of deep networks
Steven Rennie, Vaibhava Goel, Samuel Thomas
Spoken Language Technology (SLT), IEEE Workshop on. IEEE, 2014

Analyzing convolutional neural networks for speech activity detection in mismatched acoustic conditions
Samuel Thomas, Sriram Ganapathy, George Saon, Hagen Soltau
2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2519--2523


2013

A Summary of the 2012 JHU CLSP workshop on zero resource speech technologies and models of early language acquisition
Aren Jansen, Emmanuel Dupoux, Mike Seltzer, Pascal Clark, Ian McGraw, Balakrishnan Varadarajan, Erin Bennett, Benjamin Borschinger, Justin Chiu, Ewan Dunbar, others
ICASSP, 2013

Data-driven Neural Network Based Feature Front-ends for Automatic Speech Recognition
Samuel Thomas
2013 - old-site.clsp.jhu.edu

Developing a speaker identification system for the DARPA RATS project.
Oldrich Plchot, Spyros Matsoukas, Pavel Matejka, Najim Dehak, Jeff Z Ma, Sandro Cumani, Ondrej Glembek, Hynek Hermansky, Sri Harish Reddy Mallidi, Nima Mesgarani, others
ICASSP, pp. 6768--6772, 2013

Weak top-down constraints for unsupervised acoustic model training.
Aren Jansen, Samuel Thomas, Hynek Hermansky
ICASSP, pp. 8091--8095, 2013

The IBM speech activity detection system for the DARPA RATS program.
George Saon, Samuel Thomas, Hagen Soltau, Sriram Ganapathy, Brian Kingsbury
INTERSPEECH, pp. 3497--3501, 2013

Deep neural network features and semi-supervised training for low resource speech recognition
Samuel Thomas, Michael L Seltzer, Kenneth Church, Hynek Hermansky
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on, pp. 6704--6708

A summary of the 2012 JHU CLSP workshop on zero resource speech technologies and models of early language acquisition.
Aren Jansen, Emmanuel Dupoux, Sharon Goldwater, Mark Johnson, Sanjeev Khudanpur, Kenneth Church, Naomi Feldman, Hynek Hermansky, Florian Metze, Richard C Rose, others
ICASSP, pp. 8111--8115, 2013


2012

Feature extraction using 2-d autoregressive models for speaker recognition.
Sriram Ganapathy, Samuel Thomas, Hynek Hermansky
Odyssey, pp. 229--235, 2012

Adaptation transforms of auto-associative neural networks as features for speaker verification.
Samuel Thomas, Sri Harish Reddy Mallidi, Sriram Ganapathy, Hynek Hermansky
Odyssey, pp. 98--104, 2012

Exploiting Discriminative Point Process Models for Spoken Term Detection.
Atta Norouzian, Aren Jansen, Richard C Rose, Samuel Thomas
INTERSPEECH, 2012

The UMD-JHU 2011 speaker recognition system
Daniel Garcia-Romero, Xinhui Zhou, D Zotkin, B Srinivasan, Yuancheng Luo, Sriram Ganapathy, Samuel Thomas, S Nemala, Garimella SVS Sivaram, Majid Mirbagheri, others
Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on, pp. 4229--4232

Intrinsic Spectral Analysis for Zero and High Resource Speech Recognition.
Aren Jansen, Samuel Thomas, Hynek Hermansky
INTERSPEECH, 2012

Data-driven posterior features for low resource speech recognition applications
Samuel Thomas, Sriram Ganapathy, Aren Jansen, Hynek Hermansky
Interspeech, 2012

Acoustic and Data-driven Features for Robust Speech Activity Detection.
Samuel Thomas, Sri Harish Reddy Mallidi, Thomas Janu, Hynek Hermansky, Nima Mesgarani, Xinhui Zhou, Shihab A Shamma, Tim Ng, Bing Zhang, Long Nguyen, others
INTERSPEECH, 2012

Multilingual MLP features for low-resource LVCSR systems
Samuel Thomas, Sriram Ganapathy, Hynek Hermansky
Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on, pp. 4269--4272


2011

The subspace Gaussian mixture model—A structured model for speech recognition
Daniel Povey, Lukáš Burget, Mohit Agarwal, Pinar Akyazi, Feng Kai, Arnab Ghoshal, Ondřej Glembek, Nagendra Goel, Martin Karafiát, Ariya Rastrow, others
Computer Speech & Language 25(2), 404--439, Elsevier, 2011

Performance monitoring for robustness in automatic recognition of speech
Hynek Hermansky, Nima Mesgarani, Samuel Thomas
MLSLP, pp. 31--34, 2011

MLP based phoneme detectors for automatic speech recognition
Samuel Thomas, Patrick Nguyen, Geoffrey Zweig, Hynek Hermansky
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on, pp. 5024--5027

Mixture of Auto-Associative Neural Networks for Speaker Verification.
Garimella SVS Sivaram, Samuel Thomas, Hynek Hermansky
INTERSPEECH, pp. 2381--2384, 2011

Toward optimizing stream fusion in multistream recognition of speech
Nima Mesgarani, Samuel Thomas, Hynek Hermansky
The Journal of the Acoustical Society of America 130(1), EL14--EL18, Acoustical Society of America, 2011

Adaptive Stream Fusion in Multistream Recognition of Speech.
Nima Mesgarani, Samuel Thomas, Hynek Hermansky
INTERSPEECH, pp. 2329--2332, 2011

Rapid Evaluation of Speech Representations for Spoken Term Discovery.
Michael A Carlin, Samuel Thomas, Aren Jansen, Hynek Hermansky
INTERSPEECH, pp. 821--824, 2011

Speech recognition with segmental conditional random fields: A summary of the JHU CLSP 2010 summer workshop
Geoffrey Zweig, Patrick Nguyen, Dirk Van Compernolle, Kris Demuynck, Les Atlas, Pascal Clark, Gregory Sell, Meihong Wang, Fei Sha, Hynek Hermansky, others
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on, pp. 5044--5047


2010

Speech recognition with segmental conditional random fields: final report from the 2010 JHU summer workshop
Geoffrey Zweig, Patrick Nguyen, Dirk Van Compernolle, Kris Demuynck, Hynek Hermansky, Damianos Karakos, Keith Kintzley, Samuel Thomas, Sivaram GSVS, Sam Bowman, others
Technical Report MSR-TR-2010-173, Microsoft Research, 2010

A phoneme recognition framework based on auditory spectro-temporal receptive fields.
Samuel Thomas, Kailash Patil, Sriram Ganapathy, Nima Mesgarani, Hynek Hermansky
INTERSPEECH, pp. 2458--2461, 2010

Robust spectro-temporal features based on autoregressive models of hilbert envelopes
Sriram Ganapathy, Samuel Thomas, Hynek Hermansky
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on, pp. 4286--4289

Comparison of modulation features for phoneme recognition
Sriram Ganapathy, Samuel Thomas, Hynek Hermansky
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on, pp. 5038--5041

Temporal envelope compensation for robust phoneme recognition using modulation spectrum
Sriram Ganapathy, Samuel Thomas, Hynek Hermansky
The Journal of the Acoustical Society of America 128(6), 3769--3780, Acoustical Society of America, 2010

A multistream multiresolution framework for phoneme recognition.
Nima Mesgarani, Samuel Thomas, Hynek Hermansky
INTERSPEECH, pp. 318--321, 2010

A novel estimation of feature-space MLLR for full-covariance models
Arnab Ghoshal, Daniel Povey, Mohit Agarwal, Pinar Akyazi, Lukas Burget, Kai Feng, Ondrej Glembek, Nagendra Goel, Martin Karafiát, Ariya Rastrow, others
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on, pp. 4310--4313

Approaches to automatic lexicon learning with limited training examples
Nagendra Goel, Samuel Thomas, Mohit Agarwal, Pinar Akyazi, Lukas Burget, Kai Feng, Arnab Ghoshal, Ondrej Glembek, Martin Karafiát, Daniel Povey, others
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on, pp. 5094--5097

Cross-lingual and multi-stream posterior features for low resource LVCSR systems.
Samuel Thomas, Sriram Ganapathy, Hynek Hermansky
INTERSPEECH, pp. 877--880, 2010

Multilingual acoustic modeling for speech recognition based on subspace Gaussian mixture models
Lukas Burget, Petr Schwarz, Mohit Agarwal, Pinar Akyazi, Kai Feng, Arnab Ghoshal, Ondrej Glembek, Nagendra Goel, Martin Karafiát, Daniel Povey, others
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on, pp. 4334--4337

Subspace Gaussian mixture models for speech recognition
Daniel Povey, Lukas Burget, Mohit Agarwal, Pinar Akyazi, Kai Feng, Arnab Ghoshal, Ondrej Glembek, Nagendra K Goel, Martin Karafiát, Ariya Rastrow, others
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on, pp. 4330--4333


2009

Applications of signal analysis using autoregressive models for amplitude modulation
Sriram Ganapathy, Samuel Thomas, Petr Motlicek, Hynek Hermansky
Applications of Signal Processing to Audio and Acoustics, 2009. WASPAA'09. IEEE Workshop on, pp. 341--344

Tandem representations of spectral envelope and modulation frequency features for ASR.
Samuel Thomas, Sriram Ganapathy, Hynek Hermansky
INTERSPEECH, pp. 2955--2958, 2009

Temporal envelope subtraction for robust speech recognition using modulation spectrum
Sriram Ganapathy, Samuel Thomas, Hynek Hermansky
Automatic Speech Recognition & Understanding, 2009. ASRU 2009. IEEE Workshop on, pp. 164--169

Static and dynamic modulation spectrum for speech recognition.
Sriram Ganapathy, Samuel Thomas, Hynek Hermansky
INTERSPEECH, pp. 2823--2826, 2009

Phoneme recognition using spectral envelope and modulation frequency features
Samuel Thomas, Sriram Ganapathy, Hynek Hermansky
Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on, pp. 4453--4456

Modulation frequency features for phoneme recognition in noisy speech
Sriram Ganapathy, Samuel Thomas, Hynek Hermansky
The Journal of the Acoustical Society of America 125(1), EL8--EL12, Acoustical Society of America, 2009


2008

Spectro-temporal features for automatic speech recognition using linear prediction in spectral domain
Samuel Thomas, Sriram Ganapathy, Hynek Hermansky
Proceedings of the 16th European Signal Processing Conference (EUSIPCO 2008), Lausanne, Switzerland

Front-end for far-field speech recognition based on frequency domain linear prediction.
Sriram Ganapathy, Samuel Thomas, Hynek Hermansky
INTERSPEECH, pp. 984--987, 2008

Hilbert envelope based features for far-field speech recognition
Samuel Thomas, Sriram Ganapathy, Hynek Hermansky
Machine Learning for Multimodal Interaction, pp. 119--124, Springer, 2008

Hilbert envelope based spectro-temporal features for phoneme recognition in telephone speech.
Samuel Thomas, Sriram Ganapathy, Hynek Hermansky
INTERSPEECH, pp. 1521--1524, 2008

Recognition of reverberant speech using frequency domain linear prediction
Samuel Thomas, Sriram Ganapathy, Hynek Hermansky
Signal Processing Letters, IEEE 15, 681--684, IEEE, 2008


2007

Language Identification of Person Names using CF-IOF based Weighing Function
S Thomas, A Verma
Interspeech, 2007

SSML Extensions for Indian Languages
Samuel Thomas, Ashish Verma, Nitendra Rajput
W3C Workshop, 2007

Natural sounding text-to-speech synthesis based on syllable-like units
Samuel Thomas
2007 - Master's thesis, Indian Institute of ...


2006

Natural sounding speech based on syllable-like units
Samuel Thomas, M Nageshwar Rao, Hema A Murthy, CS Ramalingam
EUSIPCO, Florence, Italy, 2006


2005

Distributed Text to Speech Synthesis for Embedded Systems--An analysis
Samuel Thomas, Hema A Murthy, C Chandra Sekhar
Proceedings of the Eleventh National Conference on Communications: NCC-2005, 28-30 January, 2005, pp. 273

Text-to-Speech Synthesis using syllable-like units
M Nageshwara Rao, Samuel Thomas, T Nagarajan, Hema A Murthy
Proceedings of National Conference on Communications, IIT, India, pp. 277--280, 2005


Year Unknown

Color-Theoretic Experiments to Understand Unequal Gender Classification Accuracy from Face Images
Vidya Muthukumar, Tejaswini Pedapati, Nalini Ratha, Prasanna Sattigeri, Chai-Wah Wu, Brian Kingsbury, Abhishek Kumar, Samuel Thomas, Aleksandra Mojsilovic, Kush R Varshney