Speech Technology at IBM Research - Tokyo


Speech Technology at IBM Research - Tokyo - Publication


Conference Papers

  • Gakuto Kurata, Bhuvana Ramabhadran, George Saon, Abhinav Sethy, "Language Modeling with Highway LSTM", in Proceedings of ASRU 2017, Okinawa, Japan
  • Gakuto Kurata, Abhinav Sethy, Bhuvana Ramabhadran, George Saon, "Empirical Exploration of Novel Architectures and Objectives for Language Models", in Proceedings of INTERSPEECH 2017, Stockholm, Sweden, August 2017
  • George Saon, Gakuto Kurata, Tom Sercu, Kartik Audhkhasi, Samuel Thomas, Dimitrios Dimitriadis, Xiaodong Cui, Bhuvana Ramabhadran, Michael Picheny, Lynn-Li Lim, Bergul Roomi, Phil Hall, "English Conversational Telephone Speech Recognition by Humans and Machines", in Proceedings of INTERSPEECH 2017, Stockholm, Sweden, August 2017
  • Takashi Fukuda, Masayuki Suzuki, Gakuto Kurata, Samuel Thomas, Jia Cui, Bhuvana Ramabhadran, "Efficient knowledge distillation from an ensemble of teachers", in Proceedings of INTERSPEECH 2017, Stockholm, Sweden, August 2017
  • Michael Heck, Masayuki Suzuki, Takashi Fukuda, Gakuto Kurata, Satoshi Nakamura, "Ensemble of multi-scale VGG acoustic models", in Proceedings of INTERSPEECH 2017, Stockholm, Sweden, August 2017
  • Masayuki Suzuki, Gakuto Kurata, Abhinav Sethy, Bhuvana Ramabhadran, Ken Church, Mark Drake, "Symbol sequence search from telephone conversation", in Proceedings of INTERSPEECH 2017, Stockholm, Sweden, August 2017
  • Osamu Ichikawa, Takashi Fukuda, Gakuto Kurata, Steven J. Rennie, "Factorial modeling for effective suppression of directional noise", in Proceedings of INTERSPEECH 2017, Stockholm, Sweden, August 2017
  • Takashi FUKUDA, Osamu ICHIKAWA, Gakuto KURATA, Ryuki TACHIBANA, Samuel Thomas, Bhuvana Ramabhadran, "Effective Joint Training of Denoising Feature Space Transforms and Neural Network Based Acoustic Models", in Proceedings of ICASSP 2017, March 2017
  • Osamu ICHIKAWA, Takashi FUKUDA, Masayuki SUZUKI, Gakuto KURATA, Bhuvana Ramabhadran, "Harmonic Feature Fusion for Robust Neural Network-based Acoustic Modeling", in Proceedings of ICASSP 2017, March 2017
  • Gakuto KURATA, Bing Xiang, Bowen Zhou, Mo Yu, "Leveraging Sentence-level Information with Encoder LSTM for Semantic Slot Filling", in Proceedings of EMNLP 2016, Austin, U.S.A., November 2016
  • Gakuto KURATA, Brian Kingsbury, "Improved Neural Network Initialization by Grouping Context-Dependent Targets for Acoustic Modeling", in Proceedings of INTERSPEECH 2016, San Francisco, U.S.A., September 2016
  • Gakuto KURATA, Bing Xiang, Bowen Zhou, "Labeled Data Generation with Encoder-Decoder LSTM for Semantic Slot Filling", in Proceedings of INTERSPEECH 2016, San Francisco, U.S.A., September 2016
  • Masayuki Suzuki, Ryuki Tachibana, Samuel Thomas, Bhuvana Ramabhadran, George Saon, "Domain Adaptation of CNN based Acoustic Models under Limited Resource Settings", in Proceedings of INTERSPEECH 2016, September 2016
  • Gakuto KURATA, Bing Xiang, Bowen Zhou, "Improved Neural Network-based Multi-label Classification with Better Initialization Leveraging Label Co-occurrence", in Proceedings of NAACL/HLT 2016, San Diego, U.S.A., June 2016
  • Takashi FUKUDA, Osamu ICHIKAWA, and Ryuki TACHIBANA, "Convolutional Neural Network Pre-trained with Projection Matrices on Linear Discriminant Analysis", in Proceedings of INTERSPEECH 2016, San Francisco, U.S.A., September 2016
  • Gakuto KURATA, Daniel Willett, "Deep Neural Network Training Emphasizing Central Frames", in Proceedings of INTERSPEECH 2015, Dresden, Germany, September 2015
  • Masayuki SUZUKI, Gakuto KURATA, Tohru NAGANO, Ryuki TACHIBANA, "Speech Recognition Robust Against Speech Overlapping in Monaural Recordings of Telephone Conversations", in Proceedings of ICASSP 2016, March 2016
  • Nobuyasu ITOH, Gakuto KURATA, Ryuki TACHIBANA, Masafumi NISHIMURA, "A Metric for Evaluating Speech Recognizer Output Based on Human-perception Model", in Proceedings of INTERSPEECH 2015, September 2015
  • Takashi FUKUDA, Osamu ICHIKAWA, Masafumi NISHIMURA, Steven J. RENNIE, and Vaibhava GOEL, "Regularized Feature-space Discriminative Adaptation for Robust ASR", in Proceedings of INTERSPEECH 2014, Singapore, September 2014
  • Osamu ICHIKAWA, Steven J. RENNIE, Takashi FUKUDA, and Masafumi NISHIMURA, "Channel-mapping for speech corpus recycling", in Proceedings of ICASSP 2013, Vancouver, Canada, May 2013
  • Masayuki SUZUKI, Gakuto KURATA, Masafumi NISHIMURA, Nobuaki MINEMATSU, "Discriminative Reranking for LVCSR Leveraging Invariant Structure", in Proceedings of INTERSPEECH 2012, September 2012
  • Takashi FUKUDA, Ryuki TACHIBANA, Upendra CHAUDHARI, Bhuvana RAMABHADRAN, and Puming ZHAN, "Constructing Ensembles of Dissimilar Acoustic Models using Hidden Attributes of Training Data", in Proceedings of ICASSP 2012, Kyoto, Japan, March 2012
  • Osamu ICHIKAWA, Steven RENNIE, Takashi FUKUDA, and Masafumi NISHIMURA, "Model-based Noise Reduction Leveraging Frequency-wise Confidence Metric for In-car Speech Recognition", in Proceedings of ICASSP 2012, Kyoto, Japan, March 2012
  • Ryuki TACHIBANA, Takashi FUKUDA, Upendra CHAUDHARI, Bhuvana RAMABHADRAN, and Puming ZHAN, "Frame-level AnyBoost for LVCSR with the MMI Criterion", in Proceedings of ASRU 2011, Hawaii, USA, December 2011
  • Takashi FUKUDA, Osamu ICHIKAWA, and Masafumi NISHIMURA, "Combining Feature Space Discriminative Training with Long-term Spectro-temporal Features for Noise-robust Speech Recognition", in Proceedings of INTERSPEECH 2011, Florence, Italy, August 2011
  • Takashi FUKUDA, Osamu ICHIKAWA, and Masafumi NISHIMURA, "Breath-detection-based Telephony Speech Phrasing", in Proceedings of INTERSPEECH 2011, Florence, Italy, August 2011
  • Masayuki SUZUKI, Gakuto KURATA, Masafumi NISHIMURA, Nobuaki MINEMATSU, "Continuous Digits Recognition Leveraging Invariant Structure", in Proceedings of INTERSPEECH 2011, pp.993-996, Florence, Italy, August 2011
  • Gakuto KURATA, Nobuyasu ITOH, Masafumi NISHIMURA, "Acoustic Model Training with Detecting Transcription Errors in the Training Data", in Proceedings of INTERSPEECH 2011, pp.1689-1692, Florence, Italy, August 2011
  • Gakuto KURATA, Nobuyasu ITOH, Masafumi NISHIMURA, Abhinav Sethy, Bhuvana Ramabhadran, "Named Entity Recognition from Conversational Telephone Speech Leveraging Word Confusion Networks for Training and Recognition", in Proceedings of ICASSP 2011, pp.5576-5579, Prague, Czech Republic, May 2011
  • Gakuto KURATA, Nobuyasu ITOH, Masafumi NISHIMURA, "Training of Error-corrective Model for ASR without Using Audio Data", in Proceedings of ICASSP 2011, pp.5572-5575, Prague, Czech Republic, May 2011
  • Takashi FUKUDA, Osamu ICHIKAWA, and Masafumi NISHIMURA, "Improved Voice Activity Detection Using Static Harmonic Features", in Proceedings of ICASSP 2010, Dallas, Texas, USA, March 2010
  • Osamu ICHIKAWA, Takashi FUKUDA, and Masafumi NISHIMURA, "Dynamic Features in the Linear Domain for Robust Automatic Speech Recognition in a Reverberant Environment", in Proceedings of INTERSPEECH 2009, Brighton, U.K., September 2009
  • Gakuto KURATA, Nobuyasu ITOH, Masafumi NISHIMURA, "Acoustically Discriminative Training for Language Models", in Proceedings of ICASSP 2009, pp.4717-4720, Taipei, Taiwan, April 2009
  • Takashi FUKUDA, Osamu ICHIKAWA, and Masafumi NISHIMURA, "Short- and Long-term Dynamic Features for Robust Speech Recognition", in Proceedings of INTERSPEECH 2008, Brisbane, Australia, September 2008
  • Takashi FUKUDA, Osamu ICHIKAWA, and Masafumi NISHIMURA, "Phone-duration-dependent Long-term Dynamic Features for Stochastic Model-based Voice Activity Detection", in Proceedings of INTERSPEECH 2008, Brisbane, Australia, September 2008
  • Osamu ICHIKAWA, Takashi FUKUDA, and Masafumi NISHIMURA, "Local Peak Enhancement Combined with Noise Reduction Algorithms for Robust Automatic Speech Recognition in Automobiles", in Proceedings of ICASSP 2008, Las Vegas, Nevada, USA, April 2008
  • Ryuki TACHIBANA, Tohru NAGANO, Gakuto KURATA, Masafumi NISHIMURA, Noboru BABAGUCHI, "Preliminary Experiments toward Automatic Generation of New TTS Voices from Recorded Speech Alone", in Proceedings of INTERSPEECH 2007, Antwerp, Belgium, August 2007
  • Gakuto KURATA, Shinsuke MORI, Nobuyasu ITOH, Masafumi NISHIMURA, "Unsupervised Lexicon Acquisition from Speech and Text", in Proceedings of ICASSP 2007, Vol.4, pp.421-424, Honolulu, U.S.A, April 2007
  • Shinsuke MORI, Daisuke TAKUMA, Gakuto KURATA, "Phoneme-to-Text Transcription System with an Infinite Vocabulary", in Proceedings of COLING-ACL 2006, Sydney, Australia, July 2006
  • Gakuto KURATA, Shinsuke MORI, Masafumi NISHIMURA, "Unsupervised Adaptation of a Stochastic Language Model Using a Japanese Raw Corpus", in Proceedings of ICASSP 2006, Vol.1, pp.1037-1040, Toulouse, France, May 2006
  • Shinsuke MORI, Gakuto KURATA, "Class-based Variable Memory Length Markov Model", in Proceedings of INTERSPEECH 2005, pp.13-16, Lisbon, Portugal, September 2005
  • Takashi FUKUDA and Tsuneo NITTA, "Designing Multiple Distinctive Phonetic Feature Extractors for Canonicalization by Using Clustering Technique", in Proceedings of INTERSPEECH 2005, Lisbon, Portugal, September 2005

Journal Papers