## Natural Language Processing PIC at IBM Research AI - Seminal Contributions

IBM Research has made seminal contributions to several areas of NLP, including question answering, the creation of metrics for automatic evaluation of Machine Translation system performance, the development of Conditional Maximum Entropy Models for Natural Language Modeling, building the foundations for Statistical Machine Translation, and pioneering the use of statistical methods for generating grammars.

### The IBM Jeopardy! Challenge

On February 14, 2011, IBM's "Watson" question answering system took on Jeopardy! grand champions Ken Jennings and Brad Rutter in a publicly televised Jeopardy! match. Watson won the grand challenge and demonstrated an unprecedented level of automated question answering performance.

For more details, see: http://www.ibmwatson.com

*Main Citations:*

D. Ferrucci. Introduction to “This is Watson”. *IBM Journal of Research and Development*, 56(3-4), pp 1:1--1:15. 2012.

D. Ferrucci, E. Brown, J. Chu-Carroll, J. Fan, D. Gondek, A. Kalyanpur, A. Lally, J. W. Murdock, E. Nyberg, J. Prager, N. Schlaefer, C. Welty. Building Watson: An overview of the DeepQA project. *AI Magazine*, 31(3), pp 59--79. 2010.

### Metrics for Evaluating Machine Translation

Automatic evaluation of MT system outputs has proven essential for assessing advances in the field. IBM paved the way for the creation of such metrics. The BLEU metric is the standard against which new metrics are evaluated.
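To make the idea concrete, here is a minimal sketch of a BLEU-style score for a single sentence: clipped (modified) n-gram precision, combined by a geometric mean and scaled by a brevity penalty. It is an illustration under simplifying assumptions, not IBM's implementation. The function names (`ngrams`, `modified_precision`, `sentence_bleu`) are hypothetical, inputs are assumed to be pre-tokenized lists of strings, and no smoothing is applied. The metric as published is computed over a whole test corpus, so this per-sentence version is a simplification.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clipped n-gram precision: each candidate n-gram count is capped at
    the largest count of that n-gram in any single reference."""
    cand_counts = ngrams(candidate, n)
    if not cand_counts:
        return 0.0
    max_ref_counts = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref_counts[gram] = max(max_ref_counts[gram], count)
    clipped = sum(min(count, max_ref_counts[gram])
                  for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def sentence_bleu(candidate, references, max_n=4):
    """Simplified sentence-level BLEU (assumption: uniform weights, no smoothing)."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # the geometric mean is zero if any precision is zero
    log_avg = sum(math.log(p) for p in precisions) / max_n
    c = len(candidate)
    # Effective reference length: the reference length closest to the candidate's.
    r = min((len(ref) for ref in references),
            key=lambda length: (abs(length - c), length))
    brevity_penalty = 1.0 if c > r else math.exp(1.0 - r / c)
    return brevity_penalty * math.exp(log_avg)


if __name__ == "__main__":
    refs = ["the cat is on the mat".split(),
            "there is a cat on the mat".split()]
    degenerate = ["the"] * 7
    # Clipping keeps a repeated word from inflating unigram precision.
    print(modified_precision(degenerate, refs, 1))
    print(sentence_bleu("the cat is on the mat".split(), refs))
```

The clipping step is the key design choice: without it, a candidate that repeats one common word would score perfect unigram precision.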

*Main Citation:*

K. Papineni, S. Roukos, T. Ward, and W.J. Zhu, "BLEU: a method for automatic evaluation of machine translation," *Proc. ACL'02*, pp 311-318, 2002.

### Conditional Maximum Entropy for Natural Language Modeling

IBM developed the Maximum Entropy framework to tackle different NLP applications, and showed how statistical machine learning can successfully compete in NLP with more traditional linguistics and AI approaches.

*Main Citations:*

A.L. Berger, V.J. Della Pietra, and S.A. Della Pietra, "A maximum entropy approach to natural language processing," J. Comput. Linguistics, 22(1):39-71, 1996.

S.A. Della Pietra, V.J. Della Pietra, J. Lafferty, "Inducing Features of Random Fields," IEEE Trans. Pattern Anal. and Machine Intell., 19(4):380-393, 1997.

### Statistical Machine Translation

IBM successfully demonstrated the feasibility of statistical machine translation, while at the same time establishing a mathematical framework that is still a fundamental reference in the field. In particular, IBM was at the forefront in establishing that automated systems can be trained to translate text between different languages.
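As a brief illustration of that mathematical framework, the source-channel formulation presented in the 1993 paper cited below chooses, for a foreign sentence $f$, the English sentence $\hat{e}$ that maximizes the product of a language model and a translation model. The symbols here follow common usage and are given only as a sketch:

$$
\hat{e} = \arg\max_{e} \Pr(e \mid f) = \arg\max_{e} \Pr(e)\,\Pr(f \mid e)
$$

Here $\Pr(e)$ is the language model probability of the English sentence and $\Pr(f \mid e)$ is the translation model probability. The second equality follows from Bayes' rule, since $\Pr(f)$ does not depend on $e$.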

*Main Citations:*

P.F. Brown, V.J. Della Pietra, S.A. Della Pietra, and R.L. Mercer, "The mathematics of statistical machine translation: parameter estimation," J. Comput. Linguistics, 19(2):263-311, 1993.

P.F. Brown, J. Cocke, S.A. Della Pietra, V.J. Della Pietra, F. Jelinek, J.D. Lafferty, R.L. Mercer and P.S. Roossin, "A Statistical Approach to Machine Translation," J. Comput. Linguistics, 16(2):79-85, 1990.

P.F. Brown, J. Cocke, S.A. Della Pietra, V.J. Della Pietra, F. Jelinek, R.L. Mercer, and P.S. Roossin, "A statistical approach to language translation," *Proc. 12th Conf. Comput. Linguistics*, pp. 71-76, 1988.

A.L. Berger, P.F. Brown, S.A. Della Pietra, V.J. Della Pietra, J.R. Gillett, J.D. Lafferty, R.L. Mercer, H. Printz, and L. Ures, "The Candide system for machine translation," *Proc. Work. on Human Language Technology*, pp. 8-11, 1994.

### Statistical Parsing

IBM Research pioneered the use of statistical methods for automatically generating grammars from examples.

*Main Citations:*

R.A. Sharman, F. Jelinek, and R. Mercer, "Generating a grammar for statistical training," *Proc. Work. on Speech and Natural Language*, pp. 267-274, 1990.

E. Black, F. Jelinek, J. Lafferty, D.M. Magerman, R. Mercer, and S. Roukos, "Towards history-based grammars: using richer models for probabilistic parsing," *Proc. ACL'93*, pp. 31-37, 1993.

T. Fujisaki, F. Jelinek, J. Cocke, E. Black, and T. Nishino, "A probabilistic parsing method for sentence disambiguation," *Current Issues in Parsing Technology*, pp. 139-152, 1991.

E. Black, R. Garside, and G.N. Leech, *Statistically-driven Computer Grammars of English: the IBM/Lancaster approach*, Rodopi, Amsterdam, 1993.

A. Ratnaparkhi, J. Reynar, and S. Roukos, "A maximum entropy model for prepositional phrase attachment," *Proceedings of the ARPA Workshop on Human Language Technology*, pp. 250--255, 1994.