Projects and Groups
- Computer Science
- Algorithms and Theory
- Computational Biology
- Healthcare Informatics
- Operations Research
- Performance Modeling and Analysis
- Signal Processing
My current work is in the Natural Language Processing area, and focuses on machine-learning algorithms for information extraction from text and on multilingual question answering.
I belong to the Statistical Multilingual Information Extraction group of the Multilingual NLP Technologies department.
I am doing research in the Cognitive Computing area, specifically on multilingual question answering, in collaboration with the IBM Watson Group.
I am the technical lead and principal architect for the ENEX project, which developed a system that enables users to search and browse a corpus of technical or news documents in terms of entities and their relations.
I was the technical lead for the DELPHI consortium team that participated in the BOLT IR task of the DARPA BOLT program. The team includes IBM as the primary and Columbia, UMASS, UMD, and Stanford as partners. I am also the architect of the DELPHI IR system.
I worked on algorithms for the DARPA GALE Distillation task (precursor to the BOLT IR task), and in the last two years of the program I was the principal architect of the distillation system for the Rosetta consortium, led by IBM.
I served as the Watson chair of the Natural Language Processing Professional Interest Community.
My previous work at IBM has been in areas including intelligent user interfaces, autonomic computing, memory compression, statistical pattern recognition, image digital libraries, data mining, and multidimensional indexing structures.
In my spare time, I have taught Information Theory and Statistical Pattern Recognition at Columbia University through the EE department.