Contact Information

Vittorio Castelli
Principal Research Staff Member
Thomas J. Watson Research Center, Yorktown Heights, NY USA
+1-914-945-2396


My current work is in the Natural Language Processing area, and focuses on machine-learning algorithms for information extraction from text and on multilingual question answering.

I belong to the Statistical Multilingual Information Extraction group of the Multilingual NLP Technologies department.

I am doing research in the Cognitive Computing area, specifically on multilingual question answering, in collaboration with the IBM Watson Group.

My theoretical work is on properties of active learning algorithms, specifically on the optimal number of labeled samples as a function of the number of unlabeled samples under broad parametric assumptions.

I mentor co-workers on developing inventive ideas into patentable inventions, and I serve as a technical expert in IBM's internal evaluation of invention disclosures in the area of Cognitive Computing.

My past work in my department was mostly focused on government projects:

I was the technical lead and principal architect for the ENEX project, which developed a system that enables users to search and browse a corpus of technical or news documents in terms of entities and their relations.

I was the technical lead for the DELPHI consortium team that participated in the BOLT IR task in the DARPA BOLT program. The team includes IBM as the primary and Columbia, UMASS, UMD, and Stanford as partners. I am also the architect of the DELPHI IR system.

I worked on algorithms for the DARPA GALE Distillation task (precursor to the BOLT IR task), and in the last two years of the program I was the principal architect of the distillation system for the Rosetta consortium, led by IBM.

My previous work at IBM has been in areas including intelligent user interfaces, autonomic computing, memory compression, statistical pattern recognition, image digital libraries, data mining, and multidimensional indexing structures.

In my spare time, I have taught Information Theory and Statistical Pattern Recognition at Columbia University through the EE department.

I served as the Watson chair of the Natural Language Processing Professional Interest Community. I am a member of the board that evaluates IBM Corporate Technical Awards (the highest technical recognition within IBM), I serve on the IBM T.J. Watson Culture Club, and I volunteer as an instructor for the Family Science Saturday program (an outreach program sponsored by the IBM T.J. Watson Research Center for fourth- and fifth-grade students).