Vittorio Castelli

Principal Research Staff Member
Thomas J. Watson Research Center, Yorktown Heights, NY USA


Professional Associations:  American Statistical Association  |  IEEE   |  Sigma Xi


My current work is in the Natural Language Processing area, and focuses on machine-learning algorithms for information extraction from text and on multilingual question answering.

I belong to the Statistical Multilingual Information Extraction group of the Multilingual NLP Technologies department, part of IBM Research AI.

I am doing research in the Cognitive Computing area, specifically on multilingual question answering, as a collaboration with the IBM Watson Group.  Most recently, I have worked on answering questions on legal data for an IBM customer.

My theoretical work is on properties of active learning algorithms, specifically on the optimal number of labeled samples as a function of the number of unlabeled samples under broad parametric assumptions.

I mentor co-workers on developing inventive ideas into patentable inventions, and I serve as the lead of a technical expert board in IBM's internal evaluation of invention disclosures in the area of Natural Language Processing, and formerly in the general area of Cognitive Computing.  The latter board has been split into multiple boards aligned with various AI disciplines.

My past work in my department was mostly focused on government projects:

I was the technical lead and principal architect for the ENEX project, which developed a system that enables users to search and browse a corpus of technical or news documents in terms of entities and their relations.

I was the technical lead for the DELPHI consortium team that participated in the BOLT IR task in the DARPA BOLT program.  The team includes IBM as the primary and Columbia, UMASS, UMD, and Stanford as partners.  I am also the architect of the DELPHI IR system.

I worked on algorithms for the DARPA GALE Distillation task (precursor to the BOLT IR task), and in the last two years of the program I was the principal architect of the distillation system for the Rosetta consortium, led by IBM.

My previous work at IBM has been in areas including intelligent user interfaces, autonomic computing, memory compression, statistical pattern recognition, image digital libraries, data mining, and multidimensional indexing structures.

In my spare time, I have taught Information Theory and Statistical Pattern Recognition at Columbia University, through the EE department.

I served as the Watson chair of the Natural Language Processing Professional Interest Community, I am a member of the board that evaluates IBM Corporate Technical Awards (the highest technical recognition within IBM), I serve on the IBM T.J. Watson Culture Club, and I serve as an instructor for the Family Science Saturday program (an outreach program sponsored by the IBM T.J. Watson Research Center for fourth- and fifth-grade students), where I also coordinate one of the courses.