SystemT - Educators

SystemT is available to teachers and students under an academic license.

Interested in using SystemT for academic research, or teaching a class on Information Extraction?

Contact us! Yunyao Li (yunyaoli {at} us{.}ibm{.}com)


A full senior undergraduate/graduate-level course on information extraction and SystemT is now available online here: [Text Analytics with SystemT]

The class has been adapted and taught in several universities at the graduate level:

Lectures on SystemT and the SystemT tutorial have also been used in existing courses:



  • Shimei Pan, May 2016: "The Information Extraction with SystemT class, which was offered for the first time at UMBC, has been a wonderful experience for me and my students. I really enjoyed teaching this course. Students were also very enthusiastic. Based on the feedback from my students, they have learned a lot. Some of my students even want me to offer this class every semester."
  • Alon Halevy, November 2015: "The tutorial of System T was an important hands-on component of a Ph.D course on Structured Data on the Web at the University of Aalborg in Denmark taught by Alon Halevy. The tutorial did a great job giving the students the feeling for the challenges involved in extracting structured data from text."

2020 - IBM Research Accomplishment Award: "Research Contributions to Watson NLP Stack"

2020 - IBM Research Accomplishment Award: "Deep Thinking Question Answering"

2020 - IBM Special Division Accomplishment Award: "Research Contributions to the IBM COVID-19 Technology Taskforce"

2020 - ISWC Best Poster/Demo Award

2019 - IBM Research A-Level Accomplishment Award: "Expanded Shallow Semantic Parsing and its Transfer to Watson Products"

2019 - IBM Research A-Level Accomplishment Award: "Research Contributions to Document Understanding (Document Conversion, Compare and Table Understanding)"

2019 - IBM Research A-Level Accomplishment Award: "IBM Services Solution Advisor and Cognitive Document Risk Analyzer"

2019 - AKBC Best Application Paper Award

2018 - NAACL Test-of-Time Award

2014 - IBM Corporate Award

2013 - IBM Research Outstanding Technical Accomplishment Award

2008, 2010, 2013 - IBM Research A-Level Accomplishment Award

Recent News


Our recent work on HEIDL is covered by Inside Big Data, which describes how we are exploring ways for AI to make tedious tasks like contract review easier, faster, and more accurate.


HEIDL is covered in this blog post, where IBM Research details several techniques that might improve natural language processing in the enterprise domain


Research paper "NormCo: Deep Disease Normalization for Biomedical Knowledge Base Construction" received the Best Application Paper Award at AKBC 2019


Yunyao gave a talk on Building Domain-Specific Knowledge with Human in the Loop at the "Robust Machine Learning Algorithms and Systems: Detection & Mitigation of Adversarial Attacks and Anomalies" workshop, National Academies


Yunyao gave a talk on Building Domain-Specific Knowledge with Human in the Loop at the University of Michigan AI Lab


Research paper "DIMSIM: An Accurate Chinese Phonetic Similarity Algorithm based on Learned High Dimensional Encoding" is accepted at CoNLL 2018 (IBM Research Blog Post).


Research paper "Exploiting Structure in Representation of Named Entities using Active Learning" is accepted at COLING 2018.


Officially joined NSF Center for Big Learning as an Industry Partner.


Demoed LUSTRE, an interactive system for entity understanding and standardization, at ICDE 2018


Hosted Stanford professor Mark Musen's visit to IBM Research - Almaden


Paper on the design and implementation of SystemT is accepted at the NAACL-HLT 2018 Industry Track (the very first industry track at a major NLP conference).


Hosted Univ. of Washington professor Luke Zettlemoyer's visit to IBM Research - Almaden


Yunyao is co-chairing the very first NAACL-HLT Industry Track


Demo paper on Creating and Interacting with Large-Scale Domain-Specific Knowledge Bases is presented at VLDB 2017 [video] [poster]


Research paper on Distant Meta-Path Similarities for Text-Based Heterogeneous Information Networks is accepted at CIKM 2017


Research paper on Crowd-in-the-loop: A Hybrid Approach for Annotating Semantic Roles is accepted at EMNLP 2017


Hosted Stanford professor Dan Jurafsky's visit to IBM Research - Almaden


Research paper on Hardware Compilation Framework for Text Analytics Queries is accepted to the Journal of Parallel and Distributed Computing (JPDC)


SEER, a system for learning extractors from examples, is presented at CHI and SIGMOD 2017 [video] [paper]


Workshop paper on understanding relationships in the financial domain presented at DSMM 2017 [paper]