ICDM 2019 Tutorial on Table Extraction and Understanding for Scientific and Enterprise Applications - overview

Authors: Douglas Burdick, Alexandre V Evfimievski, Yannis Katsis, Marina Danilevsky, and Nancy Wang
Location: ICDM 2019, Beijing, China, China National Convention Center (CNCC), A301AB
Time: 2019/11/11, Monday, 13:30 – 16:00
Motivation: Valuable high-precision data are often published as tables in both scientific and business documents. While humans can easily identify, interpret, and contextualize tables, developing general-purpose automated techniques for extracting information from them is difficult because table formats vary widely across corpora. To extract useful data from tables, data cells must be correctly extracted and linked to all relevant headers, units of measure, and in-text references. Table extraction involves identifying the border and cell structure of each table in a document, while table understanding provides context by linking cells with semantic information inside and outside the table, such as row and column headers, footnotes, titles, and references in the surrounding text. The objective of this tutorial is to provide a detailed overview of existing approaches to table extraction and understanding, highlight open research problems, and survey potential applications enabled by advanced table processing.
13:30 - 13:50 Introduction and applications
13:50 - 14:00 Demonstration of our end-to-end systems
14:00 - 14:50 Table extraction
14:50 - 15:00 Break and Questions
15:00 - 15:50 Table understanding
15:50 - 16:00 Conclusion and Questions