Knowledge Discovery and Data Mining


Knowledge Discovery and Data Mining - TDA Day 2018

Topological Data Analysis Day 2018

September 17, 2018

Full day of on-site lectures at the IBM T. J. Watson Research Center, Room 20-043

Event details for IBMers (calendar invite, live stream):

Questions to speakers? Slack your questions to
***New link for "Applications of TDA to Computational Genomics and Machine Learning"


9:00 - 9:15 Welcome
9:15 - 10:30 Homological Tools for Data, Robert Ghrist, University of Pennsylvania
10:30 - 10:45 Coffee Break
10:45 - 12:00 Topological Summaries for Machine Learning and Statistical Inference, Sayan Mukherjee, Duke University
12:00 - 2:00 Lunch
2:00 - 3:15 Applications of TDA to Computational Genomics and Machine Learning, Aldo Guzman Saenz & Karthikeyan Natesan Ramamurthy, IBM Research
3:15 - 3:30 Coffee Break
3:30 - 4:45 Processing Heterogeneous Signals using Sheaf-based Filters, Michael Robinson, American University
4:45 - 5:00 Closing remarks

Invited Speakers:

Robert Ghrist

Title: Homological Tools for Data

Abstract: The past fifteen years have witnessed a dramatic burst of applications of topological thinking and theorems in the applied sciences, ranging from statistics to sensor networks, neuroscience, and more, all of which will be surveyed here. This talk, meant for newcomers to the field, will introduce homology and persistent homology, and demonstrate why these are exciting and practical new tools in applied mathematics.

Biography: Robert Ghrist (Ph.D., Cornell, Applied Mathematics, 1995) is the Andrea Mitchell PIK Professor of Mathematics and Electrical & Systems Engineering at the University of Pennsylvania. He is a recognized leader in the field of Applied Algebraic Topology, with publications detailing topological methods for sensor networks, robotics, signal processing, tracking, neuroscience, and more. He is the author of a leading textbook on the subject (Elementary Applied Topology, 2014) and is the winner of the Chauvenet Prize, the highest award given for mathematical expository writing.

His prior work in leading the DARPA DSO SToMP project and participating in several DoD MURIs is complemented by NSF CAREER, NSF PECASE, SciAm50, and Vannevar Bush Faculty Fellow awards.

Ghrist is also a dedicated expositor and communicator of Mathematics, with teaching awards that include the MAA James Crawford Prize, Penn's Lindback Award, and the S. Reid Warren award in Engineering at Penn. Ghrist is the author and animator of popular videos for calculus, featured in the New York Times, BoingBoing, and Gizmodo.

Sayan Mukherjee

Title: Topological Summaries for Machine Learning and Statistical Inference

Abstract: In this talk we will discuss how topological summaries of data can be used both in machine learning algorithms and in statistical models. We will revisit the ideas of summary and sufficient statistics, and develop a notion of random persistence diagrams or random Euler curves, along with means and variances on these objects. We will then discuss how to use these topological summaries for data analysis. We will also explore which properties are desirable in a topological summary. Applications to modeling shapes and networks will be discussed.

Biography: Sayan Mukherjee is a Professor of Statistical Science, Mathematics, Computer Science, and Biostatistics & Bioinformatics at Duke University. Sayan received his PhD from MIT and was a postdoc at the Broad Institute. He is a Fellow of the Institute of Mathematical Statistics. His research covers Bayesian methodology; computational and statistical methods in statistical genetics, quantitative genetics, cancer biology, and morphology; discrete Hodge theory, geometry and topology in statistical inference; inference in dynamical systems; machine learning; and stochastic topology.

Michael Robinson

Title: Processing Heterogeneous Signals using Sheaf-based Filters

Abstract: Using the geometry of sheaves as the common language, this talk will bridge three separate areas: topology, geometry, and statistical signal processing.  Because sheaves model consistency relationships between local data, they are easily assembled from detailed models of systems.  Being topological in nature, sheaves mediate local-to-global inference.  By incorporating local geometry from the start, the global "fit" between local data and models can be quantified, which supports robust inferences about missing, conflicting, or inaccurate data.  The utility of this approach is not merely its intellectual cohesion; it also yields performant algorithms.  The talk will demonstrate these algorithms on example datasets from signal processing and data fusion.

Biography: Prof. Robinson is an associate professor in American University's Department of Mathematics and Statistics. He is an expert on topological signal processing and has led numerous research projects on the applications of topology to signal processing and remote sensing. He earned a Bachelor's degree in Electrical Engineering (2002) and a Master's degree in Mathematics (2003) from Rensselaer Polytechnic Institute. Since then, he has worked on projects involving radio propagation and network planning, bistatic radar processing, and advanced radar simulation. In 2008, he earned a Ph.D. in Applied Mathematics at Cornell University, where he developed topological methods for studying the dynamics of parabolic equations.

Prof. Robinson's work is characterized by an aggressive application of sophisticated mathematical techniques to detailed, practical models of systems. He is the lead developer of the open-source PySheaf topological analytic toolkit. Additionally, Prof. Robinson has found many new examples of practical, theoretically motivated algorithms, and has demonstrated them on laboratory systems of his own design. Prof. Robinson maintains active relationships with industry, through which he regularly transitions concepts into prototype systems.





Yada Zhu

KDD PIC is proud to support

ML Symposium NYAS 2019

This symposium, the thirteenth in an ongoing series presented by the Machine Learning Discussion Group at the New York Academy of Sciences, will feature Keynote Presentations from leading scientists in both applied and theoretical Machine Learning and Spotlight Talks, a series of short, early career investigator presentations across a variety of topics at the frontier of Machine Learning.

CIKM 2018

The 27th ACM International Conference on Information and Knowledge Management takes place on October 22 - 26, 2018 at 'Lingotto', Turin, Italy. The theme for 2018 is "From Big Data and Big Information to Big Knowledge".

ECML PKDD 2018

The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases will take place at the Croke Park Conference Centre, Dublin, Ireland, on September 10 - 14, 2018.

COLT 2018

The 31st edition of the Conference on Learning Theory will take place at KTH Royal Institute of Technology, Stockholm, Sweden, July 5 - 9, 2018.