Center for Computational Health - Research Areas


Patient Similarity Analytics

The ability to assess patient similarity by incorporating all attributes of patients (and therefore potential confounding factors) is important for both clinical decision support and comparative effectiveness studies. We have conducted extensive research in applying advanced machine learning methods, such as supervised metric learning, to develop patient similarity analysis algorithms. Our early work demonstrated the effectiveness of this approach in decision support for near-term prognostics in ICU settings. We later expanded the methodology to allow iterative adjustment of the similarity metric based on domain experts' input. Most recently, we combined this approach with scalable predictive modeling methodologies to develop personalized predictive models that identify patient-level rankings of risk factors, leading to more actionable insights for personalized care management.
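As a rough illustration of what a supervised similarity metric does, the sketch below learns one weight per feature from labeled patients and then retrieves the most similar patients under the weighted distance. It is a deliberately simple stand-in (a Fisher-style ratio of between-class to within-class variance), not the metric learning algorithm used in the research described above. The function names, the two toy features and the outcome labels are all hypothetical.

```python
from statistics import mean, pvariance


def learn_diagonal_metric(features, labels, eps=1e-9):
    """Weight each feature by between-class / within-class variance.

    Assumption: a simple supervised heuristic standing in for a full
    metric learning method. Weights are normalized to sum to 1.
    """
    classes = sorted(set(labels))
    weights = []
    for j in range(len(features[0])):
        column = [row[j] for row in features]
        overall = mean(column)
        between = 0.0
        within = 0.0
        for c in classes:
            values = [row[j] for row, y in zip(features, labels) if y == c]
            between += len(values) * (mean(values) - overall) ** 2
            within += len(values) * pvariance(values)
        weights.append(between / (within + eps))
    total = sum(weights)
    return [w / total for w in weights]


def distance(a, b, weights):
    """Weighted Euclidean distance between two patient vectors."""
    return sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)) ** 0.5


def most_similar(query, features, weights, k=2):
    """Indices of the k patients closest to the query patient."""
    ranked = sorted(range(len(features)),
                    key=lambda i: distance(query, features[i], weights))
    return ranked[:k]


# Toy cohort (hypothetical): feature 0 tracks the outcome, feature 1 is noise.
features = [[1.0, 5.0], [1.2, 1.0], [0.9, 9.0],
            [3.0, 5.2], [3.1, 0.8], [2.9, 9.1]]
labels = [0, 0, 0, 1, 1, 1]

weights = learn_diagonal_metric(features, labels)
neighbors = most_similar([1.1, 9.0], features, weights, k=3)
```

With the learned weights, the query patient's neighbors are drawn from the cohort members that share its outcome-relevant feature, whereas an unweighted distance would be dominated by the noisy feature. Expert feedback of the kind described above could be modeled, under the same assumptions, as manual adjustments to the weight vector.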

Translational Informatics

Our Drug Safety Platform allows pharmaceutical researchers to quickly identify drugs whose chemical and other characteristics resemble those of drugs associated with adverse drug reactions (ADRs). It provides tools that support drug safety efforts at multiple stages of the drug pipeline, from initial research to post-market surveillance. These tools depend on complex but foundational methodologies, including matrix factorization for network analysis that incorporates multiple sources of information.

  1. Drug Safety - Adverse drug reactions (ADRs) are one of the major causes of failure in drug development. Accurate prediction and identification of potential ADRs are required throughout the life cycle of a drug, including the early stages of drug design, multiple phases of clinical trials and post-market surveillance. IBM Research's Drug Safety Tool addresses issues along the continuum of drug safety evaluation by delivering an integrated system for scalable, holistic, pretesting identification of potential ADRs based on the chemical properties and other characteristics of a compound, as well as for assessing likely drug-drug interactions (DDIs). The tool also allows for enhanced post-market surveillance based on population-level, real-world experience with prescribing the drug.
    More specifically, the tool consists of a suite of analytics that produce drug-ADR causal hypotheses for pharmaceutical researchers. To arrive at these hypotheses, we analyze the 2D and 3D chemical structures of drugs to identify the substructures of active ingredients that correlate with adverse reactions. We also leverage observational data sources, including spontaneous reporting systems, electronic health records and claims databases. These analytical tools depend on complex but foundational methodologies, including matrix factorization for network analysis that incorporates multiple sources of information, deep learning with attention mechanisms, and causal inference.
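To make the matrix factorization idea concrete, the sketch below factorizes a small drug-by-ADR association matrix into low-rank drug and ADR factors and uses them to score a pair that was never observed. It is a minimal single-source example trained with plain stochastic gradient descent. The toy matrix, the function names and the hyperparameters are assumptions for illustration only and do not describe the Drug Safety Tool's actual models or data.

```python
import random


def predict(P, Q, i, j):
    """Score for drug i and ADR j: dot product of their latent factors."""
    return sum(p * q for p, q in zip(P[i], Q[j]))


def factorize(observed, n_drugs, n_adrs, rank=2, epochs=3000,
              lr=0.05, reg=0.01, seed=0):
    """Fit observed[(i, j)] ~ P[i] . Q[j] by SGD with L2 regularization.

    Assumption: hyperparameters are illustrative, chosen for this toy data.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_drugs)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_adrs)]
    for _ in range(epochs):
        for (i, j), value in observed.items():
            err = value - predict(P, Q, i, j)
            for f in range(rank):
                p, q = P[i][f], Q[j][f]
                P[i][f] += lr * (err * q - reg * p)
                Q[j][f] += lr * (err * p - reg * q)
    return P, Q


# Hypothetical associations: 1 = ADR reported for the drug, 0 = not reported.
# Drugs 0 and 1 behave alike; the pair (drug 1, ADR 1) is left unobserved.
observed = {
    (0, 0): 1.0, (0, 1): 1.0, (0, 2): 0.0,
    (1, 0): 1.0,              (1, 2): 0.0,
    (2, 0): 0.0, (2, 1): 0.0, (2, 2): 1.0,
    (3, 0): 0.0, (3, 1): 0.0, (3, 2): 1.0,
}

P, Q = factorize(observed, n_drugs=4, n_adrs=3)
mse = sum((v - predict(P, Q, i, j)) ** 2
          for (i, j), v in observed.items()) / len(observed)
held_out_score = predict(P, Q, 1, 1)  # hypothesis score for the unobserved pair
```

Because drug 1 shares its observed profile with drug 0, the low-rank structure pushes the held-out score toward drug 0's value for the same ADR. A score like this is only a hypothesis to be checked, in line with the "causal hypotheses" framing above. Incorporating multiple information sources would require extending the sketch, for example by factorizing several matrices with shared drug factors.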

Predictive and Disease Progression Modeling

For many diseases, the rate of progression from an asymptomatic, at-risk state to symptomatic disease, and from earlier to more severe stages of disease, as well as the development of secondary conditions, is highly variable. Improving a clinician's ability to identify which patients will progress at different rates is fundamental to defining and delivering customized care plans that ensure the best possible outcomes. We have applied this type of modeling to multiple diseases and use cases. Some examples are described below:

  1. Heart Failure affects more than 5 million Americans, is a contributing cause in 1 of every 9 deaths and currently costs well over $30 billion a year to treat in the United States. Early detection of heart failure can lead to significant reductions in morbidity, mortality and total costs. IBM has applied its analytic resources to this problem in a project that combines knowledge-driven and data-driven risk factors, as well as physician behavioral variables, to build predictive models for early detection and characterization of heart failure. Among other advanced techniques, this project uses Natural Language Processing (NLP) to extract data from clinical notes, which improves our capture and understanding of relevant risk factors and symptoms and, in turn, the accuracy and applicability of our models.
  2. Huntington's Disease is a serious genetic disease that progresses through different stages at varying rates and ages in different patients. There is currently no cure or effective treatment for Huntington's, although research continues on both, with particular emphasis on treatments that can help to minimize symptoms. IBM is working with the CHDI Foundation to apply big data analytics to the task of improving our understanding of this disease, particularly in the areas of elucidating disease progression models and medication efficacy to improve the lives of people with this condition.
  3. Comprehensive Risk Assessment. Disease progression trajectories are influenced by a multitude of different factors, many of which may not be well understood. As a result, defining optimal care pathways that address the similarities and the differences among groups of patients is a major challenge. Multi-task learning enables researchers to leverage both classification and prediction algorithms to create comprehensive risk assessments. These assessments integrate the vast array of relevant data available (clinical, social, behavioral and genomic data, quality of life scores and so on) to give a full picture of a patient's status and needs. This approach has been applied to care management for elderly people, and is also being leveraged for drug safety use cases.

Visual Analytics and Cognitive Decision Support

Researchers need tools that dramatically accelerate the process of exploring and mining their data, so that new insights can be translated into therapeutics and processes. The Visualization Workbench is an innovative visual analytics platform that lets users explore data at a scale and complexity that could not previously be explored in this way. The Workbench includes the Predictive Model Prospector, which allows users to explore their data to understand risk factors and other features of the predictive model that may be critical in developing appropriate interventions. The Workbench also includes the "Clustrophile" tool, which allows users to group patients with similar characteristics, compare those groups to the overall models and derive more personalized models that are more applicable to their specific needs.
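Grouping patients with similar characteristics can be sketched with a basic clustering routine. The example below uses k-means with a deterministic farthest-point initialization. This is an assumption made for illustration, since the text above does not say which clustering methods the Clustrophile tool offers, and the two patient features (age and systolic blood pressure) are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Plain k-means; returns (cluster index per point, centroids)."""
    # Deterministic farthest-point initialization: start from the first
    # point, then repeatedly add the point farthest from chosen centroids.
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids))
        centroids.append(list(farthest))

    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each patient to the nearest centroid.
        assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return assignments, centroids


# Hypothetical patients described by [age, systolic blood pressure].
patients = [[45, 120], [47, 118], [50, 125],
            [78, 160], [80, 155], [75, 165]]

assignments, centroids = kmeans(patients, k=2)
```

Each centroid summarizes one patient group and can be compared against cohort-wide statistics. In practice, features on different scales would normally be standardized before clustering so that no single feature dominates the distance.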

Contextual & Behavioral Modeling

  1. Integrating Personal Routines and Population Patterns to Re-engage Fitbit Use - Fitbit usage soared when the device was first introduced, but statistics have shown that 50% of users of such devices stop using them within 6 months, and most have stopped paying attention to them long before that. Since maintaining appropriate activity levels is important to long-term health, we have been conducting qualitative research to identify effective strategies for using technology to keep people engaged with their fitness.
  2. Cognitive Coaching Service for Fitness - Personal trainers are a luxury that most people cannot afford, but even the best personal trainers often won't fully understand the needs of every individual. By combining real-time data from wearable devices and self-reported activity as well as nutritional data, personalized coaching can be made more accessible. Users can obtain help with setting goals, defining fitness strategies and fitting day-to-day fitness activities into their busy schedules.