Ying Li


Principal Data Scientist, Manager
Cognitive People Solutions, IBM Human Resources, Armonk, NY


Professional Associations: IEEE

IBM CHQ Human Resources, August 2017 -- Present

Lead projects that develop advanced/cognitive people analytics and solutions to address various aspects of the employee lifecycle, including talent acquisition, talent engagement, talent development, talent retention, talent growth and talent deployment.

1. Lead data scientist and cognitive architect on a Talent Acquisition project, aiming to identify passive candidates for open job requisitions based on their external social profiles with speed and quality

2. Lead data scientist of an enterprise program that uses select characteristics to identify employees most likely to leave the company and provides incentives to decrease the probability of attrition. Achieved a net benefit of $3M in 2019

3. Lead data scientist of a project aiming to assess the scarcity of skills in the external market based on insights derived from both internal and external data. This skill scarcity index has been widely adopted by our business units and has become an important input to various enterprise programs related to employees’ skill planning and compensation investment. This work won the White Paper Competition of the 2020 Wharton People Analytics Conference

4. Lead data scientist of an AI application that gives managers expert advice on employee compensation decisions, with recommendations that are personalized, evidence-based, and aligned with the company’s skill-based pay strategy. The application provided recommendations for ~28K managers covering ~260K employees across 79 countries in 2019


IBM GBS, May 2016 -- July 2017

  • Led the cognitive analytics development for a leading mortgage company for the following two work streams: Point of sale/portfolio retention analysis and customer servicing analytics.
  • Led the cognitive solution design and development of the Language Translation project, funded by our account team, which translates text comments related to invoice processing from English to French/European Spanish/Latin American Spanish, and vice versa. Specifically, I led the technical effort on developing, customizing and testing the translation models, overseeing the whole process from collecting the training data to measuring/monitoring the model performance, and providing guidance and direction on ways to further improve translation accuracy. This solution has already been deployed to our client’s production environment, and the P2P (Procure to Pay) agents are using this web-based application for daily work.
  • Served as the technical advisor on the following projects by providing technical guidance on cognitive solution design: 1) C3 (Cognitive Content Collator), a cloud-based application that tags, chunks, harvests, curates and collates learning content; 2) Cognitive Agent Assist, a pilot with a leading mortgage company, which develops a conversational engine to help call-center agents interact with users with faster response and better resolution.


IBM Research, 2013 -- 2016

Digital Marketing, 2015 -- 2016

Led research on projects in the digital marketing area to: 1) optimize the RTB (real-time bidding) performance of IBM’s paid media (specifically, online advertising) by identifying the optimal daily budget allocation, bidding price, and other bidding factors, using machine learning approaches; and 2) recommend personalized content to users who visit ibm.com pages by understanding and mining their page browsing and web content visit behaviors.


Marketing Data Analysis, 2014 -- 2015

Led research on marketing data analysis, aiming to: 1) identify customers who have a higher likelihood of being converted to patrons, so as to help the marketing team prioritize their direct calls to customers; and 2) recommend the right content for marketing newsletters in the right context so as to attract more customers to become trial users, convert trial users to paid users, and continuously engage paid users. This was joint work with IBM’s Bluemix Marketing Team.


Computational Creativity, 2014 -- 2015

Led research on extending and applying the principles of computational creativity into the following two areas: 1) food aesthetics assessment, enhancement and creation; and 2) generation of creative color palettes for product packaging design that are not only novel and visually appealing, but also consistent with the messages of the particular brand and product. Our framework is built upon work in the areas of color science, color psychology and statistics. This work resulted in one IEEE conference paper and two papers under submission.


Large-scale Finance Data Analytics, 2013 -- 2014

This project aims to analyze a large-scale data set in the finance area to predict the performance of customers based on their behaviors such as their payment history, their credit scores, their collections, their auto loan and lease trades, their bank card trades, their mortgage trades, etc. Various predictive data modeling techniques are applied to achieve this goal. We have also applied text mining techniques to analyze the call log data to assist such performance prediction.


Computational Creativity, 2014 -- Present

Creativity is the pinnacle of human intelligence and the next frontier of cognitive systems. Our first demonstration of a computational creativity system has been in the domain of food where humans partner with the system to explore white spaces in the process of creating new recipes using as inspiration a large number of existing recipes. However, the principles of computational creativity that we have developed apply to other domains as well such as travel, fashion and product design.

Our current efforts have two fronts. Specifically, we are extending and applying the principles of computational creativity into the following two areas: 1) food aesthetics assessment, enhancement and creation; and 2) creative color palette generation for product packaging design.

In particular, our first research effort assesses the aesthetics of a cooked dish by analyzing its color composition. Specifically, we detect the major colors in a food image, explore their compliance with certain color schemes, measure how they spread across the color wheel, estimate their intensity and brightness, and finally assess how visually appealing the image is. The system also recommends particular colors to add to the image to increase its aesthetic appeal. We have also developed a Google Glass app based on this engine. Images to be assessed can be either taken in real time using the Glass camera or selected from images pre-stored in its memory. Here is a peek at the ThinkLab Demo on this work -- Computational Aesthetics.
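As a rough illustration of the color measurements described above, the following minimal sketch computes how widely a set of already-detected dominant colors spreads around the color wheel, along with their average saturation and brightness. It is not the project's engine: the function name `color_stats`, the use of circular variance as the spread measure, and the HSV color model are all assumptions made for this example.

```python
import colorsys
import math


def color_stats(rgb_colors):
    """Summarize dominant colors given as (r, g, b) tuples in 0..255.

    Hypothetical helper for illustration only; the spread measure
    (circular variance of hue) is an assumption, not the source's method.
    """
    hsv = [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255) for r, g, b in rgb_colors]

    # Hue is an angle on the color wheel, so average unit vectors
    # instead of raw hue values to respect the wrap-around at 360 degrees.
    angles = [2 * math.pi * h for h, _, _ in hsv]
    mean_cos = sum(math.cos(a) for a in angles) / len(angles)
    mean_sin = sum(math.sin(a) for a in angles) / len(angles)
    resultant = math.hypot(mean_cos, mean_sin)

    return {
        # 0.0 means all hues coincide; values near 1.0 mean hues are spread out.
        "hue_spread": 1.0 - resultant,
        "mean_saturation": sum(s for _, s, _ in hsv) / len(hsv),
        "mean_brightness": sum(v for _, _, v in hsv) / len(hsv),
    }


if __name__ == "__main__":
    # Red and cyan sit on opposite sides of the color wheel.
    print(color_stats([(255, 0, 0), (0, 255, 255)]))
```

A single color yields a spread of zero, while a complementary pair such as red and cyan yields a spread close to one. A real assessment would also need the color detection step and a model that maps such features to visual appeal.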

Our second effort aims at helping CPG companies develop innovative and visually appealing color palettes for product packaging that convey the message they are trying to communicate to the market through the product. Product package design is a complex problem, as the design is influenced by many factors such as target customer segment, product characteristics, desired message, brand expectations, etc. In order to create a system that assists in product design computationally, we build upon existing research in multiple scientific disciplines including color science, color psychology, the notion of Bayesian surprise in information theory, and message ontology.

Here is a peek at the ThinkLab Demo on this work -- Cognitive Design


AMS Analytics Platform (AAP), 2012-2013

This project aims to build a platform that offers a suite of AMS analytics to help improve the productivity and quality of AMS practitioners. Analytics of this platform cover the following four major areas:

  • Incident Ticket Analytics
  • Investigative Analytics
  • Knowledge Management
  • Resource Management

The figure below shows an overview of the functionality of the platform, which ranges from incident diagnosis, investigation and resolution to continuous operations improvement and resource optimization.


AAP functionality overview


Smarter AMS (Application Management Service), 2011-2012

In the AMS (Application Management Service) business, it is estimated that organizations can spend as much as 80% of their application budgets on maintenance-related activities. Such maintenance activities are faithfully captured by application-based problem tickets, which contain a wealth of information about application management processes such as how well an organization utilizes its resources and how well people are handling tickets. Consequently, analyzing ticket data is one of the most effective ways to gain insights into the quality of the application management process and the efficiency and effectiveness of the actions taken in corrective maintenance. In order to provide insights and help improve the application management process, we have developed a set of operational analytics tools that are designed to mine ticket data and conduct various analyses including ticket effort estimation, resource utilization measurement, resource sharing and pooling, group right-sizing, and cross-skill recommendation.

The figure below shows important business KPIs and operational metrics, as well as the analytics that are needed to support these KPIs and metrics.

AMS Analytics


SAFEST: Situation Awareness for Enhanced Safer Transportation, 2010-2011

Partnering with a railroad company, we aim to detect abnormal spiking and anchor patterns, as well as identify missing, loose, and broken joint bolts on different classes of tracks in this project. We achieve this goal by automatically analyzing videos captured by multiple cameras (up to 6) mounted on a hi-rail vehicle.

Advanced Driver Assistance System, 2008-2009

Projects in this area aim to develop technologies that can assist drivers during the driving process. When designed with a safe human-machine interface, such a system would increase car safety and, more generally, road safety.

Members: Ying Li, Norma Haas, Jonathan Connell, Charles Otto, and Sharath Pankanti

  • Traffic sign detection

    This project detects traffic signs, such as speed signs, using machine vision technology. Specifically, by analyzing the video captured through a forward-facing camera mounted on a vehicle's windshield, we attempt to detect and recognize traffic signs, and subsequently relay that information to the driver so that appropriate action can be taken.

  • Forward car collision detection

    This project aims to detect other vehicles ahead of the host vehicle so as to avoid any potential car collisions. Machine learning techniques were applied to accomplish this task.

  • Intelligent headlight control

    This project aims to automatically control the headlights of automobiles using sensors such as cameras. Specifically, by analyzing the videos captured in real time, we attempt to develop a system that can automatically switch between high beam and low beam during nighttime driving by detecting headlights, taillights and street lights, so as to increase road safety.

    The core part of the project lies in image and video content analysis and machine learning.

iScale, 2007

This project aims to simplify the deployment of a mediation flow graph onto a distributed network of broker machines of different ESB products. There are two major parts in this project: 1) to optimally place the mediation flow graph onto the overlay network of machines such that the average end-to-end latency is minimized; and 2) to automate the deployment of such placement to a mixture of heterogeneous ESB platforms.
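To make the placement objective concrete, here is a minimal sketch that exhaustively searches assignments of flow nodes to machines and keeps the one with the lowest average end-to-end latency. It is not iScale's algorithm, which the text does not specify. The function `best_placement`, the pinned endpoints, and the per-machine capacity limit are all assumptions for illustration, and brute-force search is only feasible for very small graphs.

```python
from itertools import product


def best_placement(paths, pinned, machines, latency, capacity):
    """Brute-force placement of flow nodes onto machines (illustrative only).

    paths:    list of node-name lists, one per end-to-end path in the flow graph
    pinned:   dict node -> machine for nodes whose location is fixed (assumed)
    machines: list of machine names
    latency:  dict (machine_a, machine_b) -> link latency; same-machine hops cost 0
    capacity: max number of non-pinned nodes a machine may host (assumed constraint)
    """
    free = sorted({n for p in paths for n in p} - set(pinned))
    best, best_cost = None, float("inf")

    for choice in product(machines, repeat=len(free)):
        # Skip assignments that overload any machine.
        if any(choice.count(m) > capacity for m in machines):
            continue
        place = dict(pinned, **dict(zip(free, choice)))

        # Sum link latencies along every path, then average over paths.
        total = 0.0
        for p in paths:
            for a, b in zip(p, p[1:]):
                ma, mb = place[a], place[b]
                if ma != mb:
                    total += latency[(ma, mb)]
        cost = total / len(paths)

        if cost < best_cost:
            best, best_cost = place, cost
    return best, best_cost


if __name__ == "__main__":
    links = {("A", "B"): 5, ("B", "A"): 5, ("B", "C"): 5,
             ("C", "B"): 5, ("A", "C"): 20, ("C", "A"): 20}
    placement, cost = best_placement(
        paths=[["src", "f1", "f2", "sink"]],
        pinned={"src": "A", "sink": "C"},
        machines=["A", "B", "C"],
        latency=links,
        capacity=1,
    )
    print(placement, cost)
```

In this toy network the direct A-to-C link is slow, so the search routes the flow through machine B. Without pinned endpoints or a capacity limit, placing every node on one machine would trivially give zero latency, which is why the sketch assumes both constraints.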

SMILE: Smart Middleware Light Ends, 2006

This project develops a new type of messaging middleware system to simplify integration of distributed enterprise applications. Specifically, SMILE provides enterprise developers a higher level model for programming and managing distributed messaging systems - a way to specify a system in terms of what, but not how.

MAGIC: Metadata Automatic Generation for Instructional Content, 2005-2006

This project, funded by DHS (Department of Homeland Security), analyzes training and education materials across different DHS agencies, extracts semantic learning objects, and exports them into DHS repositories and learning management systems. These learning objects can then be shared, indexed and reused across different DHS agencies.

Insurance Auto Triage Project, 2004

This project aims at automatically detecting the automobile damage area by exploiting image processing and analysis techniques. The ultimate goal is to reduce the time and labor involved in insurance claims process. A prototype of this system is available at ISL, Hawthorne, NY.

Saras: e-Learning Content Management, 2003-2004

This project develops technologies to structure education and training videos for rapid access and smart browsing of learning content.