Ashish Verma

Senior Manager - Cognitive Computing Platform & Infrastructure
IBM Research, New Delhi, India


Professional Associations:  ACM Distinguished Speaker   |  IEEE, Senior Member

More information:  IBM Research - India

Voice of Customer Analytics

VoCA is a hosted service offering in the area of CRM analytics. It provides a unique capability to discover actionable business insights across data sources ranging from calls to unstructured and structured data. It encompasses advanced data mining algorithms, such as data linking, text clustering, text annotation, sentiment mining, and predictive modeling, that allow analysts to derive actionable insights regarding customer churn, first call resolution, and key drivers of customer satisfaction and dissatisfaction. It can analyze a variety of heterogeneous data sources, such as agent logs, call records, CSAT survey verbatims, other enterprise logs, and CRM records. The platform has been shown to cut the time business analysts need to discover actionable insights to less than half.

Conversation Mining

In this project, we analyze call center conversations to understand the issues being discussed, the intent of the caller, and how issues are resolved. We use speech recognition, natural language understanding (NLU), machine learning, topic modelling, and other techniques to address these business problems.

Social Media Mining

Social networking websites have become more than just a place for users to interact. Enterprises are trying to leverage this data to improve their products and brand image, address customer concerns, and identify new opportunities. We are using text analytics and data mining techniques to address these real-life problems.


Sensei is designed to address the voice-based assessment and training needs of call centers. This interactive, web-based application combines IBM voice recognition software with a voice assessment methodology. It uses advanced speech processing techniques to evaluate spoken English skills such as pronunciation, grammar, syllable stress, and comprehension. The system analyzes the voice files created when an individual responds to a prompt; it then evaluates the files against a benchmark created from a large sample of model speakers and returns an objective score.

Audio-Visual Speech Recognition

This project was carried out in collaboration with the IBM T. J. Watson Research Center, NY. Several visual features for audio-visual speech recognition were investigated, and their relative performance was measured through phonetic classification and recognition experiments in noisy and clean surroundings. Several strategies for integrating the audio and video channels were also investigated and their performance measured. I was awarded an IBM Research Division Award for my contribution to this project.

Large vocabulary continuous Hindi speech recognition

In this project, a prototype of a large vocabulary (75,000+ words) continuous speech recognition system was developed for the Hindi language. The work included designing the Hindi phone set, training acoustic models, training the language model, and building the vocabulary. The prototype can recognize continuous Hindi speech with a word recognition rate of up to 95%. Currently, two pilots of this prototype are installed at different locations in India.
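
For readers unfamiliar with the metric, a word recognition rate is commonly computed as one minus the word error rate, where errors are counted by aligning the recognized words against a reference transcript. The sketch below assumes that definition; this page does not state how the figure above was scored, and the function name `word_accuracy` is purely illustrative.

```python
def word_accuracy(reference: str, hypothesis: str) -> float:
    """Return 1 - WER, using word-level edit (Levenshtein) distance.

    Assumption: whitespace-separated words and equal cost for
    substitutions, insertions, and deletions.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # consumed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr

    return 1.0 - prev[-1] / len(ref)


# One substituted word out of four reference words.
print(word_accuracy("w1 w2 w3 w4", "w1 w2 wX w4"))
```

Under this definition the value can fall below zero when the hypothesis contains many inserted words, so reported rates are usually computed over a full test set rather than a single utterance.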

Voice fonts for speech personality representation and transformation

This was my Ph.D. work, in which I worked on the implementation of voice fonts. Voice fonts are intended to represent and transform the personality of speech: they can make speech sound like a particular speaker’s voice, just as text fonts make text appear in a particular style. I worked on various descriptors of voice fonts, such as spectral envelope, pitch, and speaking rate, and proposed techniques to represent these descriptors and use them to transform the personality of speech.