Watson for Infrastructure Management and Troubleshooting


Michael Heller, Arun Kumar, Srikanth G Tamilselvam

Watson for Infrastructure Management and Troubleshooting - overview


IT infrastructure management is a significant investment area for enterprises across industries and geographies. Many organizations struggle to manage their IT in house because of skill shortages and shrinking IT budgets. Others outsource their IT management and accept a loss of control and varying levels of service. We are extending and applying Cognitive Computing technologies (including IBM Watson) to make IT infrastructure more manageable through natural language interfaces. We are investigating both the product and the service offering aspects of these technologies. On the product side, we are extending core Watson-like technologies with IT domain-specific knowledge and information access, with the aim of building product offerings. On the services side, we are eating our own dog food by applying these technologies to the infrastructure management services businesses at IBM, thereby significantly improving practitioner productivity.

Research Challenges
IBM Watson is much more than a search engine that won at the Jeopardy! game show. To make that technology work in specific business settings like IT infrastructure management, research is needed to develop new capabilities that add knowledge about IT infrastructure and its management. This technology can then be offered to end customers facing problems with their IT infrastructure, and also as a reference knowledge management asset that supports practitioners and agents. In the domain of IT infrastructure and hardware, software, and services, we need the ability to tie together what users type when they describe a problem with what they mean in terms of system aspects, and then to reason about how the problem can be solved. This reasoning process requires mining large amounts of disparate data, such as system-generated logs, ticket history, and technical documents, to detect anomalies for root cause analysis. Further, the results need an intuitive visualization that lets the end user or a support agent diagnose the problem and identify the remedy faster. We also need to make advances in root cause identification and analysis across disparate data sources.

The following are two specific projects in this area in the CSS department.

Watson for IT problem solving
Enterprise IT infrastructure and physical assets contain rich, structured contextual information about their own status and contents, as well as about users, applications, tasks, and so on. Such information is volatile, changes frequently, and cannot be ingested into today's search and retrieval systems. Access to such data, and reasoning over it, needs to be enhanced and exposed through easy-to-use interfaces that impose a lower cognitive load than complex dashboards and coded data streams. We are building a natural-language-based system to augment existing Watson-like systems, focusing on joint structured-unstructured information access and thereby enabling root cause analysis for infrastructure problem-solving use cases. We are developing a logical framework that captures natural language questions in a formal representation that is back-end agnostic. Such a representation will facilitate (amongst other capabilities) API read/write queries to systems like SCCD, using an IT Ontology (derived from CIM/ITIL) to glue together (i) what a user types, (ii) what she means in system terms, and (iii) how we retrieve information from underlying systems.
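The idea of a back-end-agnostic representation can be illustrated with a minimal Python sketch. It is not the project's framework: the ontology entries, the `LogicalQuery` class, the keyword-matching parser, and both renderers are hypothetical names invented here, and the REST path shown is not SCCD's actual API. The sketch only shows the three steps named above: a typed question is mapped through an ontology to a formal query, which can then be rendered for different back ends.

```python
from dataclasses import dataclass, field

# Hypothetical mini-ontology mapping surface phrases to (entity class, attribute).
# A real ontology would be derived from CIM/ITIL; these entries are invented.
ONTOLOGY = {
    "disk space": ("FileSystem", "available_space"),
    "memory": ("ComputerSystem", "free_memory"),
    "status": ("Service", "operational_status"),
}


@dataclass
class LogicalQuery:
    """Back-end-agnostic form of a question: which attribute of which entity."""
    entity_class: str
    attribute: str
    filters: dict = field(default_factory=dict)


def parse_question(question: str) -> LogicalQuery:
    """Map what the user types to what she means in system terms.

    Uses naive keyword matching as a stand-in for real language understanding.
    """
    text = question.lower()
    for phrase, (entity_class, attribute) in ONTOLOGY.items():
        if phrase in text:
            filters = {}
            tokens = text.rstrip("?").split()
            # Treat the token following "on" as the host name, if present.
            if "on" in tokens and tokens.index("on") + 1 < len(tokens):
                filters["host"] = tokens[tokens.index("on") + 1]
            return LogicalQuery(entity_class, attribute, filters)
    raise ValueError(f"No ontology concept found in: {question!r}")


def to_rest_path(query: LogicalQuery) -> str:
    """Render the logical query for a hypothetical REST-style back end."""
    path = f"/{query.entity_class}?select={query.attribute}"
    for key, value in sorted(query.filters.items()):
        path += f"&{key}={value}"
    return path


def to_sql(query: LogicalQuery) -> str:
    """Render the same query as SQL text (illustration only, no escaping)."""
    sql = f"SELECT {query.attribute} FROM {query.entity_class}"
    if query.filters:
        conditions = [f"{k} = '{v}'" for k, v in sorted(query.filters.items())]
        sql += " WHERE " + " AND ".join(conditions)
    return sql


query = parse_question("How much disk space is left on db01?")
print(to_rest_path(query))
print(to_sql(query))
```

Because the parser produces only a `LogicalQuery`, supporting another underlying system in this sketch means adding one more renderer without touching the language-facing code.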

Watson for Technical Support Services
Improving the productivity of practitioners is a critical goal for IT services organizations. Today, support agents read numerous pieces of information, including search results, log files, and case history from disparate data sources, to suggest solutions to customer problems. They then need to assimilate this information and apply it quickly to resolve the issue. The experience and ability of agents vary, which affects the effectiveness of the entire process. The goal of this project is to build a system based on IBM Watson technology that effectively and efficiently diagnoses customer problems documented by call center agents and identifies the best action plan to resolve them. This work involves interesting and challenging problems in the areas of information retrieval, information extraction, machine learning, and data management.
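As a rough illustration of the information retrieval side of this problem, the sketch below ranks past tickets by their textual similarity to a new problem description, using bag-of-words cosine similarity from the Python standard library. This is a generic baseline chosen for brevity, not the method used in the project, and the ticket records and function names are invented for the example.

```python
import math
import re
from collections import Counter

# Invented example tickets; a real system would draw on actual case history.
PAST_TICKETS = [
    {"id": "T1", "description": "Server fails to boot after firmware update",
     "resolution": "Roll back firmware"},
    {"id": "T2", "description": "Disk array reports degraded RAID volume",
     "resolution": "Replace failed drive"},
]


def tokenize(text: str) -> list:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_tickets(problem: str, tickets: list) -> list:
    """Return (score, ticket) pairs, most similar ticket first."""
    query_vector = Counter(tokenize(problem))
    scored = [
        (cosine(query_vector, Counter(tokenize(t["description"]))), t)
        for t in tickets
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


for score, ticket in rank_tickets("RAID volume degraded on disk array", PAST_TICKETS):
    print(f"{ticket['id']}  {score:.2f}  {ticket['resolution']}")
```

A baseline like this only matches shared words; the information extraction and machine learning problems mentioned above are what would be needed to go beyond it.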