TAKMI - Text Analysis and Knowledge Mining


Hiroshi Kanayama, Tetsuya Nasukawa

TAKMI - Text Analysis and Knowledge Mining - overview

The aim of our text mining project is to research technologies that discover useful knowledge in enormous collections of documents, and to develop systems that present this knowledge and support users' decisions. Traditional data mining technologies mine knowledge from data structured with well-formed schemas, such as relational tables. Text data, however, has no such schema, and the information is described freely within the documents. We are therefore focusing on Natural Language Processing (NLP) technologies to extract the information. Using NLP technologies, target documents are transformed into collections of concepts, which are described using terms discovered within the texts.

The term "text mining" is often used to refer to a text search technique, but we think of text mining from a more functional perspective. Text mining technologies extract more information than keywords picked out of texts: they extract facts, authors' intentions, their expectations, and their claims. This knowledge is helpful for many applied tasks such as marketing, trend analysis, claim processing, or generating FAQs (frequently asked questions).

In this project, we are currently working primarily on two text mining solutions. One is for customer relationship management (CRM); it extracts useful information from call center logs for purposes such as marketing strategy. The other is for life science; it helps find hidden knowledge in large numbers of biomedical documents.
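As a rough illustration of how documents can be transformed into collections of concepts, the following Python sketch maps terms in a few invented call center notes to concepts and counts how many notes mention each one. It is a toy example under stated assumptions, not TAKMI's actual method: the dictionary, the category names, the sample notes, and the function names are all hypothetical, and simple dictionary lookup stands in for the NLP analysis the project describes.

```python
import re
from collections import Counter

# Hypothetical dictionary mapping surface terms to (category, concept) pairs.
# TAKMI's real dictionaries and categories are not described in this overview.
CONCEPT_DICTIONARY = {
    "battery": ("product_part", "battery"),
    "batteries": ("product_part", "battery"),
    "screen": ("product_part", "screen"),
    "display": ("product_part", "screen"),
    "refund": ("customer_request", "refund"),
    "replacement": ("customer_request", "replacement"),
}


def extract_concepts(text):
    """Return the set of (category, concept) pairs found in one document."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return {CONCEPT_DICTIONARY[t] for t in tokens if t in CONCEPT_DICTIONARY}


def count_concepts(documents):
    """Count how many documents mention each concept (document frequency)."""
    counts = Counter()
    for doc in documents:
        counts.update(extract_concepts(doc))
    return counts


# Invented sample notes, standing in for call center logs.
call_logs = [
    "Customer says the battery drains quickly and asks for a refund.",
    "Display is cracked; customer wants a replacement.",
    "Batteries overheat. Customer requests a refund.",
]

concept_counts = count_concepts(call_logs)
for (category, concept), n in concept_counts.most_common():
    print(f"{category}:{concept}\t{n}")
```

Because different surface terms ("battery", "batteries") map to the same concept, the counts describe what the notes are about rather than which exact words they contain. That is the difference between concept extraction and plain keyword matching.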

Software Product

IBM Content Analytics:
Many of our technologies are now available as an IBM product, IBM Content Analytics.