Since Nov 2009, I have been a member of the HLT group at IBM Research, India (Bangalore). My interests span several areas, including information retrieval, information extraction and text mining.
My M.Tech thesis with Prof Soumen Chakrabarti involved developing CSAW, a system for Curating and Searching the Annotated Web. CSAW annotates named entities in Web-scale text corpora and, where confident, connects these annotations with entries in an entity and type catalog such as Wikipedia. The semi-structured catalog, together with the unstructured corpus, forms a composite database that CSAW can then search using powerful reachability, proximity and aggregation primitives. CSAW comprises billions of annotation links between a 500-million-page web corpus and millions of entities known to Wikipedia. Prior to my IITB days, I worked on middleware applications for 3.5 years.
Personal Home Page