Martin Franz, Salim Roukos
SIGIR Forum (ACM Special Interest Group on Information Retrieval)
Searching online information is increasingly a daily activity for many people. The multilinguality of online content is also increasing (e.g., the proportion of English web users, which has been decreasing as a fraction of the growing population of web users, dipped below 50% in the summer of 2001). To improve the ability of an English speaker to search multilingual content, we built a system that supports cross-lingual search of an Arabic newswire collection and provides on-demand translation of Arabic web pages into English. The cross-lingual search engine supports a fast search capability (sub-second response for typical queries) and achieves state-of-the-art performance in the high-precision region of the result list. The on-demand statistical machine translation (SMT) uses the Direct Translation model along with a novel statistical Arabic Morphological Analyzer to yield state-of-the-art translation quality. The on-demand SMT uses an efficient dynamic programming decoder that achieves reasonable speed for translating web documents.
Deepak Ravichandran, Abraham Ittycheriah, et al.
NAACL-HLT 2003
K. Davies, R. Donovan, et al.
INTERSPEECH - Eurospeech 1999
A. Ittycheriah, L. Lita, et al.
NAACL-HLT 2003