Information visualization is an emerging and powerful technique for understanding language. Text visualization tools are immensely popular online for collaboration, analysis, and entertainment. Through our ethnographic research, we have found that visualization is also an important tool for natural language engineers. In a synthesis of information visualization and natural language processing research, my thesis work has approached the visualization of language from several perspectives, including visual text analytics and the use of visualization to explain computational linguistic models. In this talk I will present an overview of several recent projects.
With the ever-growing body of electronic text, opportunities for text visualization in information retrieval and text analytics are expanding. The DocuBurst visualization of document content uses the WordNet ontology as a basis for creating interactive visual document summaries, which can be explored to understand the content and character of very long electronic documents. While DocuBurst provides deep visualizations of a single document at a time, we have also created an analysis system for discovering and exploring the linguistic and content differences among the hundreds of thousands of decisions of the US Courts of Appeals. Early feedback from legal scholars has been quite positive.
Computational linguistics often uses statistical models of language: models with uncertainty built in. However, the end user often sees only the result of applying these models, presented as a single text string. I will describe our interactive visualization for exposing and explaining the uncertainty in statistical machine translation, as embedded within a cross-lingual chat system. To further aid computational linguistic analysis, we have also created a method for comparing and relating multiple 2D visualizations in a restricted 3D space.