Discourse Profiler: A Computer Tool for Documenting, Modeling and Quantifying Natural Language Texts

Linguistic work on natural language texts has until now lacked a serious software tool that integrates basic linguistic theory and discourse theory. With the introduction of the Discourse Profiler software package, linguists can analyze the entire context of specific discourse features.[1] This can be done by interacting with the text model, by quantifying the text, or both. The main features, and how each supports this analysis, are summarized in the list below:

  1. A visual model of a text is analogous to a roadmap. On a roadmap, cities of different sizes are represented by icons of different sizes. In a text model, different noun phrases can be identified by different shapes, such as circles and squares to contrast nominative case and accusative case. Colors on roadmaps are often used to contrast geographical features, such as blue for water. Colors in Discourse Profiler represent semantic or grammatical details, such as red for semantic agent and black for semantic patient. In addition to the basic participant tracking displayed in map-like format, syntactic and/or discourse information, such as same subject/different subject or event/non-event, can be traced parallel to the basic text grid. This allows a range of possibilities to be analyzed, and the interactive capabilities make it easy to try ‘what if’ scenarios rapidly, shifting between hypotheses to eliminate or elucidate patterns.
  2. A range of statistics that are often computed manually when analyzing texts is now automated, so the time saved can be put into analyzing a larger number of texts. The quantification ranges from various topic continuity statistics to basic counts of the number of noun phrases for each participant tracked.
  3. The annotation system includes a metatagging system that supports the integrated approach to text analysis described here. The tagging system is easily used in the Toolbox software that many linguists already use and are familiar with, and it provides a systematic method for documenting texts.
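
The kind of quantification mentioned in item 2 can be illustrated with a minimal Python sketch. It is not Discourse Profiler's own implementation or data format: the clause representation, the participant labels, and the function names are all hypothetical, and the gap measure is only a simplified stand-in for the topic continuity statistics the package computes.

```python
from collections import Counter

# Hypothetical input (not Discourse Profiler's format): one list per clause,
# holding a participant label for each noun phrase in that clause.
clauses = [
    ["hunter", "deer"],
    ["hunter"],
    ["deer"],
    ["village"],
    ["hunter", "village"],
]


def noun_phrase_counts(clauses):
    """Count the noun phrases that refer to each tracked participant."""
    counts = Counter()
    for clause in clauses:
        counts.update(clause)
    return counts


def mention_gaps(clauses):
    """For each participant, list the number of clauses between one
    mention and the next (a simplified continuity measure)."""
    last_seen = {}
    gaps = {}
    for index, clause in enumerate(clauses):
        # Use a set so that two noun phrases in one clause do not yield a gap of 0.
        for participant in set(clause):
            if participant in last_seen:
                gaps.setdefault(participant, []).append(index - last_seen[participant])
            last_seen[participant] = index
    return gaps


counts = noun_phrase_counts(clauses)
gaps = mention_gaps(clauses)
```

In this invented example, a participant whose gaps are small is mentioned continuously, while large gaps suggest the participant drops out of the text and is later reintroduced.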



[1] Earlier beta versions were called Multilinear Discourse Analysis (MDA).  The first beta version was demonstrated in 1996 (see Quick 1996).