Metatagging System

The heart of the Discourse Profiler program is a specially designed tagging system that allows different analytical procedures to be run from a single database.  For users already familiar with the Linguist's Toolbox database management program, it is fairly straightforward to add fields that can be mined by the Discourse Profiler program.  Tagging the fields that the analyst wants to compare in the span analyses is quite simple: for each standard format marker, an abbreviation, word or short phrase is given.  For word order, for example, the entry looks something like:

\wo SVO
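
As a rough illustration of how such marker-and-value lines might be read by a script, the following Python sketch splits each line into its standard format marker and its value.  It is not the code used by Toolbox or Discourse Profiler; the function names parse_sfm_line and parse_record are hypothetical, and the sketch assumes one marker per line with the value following after whitespace.

```python
def parse_sfm_line(line):
    """Split a line such as '\\wo SVO' into (marker, value).

    Returns None for lines that do not start with a backslash marker.
    Hypothetical helper, not part of Toolbox or Discourse Profiler.
    """
    line = line.strip()
    if not line.startswith("\\"):
        return None
    parts = line[1:].split(None, 1)  # marker, then the rest of the line
    if not parts:
        return None
    marker = parts[0]
    value = parts[1].strip() if len(parts) > 1 else ""
    return marker, value


def parse_record(lines):
    """Collect the marker/value pairs of one record into a dictionary."""
    fields = {}
    for line in lines:
        parsed = parse_sfm_line(line)
        if parsed is not None:
            marker, value = parsed
            fields[marker] = value
    return fields


# Example record using the word-order field from the text.
record = parse_record(["\\wo SVO"])
print(record["wo"])
```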

For other syntactic or discourse information, enter words (or their abbreviations) such as ‘event’ or ‘non-event’, ‘same subject (SS)’ or ‘different subject (DS)’.  The span analyses handle simple binary contrasts, or clusters of values that you later group into two categories in the Discourse Profiler program (this also makes it easy to try ‘what if’ scenarios).  For each participant or NP that the analyst wants to track, five pieces of information are entered in a special participant field, as shown below:

\p1 A_S1_N1+_S

The abbreviations given in this field for participant one are only representative; many other possibilities exist.  In this case, ‘A’ represents the agent or actor; ‘S1’ represents the first of four possible categories for subject continuity (following Levinsohn 2003); ‘N1’ represents a basic noun phrase in the Absolute case in the language I work in, as opposed to the Genitive case (marked as ‘N2’); ‘+’ indicates that this participant is definite as opposed to indefinite; and ‘S’ indicates that it is the grammatical subject.  Once a text has been tagged in Toolbox, these metatag abbreviations are entered in the Discourse Profiler program.  Discourse Profiler then mines the Toolbox database and creates a new database that is used for all other analytical purposes described here.
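
To make the structure of the participant field concrete, the following Python sketch breaks a value such as A_S1_N1+_S into its five pieces of information.  It is only an illustration under stated assumptions, not the way Discourse Profiler itself reads the field: the names ParticipantTag and parse_participant are hypothetical, the underscore is assumed to be the separator, and definiteness is assumed to be a trailing ‘+’ on the noun phrase code.  Since the marker for an indefinite participant is not specified here, any value without a trailing ‘+’ is simply treated as not definite.

```python
from typing import NamedTuple


class ParticipantTag(NamedTuple):
    """The five pieces of information in a participant field (hypothetical type)."""
    role: str        # e.g. 'A' for agent or actor
    continuity: str  # e.g. 'S1', one of four subject-continuity categories
    np_form: str     # e.g. 'N1' (Absolute case) or 'N2' (Genitive case)
    definite: bool   # True when the noun phrase code ends in '+'
    relation: str    # e.g. 'S' for grammatical subject


def parse_participant(value: str) -> ParticipantTag:
    """Parse a participant field value such as 'A_S1_N1+_S'."""
    parts = value.strip().split("_")
    if len(parts) != 4:
        raise ValueError(f"expected 4 underscore-separated parts, got {value!r}")
    role, continuity, np_and_definiteness, relation = parts
    # Assumption: only a trailing '+' signals a definite participant.
    # Any other trailing marker is left attached to the noun phrase code.
    definite = np_and_definiteness.endswith("+")
    np_form = np_and_definiteness[:-1] if definite else np_and_definiteness
    return ParticipantTag(role, continuity, np_form, definite, relation)


# Example using the participant one field from the text.
tag = parse_participant("A_S1_N1+_S")
print(tag)
```

Keeping the five pieces as named fields, rather than as a single string, makes it easier to group or compare participants on any one of them, which is in the spirit of the binary groupings described for the span analyses.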