
2. Fundamentals

John W. DuBois edited this page Aug 2, 2019 · 24 revisions

Data Structures
Word
Unit
Prosodic Sentence
Side
Discourse
Link
Chain
Tier
Gap
Move
State
Syntax
Dialogic syntax
Linear syntax
Graph
Graph stats
Text Annotation Graphs

Rezonator is founded on several basic principles, the most important of which is resonance. Resonance is a relationship between elements, not a property of the elements themselves. As such it must be analyzed as it arises, in the context of language use. Resonance can be static or dynamic; predictable from the properties of the language or context, or improvised ad hoc in the dialogic moment. Rezonator seeks to represent all kinds of resonance: whatever is perceivable as resonance by the participants themselves. Resonance relationships may arise along any dimension of language; indeed, along multiple dimensions at once. Analytical tools with maximum flexibility are needed for representing the full complexity of the array of resonant relationships. Among the most fundamental tools for creating a markup of resonance in Rezonator are the link and the chain, described in the following sections.

Data structures

The following data structures are intended for importing spoken discourse data into Rezonator (and for export as well). They are for processing spoken discourse materials transcribed according to the conventions of Discourse Functional Transcription (DFT), such as the Santa Barbara Corpus. Input data may come from XML files, plain text, etc. The data structures can be realized in tabular formats for use with Rezonator (such as CSV files, Pandas dataframes, etc.).

A key feature of the prosody-grammar analysis used here is the "place value" of each event in the discourse. This measures the location of an event (e.g. word, laugh, vocalism, pause) relative to the prosodic landmarks defined by the boundaries of the intonation unit. Thus for a given word, it may prove useful to know its location relative to the beginning of the intonation unit. By the same token, it may be equally valuable to know its location relative to the end of the intonation unit.

Word

This table is designed to represent the key features of each word in the discourse. This table is intended for importing and exporting data, so the same structure (more or less) should be used for files of the type word.csv.

Note that the so-called "word" table actually includes all tokens. As used here, "token" means roughly anything in the transcription that is bounded by whitespace. Thus, the so-called "word" field includes not only all real words, but also pause, laughter, breath, grunts (= vocalisms), transcriber comments, endNote (prosodic closure, marked at the end of the intonation unit by comma, period, question mark, etc.), and so on.

In addition to the usual focus on identifying the grammatical features of a word, such as its "class" (=part of speech), there is detailed attention to timing and prosody. Information about word timing will be incorporated in due course; for the present, the only information available is the start time and end time for each intonation unit (as measured in seconds, from the beginning of the conversation). (For pauses, lenSec is the length of the pause in seconds.)

For spoken discourse, the nature of events and their timing are both critical. Indeed, events and time are intertwined. An event is anything that takes place in time: it has a start time, an end time, and a (non-zero) duration. From a prosodic perspective it is important to pay attention to all vocal and gestural events, including words, laughs, breaths, pauses, and so on. One way to evaluate the timing of words (and other discourse events) is to measure where they occur relative to a critical landmark: in this case, relative to the start and end boundaries of the intonation unit. We can call this the "place" value of a word or an event. There are at least three ways of representing place value for events in the intonation unit.

  1. Place. The first method counts words, assigning an integer value (1,2,3...) to each word as its "Place" value. That is, starting from the first word of the intonation unit (the "left edge"), an integer value is assigned for each word. The "place value" of the first word of the Unit is "1", the place value of the second word is "2", etc.
  2. Back. The second method treats the end of the intonation unit (the "right edge") as the landmark, and measures distance from there. It counts off the place value for each word relative to the last word of the intonation unit, in negative integers, assigning "-1" to the last word of the intonation unit, "-2" to the second to last word, and so on.
  3. Order. The most general method counts all events, assigning an integer value (1,2,3...) to each token as its "Order" value. That is, starting from the beginning of the intonation unit (the "left edge"), an integer value is assigned for each token (whether a word, breath, laugh, pause, etc), beginning with "1".

When combined with the two index values for the discourse and the intonation unit, "Order" yields a uniquely identifiable index for each token in the corpus, including each word. This index is a three-part combination of the fields "discoID", "uID", and "Order".
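The three place values can be sketched in a few lines of Python. This is a minimal illustration, not Rezonator's own code: the function names `place_values` and `token_key` are hypothetical, and the sketch assumes that each token's "kind" is already known and that non-word tokens simply get no Place or Back value.

```python
WORD_KIND = "word"  # assumption: the "kind" value that marks a real word


def place_values(kinds):
    """Return (order, place, back) for each token of one intonation unit.

    kinds: list of token kinds in utterance order,
           e.g. ["pause", "word", "word", "endnote"].
    Non-word tokens get None for place and back.
    """
    n_words = sum(1 for kind in kinds if kind == WORD_KIND)
    rows = []
    words_seen = 0
    for order, kind in enumerate(kinds, start=1):  # Order counts every token
        if kind == WORD_KIND:
            words_seen += 1
            place = words_seen                 # counts from the left edge
            back = words_seen - n_words - 1    # last word gets -1
        else:
            place = back = None
        rows.append((order, place, back))
    return rows


def token_key(disco_id, u_id, order):
    """Three-part index that identifies one token in the corpus."""
    return (disco_id, u_id, order)
```

For a unit consisting of a pause, two words, and an endnote, the two words receive Place 1 and 2 and Back -2 and -1, while Order runs from 1 to 4 over all four tokens.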

BILUO scheme. An alternative notation for annotating the relation of tokens to units is the BILUO scheme (Ratinov and Roth 2009). This shows where the token appears within the unit currently being coded. It is useful for machine learning, and is used by many NLP processes, including Spacy. It can be used to annotate the Word table for several different levels of units or features (e.g. intonation unit, phrase, prosodic sentence, overlap, vox, etc.). BILUO values are written in CAPS for greater distinctiveness and legibility (when reading down a column of BILUO entries).

Tag Meaning Description
B begin The first token of a multi-token unit
I in An inner token of a multi-token unit
L last The final token of a multi-token unit
U unit A single-token unit
O out A token that is not a unit of the type being annotated
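As a rough sketch of how BILUO values could be derived, the following Python function tags a sequence of tokens given the unit each token belongs to. The function name `biluo_tags` and the input format (one unit identifier per token, `None` for tokens outside any unit of the type being annotated) are assumptions made for illustration.

```python
def biluo_tags(unit_ids):
    """Assign a BILUO tag to each token.

    unit_ids: one entry per token, giving the id of the unit the token
              belongs to, or None if the token is outside any unit.
              Consecutive tokens with the same id form one unit.
    """
    tags = []
    last_index = len(unit_ids) - 1
    for i, uid in enumerate(unit_ids):
        if uid is None:
            tags.append("O")
            continue
        starts = i == 0 or unit_ids[i - 1] != uid
        ends = i == last_index or unit_ids[i + 1] != uid
        if starts and ends:
            tags.append("U")   # single-token unit
        elif starts:
            tags.append("B")   # first token of a multi-token unit
        elif ends:
            tags.append("L")   # final token of a multi-token unit
        else:
            tags.append("I")   # inner token
    return tags
```

The same function can be applied once per level (intonation unit, overlap, etc.), yielding one BILUO column per level in the Word table.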

Sort order. The default sort order for the Word table is based on 3 fields: discoID, uID, wID. (A more or less equivalent sort is discoID, unitStartTime, Order; Place or the startTime of the Word may be used in place of Order.) But to identify the prosodic sentences (e.g. pSentID), it is necessary to use a special sort, sorting by "Side": discoID, pID, uID, wID. (Or: discoID, pID, unitStartTime, Place.)
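The two sort orders can be illustrated with plain Python dictionaries standing in for rows of the Word table. The sample rows below are invented for the example; only the field names come from the table definition.

```python
from operator import itemgetter

# Invented sample rows: speaker 1 produces units 1 and 3, speaker 2 unit 2.
tokens = [
    {"discoID": "SBC001", "uID": 2, "wID": 3, "pID": 2},
    {"discoID": "SBC001", "uID": 3, "wID": 4, "pID": 1},
    {"discoID": "SBC001", "uID": 1, "wID": 2, "pID": 1},
    {"discoID": "SBC001", "uID": 1, "wID": 1, "pID": 1},
]

# Default sort: temporal order of the conversation.
default_sort = sorted(tokens, key=itemgetter("discoID", "uID", "wID"))

# "Side" sort: all of one participant's tokens first, then the next participant's.
side_sort = sorted(tokens, key=itemgetter("discoID", "pID", "uID", "wID"))
```

With the Side sort, the units of a single speaker become adjacent, which is what makes it possible to concatenate them into prosodic sentences.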

Word grid structure

Field Description
discoID Index value for each discourse; a unique value for each conversation in the corpus (e.g. SBC001, SBC002, SBC003... in the Santa Barbara Corpus)
uID Index value of the intonation unit that the current word belongs to. (Intonation units are numbered 1,2,3... per each conversation.) (NOTE: This is a corpus value, with numbering restarting at 1 for each conversation. This is different from the unitID generated internally by Rezonator.)
wID Index of the current token. (Tokens, including words, breaths, pauses, etc., are numbered 1,2,3... per each conversation.) (NOTE: This is a corpus value, with numbering of tokens restarting at 1 for each conversation. This is different from the wordID generated internally by Rezonator.)
pID 1, 2, 3... (participantID)
speaker name of speaker, participant, or agent
word word form ("clean" spelling, as shown in the "w" field of the SBC XML file)
text detailed transcription of the word (as shown in the "dt" field of the SBC XML file)
lemma lemma of the word
kind {word, pause, breath, laugh, grunt, action, endnote, prosody, tag, comment, unknown, other}
pos part of speech {noun, verb, adjective, adverb, conjunction, interjection}
tag {NN [noun], JJ [adjective], ...} (using tags as in Spacy [cf. Penn TreeBank])
clitic FALSE (TRUE if the word is a contraction)
morph morpheme analysis: prefix, suffix
gloss translation of word into contact language, or gloss according to Leipzig Glossing Conventions
phones phonemic transcription (automatic) in the International Phonetic Alphabet (IPA, Unicode)
break break word (truncated word, cut-off word)
closure {open, close, break} (Based on endNoteChar: open = "," close = "." close = "?" break = "--" )
pause Is this token a pause? (1 if it's a pause, 0 if it's anything else)
microPause Is this token a microPause? (1 if it's a microPause, 0 if it's anything else)
overlap BILUO scheme (to determine BILUO, check if text contains "[" or "]" or "|" etc.)
quality BILUO scheme (applies to quality/manner notation with etc.)
qualityOffsets count off the characters in the word
startTime start time for the word (or other token) in seconds
endTime end time for the word (or other token) in seconds
order Index value for each token in the current intonation unit, counting up from the first token (1, 2, 3...)
place Index value for each word in the current intonation unit, counting up from the first word (1, 2, 3...), and ignoring anything that is not a word.
back Reverse index value for each word in the current intonation unit, counting down (backwards) from the last word of the current intonation unit (-1,-2,-3), and ignoring anything that is not a word.
pSent prosodic sentence ID: integer identifier (for each conversation, restart numbering at 1)
sent syntactic sentence ID: integer identifier (for each conversation, restart numbering at 1)
turn turn ID: integer identifier (for each conversation, restart numbering at 1)
lenSec length of the word (or other token) in seconds
wordTempo ratio of token duration to average for this word type
unitStartTime start time for the intonation unit
unitEndTime end time for the intonation unit
unitLenWords count of total words (not tokens) in the current unit

Unit

This table is designed to represent the key features of each unit (e.g. intonation unit) in the discourse. This table is intended for importing and exporting data, so it should match fairly closely to the structure of the file unit.csv.

The pSentID represents an integer index value that is generated in order to identify which prosodic sentence the current intonation unit belongs to (for details, see the discussion of the Prosodic Sentence below.)

Field Description
discoID Index value for each discourse; a unique value for each conversation in the corpus (e.g. SBC001, SBC002, SBC003... in the Santa Barbara Corpus)
uID Index value of the current intonation unit. (Intonation units are numbered 1,2,3... per each conversation.) (NOTE: This is a corpus value, with numbering restarting at 1 for each conversation. This is different from the unitID generated internally by Rezonator.)
pID 1, 2, 3... (participantID)
speaker name of speaker, participant, or agent
startTime start time for the unit in seconds
endTime end time for the unit in seconds
startNote na
endNoteChar {"," , "." , "--" , "?" }
closure {open, close, break} [open = "," close = "." close = "?" break = "--" ]
lenSec length of the unit in seconds
lenWords length of the unit in words
lenTokens length of the unit in tokens
pauseCount count of pauses in this unit (sum of values in Word table)
microPauseCount count of micropauses in this unit (sum of values in Word table)
pSentID integer identifier indicating which prosodic sentence the current intonation unit belongs to
gapSec gap between this unit and the immediate previous unit, uttered by any speaker, in seconds
pauseSec pause (silence) that is internal to the current unit, i.e. silent pauses (sum in seconds)
words "clean" version of the text of the intonation unit, containing the words in sequence, but not prosody, manner, etc. (Word truncation and inaudible words should appear, however.)
text text of the intonation unit as it appears in the original transcription (including words, pauses, lag, overlap, prosody, etc.)

Prosodic Sentence

This table is designed to represent the key features of each prosodic sentence in the discourse. This table is designed for importing and exporting data, so it should match the structure of the file pSent.csv. The prosodic sentence identifier ("pSentID") represents an integer value that is generated in order to uniquely identify each prosodic sentence.

Implicit in the organization of this data structure (and of the unit structure as well) is a strategy for identifying prosodic sentences. Prosodic sentences are constructed by concatenating one or more intonation units, based on their EndNote (open, close, etc. as denoted by comma, period, etc.). One reason for identifying prosodic sentences, beyond their own intrinsic interest, is to improve the tagging and parsing of the data by providing a larger context beyond just the intonation unit (which often represents just a fragment of a clause).

Prosodic sentences are derived from the existing DFT transcriptions, based on the closure status of the endnote (cf. transitional continuity or endtone) of each intonation unit. Intonation units that are "open" (marked by comma) are concatenated, until a "close" intonation unit is reached; this becomes the final intonation unit in the prosodic sentence. That is, the pSentID number is continued for one or several intonation units, as long as each unit's endNote value is "open", and until a unit is encountered with an endNote value of "close". The result is that all the intonation units that make up a given prosodic sentence will have the same pSentID.

The most effective way to construct prosodic sentences automatically (i.e. barring a return to the audio data) is to process the Word table, which contains the relevant values for EndNote (a.k.a. "end tone" or "transitional continuity" [TC]). An algorithm that processes the EndNote value allows the assignment of each intonation unit (in the Unit table) to a unique prosodic sentence. Each time a prosodic sentence is identified, an integer ID value (pSentID) is generated for it. This pSentID is written first into the Unit table. Starting from the enriched Unit table, we use this to construct the prosodic sentence table. We can then select all the elements (words, etc.) that belong to a given prosodic sentence.
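A minimal sketch of such an algorithm is given below, under stated assumptions. The function name `assign_psent_ids` is hypothetical. The sketch tracks one unfinished prosodic sentence per participant (following the "Side" sort described for the Word table). The treatment of "break" units is not specified above, so it is exposed as a parameter; by default a break is assumed to end the prosodic sentence.

```python
def assign_psent_ids(units, break_closes=True):
    """Assign a pSentID to each intonation unit of one conversation.

    units: list of dicts with keys "pID" and "closure"
           ("open", "close", or "break"), in temporal order.
    break_closes: assumption, whether a "break" unit ends the sentence.
    Returns a list of pSentID values parallel to units (numbering from 1).
    """
    next_id = 1
    open_sentence = {}  # pID -> pSentID of that speaker's unfinished sentence
    ids = []
    for unit in units:
        pid = unit["pID"]
        if pid not in open_sentence:
            # This speaker has no unfinished sentence: start a new one.
            open_sentence[pid] = next_id
            next_id += 1
        ids.append(open_sentence[pid])
        closure = unit["closure"]
        if closure == "close" or (closure == "break" and break_closes):
            # The sentence ends with this unit.
            del open_sentence[pid]
    return ids
```

Because the open sentence is tracked per participant, an intervening unit by another speaker does not interrupt the pSentID of the current speaker's prosodic sentence.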

To provide evidence supporting the location (and strength) of each prosodic sentence boundary, it is also useful to collect information about the gaps (if any) that precede and follow a given prosodic sentence. The Gap concerns the "empty" space between same-side prosodic sentences: that is, the gap between the current prosodic sentence and the next prosodic sentence in the same side (by the same speaker). The gap between one same-side prosodic sentence and the next may consist of a few seconds of silence, or words spoken by a different speaker. Either way, it can be measured, whether in seconds or words. Collecting data on gaps has the potential to yield important insights about the production and processing of intonation units, prosodic sentences, and turns.

The default sort order for the prosodic sentence table should be (1) discoID (2) startTime. Sorting by the start time will put the prosodic sentences in the temporal sequence in which they were uttered in the conversation. (Another useful sort is by participant or side, i.e. by pID.)

Field Description
discoID Index value for each discourse; a unique value for each conversation in the corpus (e.g. SBC001, SBC002, SBC003... in the Santa Barbara Corpus)
pID participantID
speaker name of speaker, participant, or agent
pSentID prosodic sentence ID: integer identifier (for each conversation, restart numbering at 1)
startTime start time of prosodic sentence
endTime end time of prosodic sentence
lastEndNote endNote (comma, period, question mark, double hyphen) of the last intonation unit in the prosodic sentence
lastClosure closure (open, close, break) of the last intonation unit in the prosodic sentence
pauseCount count of pauses in this prosodic sentence (sum of values in Word or Unit table)
microPauseCount count of micropauses in this prosodic sentence (sum of values in Word or Unit table)
lenSec length of the prosodic sentence in seconds
lenWords length of the prosodic sentence in words
lenUnits length of the prosodic sentence in units (e.g. intonation units)
gapSec gap between the current and immediately prior prosodic sentence by any speaker, in seconds
gapWords gap between the current and immediately prior prosodic sentence by any speaker, in words (spoken by other speakers)
pauseSec total pauses (silence) that are internal to the current prosodic sentence, i.e. silent pauses (sum in seconds)
words "clean" version of the text of the prosodic sentence, containing the words from each intonation unit in the prosodic sentence, plus the unit-final prosody for each intonation unit (that is, the "endNote", e.g. comma, period, question mark, or double hyphen [for IU truncation]). (Do not include "not applicable" [n/a or nan, etc.] unless there are no actual words in the whole prosodic sentence.)
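Once each unit carries a pSentID, several fields of the prosodic sentence table can be obtained by aggregating over the Unit table. The sketch below covers only a subset of the fields. The function name `build_psent_rows` is hypothetical, and computing lenSec as endTime minus startTime is an assumption.

```python
def build_psent_rows(units):
    """Aggregate Unit rows (one conversation) into prosodic sentence rows.

    units: list of dicts with keys "pSentID", "pID", "startTime",
           "endTime", "lenWords", "closure", in temporal order.
    Returns a list of dicts sorted by startTime.
    """
    rows = {}
    for unit in units:
        row = rows.setdefault(unit["pSentID"], {
            "pSentID": unit["pSentID"],
            "pID": unit["pID"],
            "startTime": unit["startTime"],
            "endTime": unit["endTime"],
            "lenWords": 0,
            "lenUnits": 0,
            "lastClosure": None,
        })
        row["startTime"] = min(row["startTime"], unit["startTime"])
        row["endTime"] = max(row["endTime"], unit["endTime"])
        row["lenWords"] += unit["lenWords"]
        row["lenUnits"] += 1
        row["lastClosure"] = unit["closure"]  # closure of the latest unit seen
    result = sorted(rows.values(), key=lambda r: r["startTime"])
    for row in result:
        # Assumption: sentence length is the span from first start to last end.
        row["lenSec"] = row["endTime"] - row["startTime"]
    return result
```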

Side

This table is designed to represent the key features of each side in the conversation (that is, the set of all utterances produced by a given speaker, but only by that speaker, and only in that conversation).

Field Description
discoID Index value for each discourse; a unique value for each conversation in the corpus (e.g. SBC001, SBC002, SBC003... in the Santa Barbara Corpus)
sideID unique ID for each side of the conversation (for each conversation, restart numbering at 1)
pID participantID
speaker name of speaker, participant, or agent for this side
lenSec length of the side in seconds
lenWords length of the side in words

Discourse

This table is designed for annotating important details about the discourse (e.g. conversation, business meeting, lecture, etc.)

Field Description
discoID Index value for each discourse; a unique value for each conversation in the corpus (e.g. SBC001, SBC002, SBC003... in the Santa Barbara Corpus)
title [Descriptive title of this discourse]
corpus "Santa Barbara Corpus of Spoken American English"
corpusShort "SBC"
[misc] (Various details about the speech event and recording, in several fields)
lenSec length of the discourse in seconds
lenWords length of the discourse in words

Link

The concept of a link represents a powerful tool for showing relationships of all kinds in Rezonator. Links include traditional relationships between two words, such as:

  • coreference between a referent and its antecedent
  • resonance between a word in one sentence (target) and a prior word (base) in a parallel sentence
  • syntactic dependency between a head and its dependent (that is, a head and its modifier)
  • linear sequence between the first word and the second word of a clause

The key to the Link grid is the relationship between source (first node), goal (second node), and role (the quality of the relationship between source and goal, which labels the link or edge).

Crucially, Links inherit their Tier from the Chain they belong to (along with their Plane and SubTier; see details under Chain).

Name Description
linkID unique integer identifier for each link in the database (independent of the type of link, chain, tier, etc.)
chainID unique integer identifier specifying which chain this link belongs to (unique for each chain in the database)
source node from which the link begins (e.g. head word) (by default, the first word the user clicks on)
goal node to which the link connects (e.g. dependent word, modifier) (by default, the second word the user clicks on)
role relation between source and goal (e.g. syntactic dependency type, thematic role, etc.)
tier type of link
tags list of features or qualities, which are attributes of the link and/or relation
arrow the link is directed from source (first) to goal (second) (default)
invert arrow direction is the reverse of the default, from goal (second) to source (first) (default: invert = FALSE)
dead current state of the link, dead or live (default: dead = FALSE). Dead links are created when the user deletes a link (but Rezonator retains a record of it, in order to keep clickTime information, etc.)
auto current state of the link, automatic or user-generated (default: auto = FALSE). (auto = link generated by Rezonator, as part of re-sorting of links in a chain)
sourceClickTime timestamp when user began marking the link (e.g. mouseDown on source word)
goalClickTime timestamp when user completed marking the link (e.g. mouseUp on goal word)
roleTime timestamp when user specified role/relation
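One way to picture a row of the Link grid is as a small record type. The following Python dataclass is only a sketch: the field names follow the table above, but the types, the default values for arrow and invert, and the helper method `direction` are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Link:
    linkID: int
    chainID: int
    source: int                  # assumption: nodes are identified by word ID
    goal: int
    role: str = ""
    tier: str = ""
    tags: List[str] = field(default_factory=list)
    arrow: bool = True           # assumption: links are directed by default
    invert: bool = False         # assumption: not inverted by default
    dead: bool = False
    auto: bool = False
    sourceClickTime: Optional[float] = None
    goalClickTime: Optional[float] = None
    roleTime: Optional[float] = None

    def direction(self) -> Optional[Tuple[int, int]]:
        """Return (from_node, to_node), or None for an undirected link."""
        if not self.arrow:
            return None
        if self.invert:
            return (self.goal, self.source)
        return (self.source, self.goal)
```

For example, `Link(linkID=1, chainID=1, source=7, goal=11)` points from word 7 to word 11, and setting `invert=True` reverses the arrow without changing which word was clicked first.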

Chain

A chain is built up from a set of links, which may be directed ("arrow") or undirected. Each chain is identified by a unique chainID.

Several pairs of links can be combined to form a series. For example, word 7 links to word 11, then 11 links to 14, 14 links to 23, and so on. When several such links are made in succession, the result is a chain. Each chain gets its own unique identifier (chainID, an index value). For the most part, the chain takes its character from the links that make it up, that is, from the linked elements it contains. That is, the raison d'etre of a chain is the words (or other elements) that are linked together to form the chain. In addition, the chain may have some properties of its own (such as chainType or Tier). Chains can be sorted according to various properties of the elements in the chain (e.g. the startTime of the words in the chain).

Links and chains in Rezonator can be thought of in terms of graph theory concepts.

  • Links are edges (or arcs), which specify the relationship between two nodes (or vertices, or points, representing e.g. words).
  • Links can define an "arrow" which is "directed" (pointing from source to goal) or "undirected" (symmetric or bidirectional).
    For more information, see Graphs.

Chains can be sorted (up or down) according to the properties of the elements they contain. For example, the words in a chain may be sorted by the startTime for each word, by alphabetical order, by length (in words, letters, or phonemes), by the user's clickTime, and so on. In addition, the links in a chain may be sorted according to qualities of the links themselves (link type, role).
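The chain example above (7 links to 11, 11 to 14, 14 to 23) and the sorting of chain elements can be sketched as follows. The function names `chain_words` and `sort_chain` are hypothetical, and the word properties in the usage example are invented.

```python
def chain_words(links):
    """Collect the word IDs of a chain, in order of first mention.

    links: list of (source, goal) pairs belonging to one chain.
    """
    seen = []
    for source, goal in links:
        for word_id in (source, goal):
            if word_id not in seen:
                seen.append(word_id)
    return seen


def sort_chain(word_ids, words, key="startTime", reverse=False):
    """Sort the words of a chain by one of their properties.

    words: dict mapping word ID -> dict of word properties.
    key:   property to sort by, e.g. "startTime" or "word".
    """
    return sorted(word_ids, key=lambda w: words[w][key], reverse=reverse)


# Usage with invented word properties:
links = [(7, 11), (11, 14), (14, 23)]
words = {
    7: {"word": "they", "startTime": 3.2},
    11: {"word": "them", "startTime": 4.0},
    14: {"word": "their", "startTime": 5.1},
    23: {"word": "it", "startTime": 7.8},
}
by_time = sort_chain(chain_words(links), words)
by_alpha = sort_chain(chain_words(links), words, key="word")
```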

The Chain grid contains relatively little information of its own, since most information is contained in the Links, or in the linked elements themselves (e.g. words).

Name Description
chainID unique integer identifier for the current chain (sort by chainID, then linkID)
tier subcategory of planes: semantic rez, pragmatic rez, phonetic rez, etc. (Tiers of the Rez plane)
subTier subcategory of tiers: rhyme, alliteration (subTiers of the Phonetic tier)
plane general type of chain: rez, track, stack...
tags list of tags representing features or qualities of the chain
dead current state of the chain, dead or live (default: dead = FALSE). Dead chains are created when the user deletes all the links (but Rezonator retains a record of it, in order to keep clickTime information, etc.)
sortOrder norm, down, up

Tier

Chains may be assigned to Tiers. Different Chain types (since Chains may be of any kind) may be interpreted hierarchically or non-hierarchically. For example, one version of a hierarchical structure would recognize Planes (Rez, Track, Stack...) as the highest level. Planes can in turn be subcategorized into Tiers and subTiers. Alternatively, the relationship between tiers can be more ad hoc.

This section needs to be developed.

Gap

For every link, there is (potentially) a gap, which has interesting properties of its own. For example, when a resonance link is recognized between a word in line 7 and a prior word in line 4, there is a gap: a set of words that occurs between them. This set of words may have properties which shed light on the link itself (e.g. more words in the gap may imply greater demands on memory, and hence lower accessibility for the second word of the link). In Rezonator, the concept of gap is always tied to an explicit link: we only speak of a gap in relation to a link that has already been specified. Because every gap implies a unique link, we can identify any gap using its corresponding linkID.

Gaps occur between the two elements in a link. The gap is defined by its link, and thus inherits the linkID. The gap may have interesting properties, such as how long it is (measured in words, seconds, intonation units, etc.), and what words are included within it (e.g. pronouns of the same gender as the linked pronouns, which may compete with them for anaphora).

Name Description
linkID Each entry in the gapGrid corresponds to a unique link, with its own linkID
gapListWords list of words (if any) occurring in the gap between source and goal (denoted by wordID)
gapListTokens same, but for all tokens (words, pause, laugh, grunt...) (denoted by wordID)
gapWordCount number of words in the gap between source and goal (integer value)
gapTokenCount number of tokens (words, pause...) in gap between source and goal (integer value)
gapUnitCount number of intonation unit boundaries in the gap (integer value)
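The gap fields could be computed from the Word table roughly as follows. This is a sketch under assumptions: the function name `gap_for_link` is hypothetical, tokens are identified here by the corpus wID (rather than Rezonator's internal wordID), both ends of the link lie in the same conversation, and uID values increase by 1 from one intonation unit to the next.

```python
def gap_for_link(tokens, source_wid, goal_wid):
    """Compute gap fields for a link between two tokens.

    tokens: list of dicts with keys "wID", "uID", "kind",
            covering one conversation.
    """
    first, last = sorted((source_wid, goal_wid))
    unit_of = {t["wID"]: t["uID"] for t in tokens}
    # Tokens strictly between the two linked elements.
    between = [t for t in tokens if first < t["wID"] < last]
    word_ids = [t["wID"] for t in between if t["kind"] == "word"]
    token_ids = [t["wID"] for t in between]
    return {
        "gapListWords": word_ids,
        "gapListTokens": token_ids,
        "gapWordCount": len(word_ids),
        "gapTokenCount": len(token_ids),
        # Number of unit boundaries crossed between the two linked elements.
        "gapUnitCount": unit_of[last] - unit_of[first],
    }
```

Because the two wIDs are sorted before use, the result is the same whether the user clicked the earlier or the later word first.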

Move

The functions that are needed by Rezonator users (players, researchers, analysts, and others) may be thought of in terms of Moves, Modes, States, and Tiers.

  • A move is an action performed by a user (player, researcher, analyst).
  • A mode is a state of the Rezonator, corresponding to an activity that is relevant to the user's current situation, relative to the ongoing game or research activity, etc.
  • A state is a value of an object (e.g. cursor, word, link, chain, pane, etc.) that reflects the currently relevant status of the object and its associated activity or context. As such, a key role of any state is that it defines a set of currently available options for the user.
  • A tier is a set of related items of information, representing a certain layer of information, and containing related types of values. Tiers are of two basic types: entity tiers, and link tiers.

State

State is a tuple that defines the currently activated state of a given word, line, stack, or other object, along several key dimensions. The state of an object changes dynamically based on user actions, and is frequently updated in the display (for example, by highlighting a word that has been selected by the user). States are sensitive to user input (such as hovering vs. clicking on an object) and to the current mode, environment, user activity, or game state.

It is useful to think of each state variable as defining a distinct set of alternatives (a paradigm to choose from). For example, at any given moment, a word object will have exactly one value for its state of Focus: it is Focused, Inactive, etc. In addition, the same object will have a value for Play, saying whether it should currently be played, muted, etc. Other State parameters include Shape (the shape of the object as displayed on the screen) and Place (where a word appears, relative to its usual location in the sequence of words).

State Description
Focus Activation of a word, unit, or line of text in the current context
plain Standard format: visible, normal, neutral appearance
hover Responds to proximity of user's cursor, when hovering over item
select unique focus of user's current attention (as when user clicks on a word to select it)
focus secondary focus, for elements that accompany the Select element
inert Inactive, grayed out
--- ---
Shape Shape of WordMask for a word, unit, or line of text
text size and shape of the word form, as written using the current font
block rectangle of standardized size, sized to fit in current grid
pill oblong shape
ball circle (standardized size; or, variable, big enough to hold the word)
puzzle block with connectors (like puzzle piece, Lego)
brick nested shape (like WordBricks)
bio living, pulsating, amoeba-like form
free variable, ad hoc shape
--- ---
Margin
edge margin around the word's text shape, measured in pixels or centimeters
fixed size of the word shape is fixed, not responsive to the number of letters
--- ---
Home Where a word appears, relative to its expected location
home The word appears in its expected place, in the sequence of words ("norm")
free Word is free to be moved away from its home/normal location.
roam Word actively roams around in the environment
book The word lives in a book (dictionary/thesaurus), in an alphabetical list
--- ---
Play
play Play audio/video now [spacebar]
pause Pause audio/video now [spacebar]
mute No sound will be heard, even if object is playing
n/a Not applicable (the object has no associated audio/video; e.g., a comment)
--- ---
Star
star marked as prominent, memorable, retrievable; more than 1 object may be starred (possibly marked for future action) (default = FALSE)
--- ---
Dead
dead Doesn’t respond to user or other input, play a role in the game, get displayed on screen, etc. (default = FALSE)
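The idea of a state as a tuple of values, each drawn from its own set of alternatives, can be sketched with enumerations. Only Focus, Play, Star, and Dead are modeled here; Shape, Margin, and Home would follow the same pattern. The class names, the default values for focus and play, and the `on_click` handler are all assumptions made for illustration, not part of Rezonator.

```python
from dataclasses import dataclass, replace
from enum import Enum


class Focus(Enum):
    PLAIN = "plain"
    HOVER = "hover"
    SELECT = "select"
    FOCUS = "focus"
    INERT = "inert"


class Play(Enum):
    PLAY = "play"
    PAUSE = "pause"
    MUTE = "mute"
    NA = "n/a"


@dataclass(frozen=True)
class State:
    focus: Focus = Focus.PLAIN   # assumed default
    play: Play = Play.NA         # assumed default
    star: bool = False
    dead: bool = False


def on_click(state):
    """Return the state after a user click (hypothetical handler)."""
    if state.dead:
        return state  # dead objects do not respond to user input
    return replace(state, focus=Focus.SELECT)
```

Each field holds exactly one value at a time, which mirrors the description of a state variable as a paradigm of mutually exclusive alternatives.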

WordMask

WordMask = masking, hiding, encrypting, or obscuring the visual form of a word, unit, or line of text.
WordMask is especially relevant for game play; also for quizzes/tutorials.
There are 7 different states for WordMask.

State Description
show Word text is visible, readable, and normal in form (default)
hide Word text is invisible
mask Word text is covered by a visible place holder
rune Word text is transliterated into "runes" (mysterious letters in an exotic alphabet, decodable)
crypto Word text is encrypted or randomly substituted with "runes" (mysterious letters)
dark Word text is like dark matter: invisible, but it affects nearby objects (with collisions etc.)
gone Word text has no effect at all on display

MaskUnit = The size of the unit to be masked: word, unit, or line of text.
MaskUnit is especially relevant for game play; also for quizzes/tutorials.
There are 3 different states for MaskUnit.

State Description
word individual words are masked, preserving white space between them
unit each unit as a whole is masked, obscuring the spaces between words (and thus hiding the number of words)
line each line as a whole is masked, obscuring the spaces between words (and thus hiding the number of words)
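The interaction between WordMask and MaskUnit can be illustrated with a small text-only sketch. It handles just three of the seven WordMask states (show, hide, mask) and renders to a string rather than to the screen; the function name `mask_line` and the placeholder character are assumptions.

```python
def mask_line(words, word_mask="mask", mask_unit="word", placeholder="#"):
    """Render a list of words under a given WordMask and MaskUnit.

    word_mask: "show", "hide", or "mask" (other states are not sketched here).
    mask_unit: "word" keeps the spaces between words visible;
               "unit" or "line" covers the spaces as well.
    """
    text = " ".join(words)
    if word_mask == "show":
        return text
    if word_mask == "hide":
        fill = " "
    elif word_mask == "mask":
        fill = placeholder
    else:
        raise ValueError("WordMask state not covered by this sketch: " + word_mask)
    if mask_unit == "word":
        # Mask each word separately, preserving white space between words.
        return " ".join(fill * len(word) for word in words)
    # Unit or line as a whole: the spaces are obscured too.
    return fill * len(text)
```

With MaskUnit set to word, the player can still count the words; with unit or line, the word count is hidden.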

Syntax

Dialogic syntax

Syntax is a big part of what the Rezonator seeks to represent, since it is built on the theory of Dialogic Syntax (Du Bois 2014). To understand dialogic resonance in everyday interaction, Rezonator seeks to represent how speakers use syntax to create the structure of engagement. When two people coordinate their goals, ideas, and stances, they often coordinate their words and structures as well. By producing two parallel syntactic structures, they implicitly invoke a relation between their utterances (dialogic resonance).

Linear syntax

To adequately model dialogic syntax, Rezonator must analyze linear syntax as well. (Linear syntax is what most linguists just call syntax. From the dialogic perspective, it is called linear because it typically involves a single linearized sequence of words, organized according to sentence-level syntactic structures (e.g. dependencies). Linear syntax as such does not typically represent mappings between multiple parallel structures; this is the purview of dialogic syntax.)

Among theories of linear syntax, dependency grammar promises to provide the necessary flexibility. Dependency representations offer the flexibility needed to go beyond linear syntax, to represent the dynamic flux of syntactic relations both within and across sentences.

To represent sentence-level syntactic structure (a.k.a. linear syntax) in the Santa Barbara Corpus, we anticipate using one or more of the following for tagging and dependency parsing:

Graph

The task of modeling language as it is used in naturally occurring dialogic interaction means paying attention to the dynamic structural relations that are constructed in real time by interlocutors. For this, graph theory can be used. Graphs and graph theory are important for creating meaningful representations of both linear syntax (already well developed elsewhere) and dialogic syntax (developed in Rezonator), as well as for higher-level discourse functions including stance alignment, topic structure, rhetorical structure, etc.

In Rezonator, many elements can be thought of as a simple graph. For example, the following are represented mostly by acyclic, directed, non-branching graphs:

  • Unit: A sequence of words in a linguistic unit (intonation unit, clause, etc.)
  • Line: A sequence of words as displayed in a horizontal row on the screen (usually one unit)
  • Sequence: An ordered list of units or lines that make up a discourse
  • Chain: A list of words connected vertically, for example, resonating words in parallel utterances
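As a minimal sketch of what "acyclic, directed, non-branching" means in practice, a Unit can be modeled as a path graph in which each word points to the next one. The function names and the sample words below are hypothetical and are not part of Rezonator itself.

```python
def path_edges(items):
    """Return the directed edges (i, i + 1) linking each item to its successor."""
    return [(i, i + 1) for i in range(len(items) - 1)]


def is_simple_path(n_nodes, edges):
    """Check that edges form a single directed, acyclic, non-branching path."""
    if n_nodes == 0:
        return False
    in_deg = [0] * n_nodes
    out_deg = [0] * n_nodes
    for src, dst in edges:
        out_deg[src] += 1
        in_deg[dst] += 1
    # Non-branching: no node has more than one predecessor or successor.
    if max(in_deg) > 1 or max(out_deg) > 1:
        return False
    # Exactly one node may start the path (no incoming edge).
    starts = [i for i in range(n_nodes) if in_deg[i] == 0]
    if len(starts) != 1:
        return False
    # Walk from the start; an acyclic path visits every node exactly once.
    successor = dict(edges)
    current, visited = starts[0], 1
    while current in successor:
        current = successor[current]
        visited += 1
    return visited == n_nodes


unit = ["you", "know", "what", "I", "mean"]  # hypothetical intonation unit
unit_edges = path_edges(unit)
```

The same check could in principle be applied to a Line, a Sequence, or a Chain, since each is described above as an ordered list of elements.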

For some purposes, it is useful to combine several simple graphs to create a larger, complex graph:

  • Discourse: A Sequence of lines that make up an entire conversation, where each Line contains a Place chain, composed of Words
  • Stack: A set of lines (e.g. on a single topic), where each Line contains a Place chain, composed of Words
  • Clique: A set of Chains whose fate is intertwined, because they share at least one line
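One possible reading of the Clique definition can be sketched as follows: Chains are grouped together whenever they share a line, directly or through an intermediate Chain. Treating the grouping as transitive is an assumption made here for illustration, as are the chain identifiers and line numbers.

```python
from collections import defaultdict


def find_cliques(chains):
    """Group chains that share at least one line, directly or indirectly.

    chains: dict mapping a chain id to the set of line numbers it touches.
    Returns a list of sets of chain ids.
    """
    # Index: which chains touch each line?
    line_to_chains = defaultdict(set)
    for chain_id, lines in chains.items():
        for line in lines:
            line_to_chains[line].add(chain_id)

    seen = set()
    cliques = []
    for start in chains:
        if start in seen:
            continue
        # Depth-first search over the "shares a line" relation.
        group, stack = set(), [start]
        while stack:
            chain_id = stack.pop()
            if chain_id in seen:
                continue
            seen.add(chain_id)
            group.add(chain_id)
            for line in chains[chain_id]:
                stack.extend(line_to_chains[line] - seen)
        cliques.append(group)
    return cliques


# Hypothetical chains: A and B share line 2, B and C share line 3, D stands alone.
example_chains = {"A": {1, 2}, "B": {2, 3}, "C": {3, 5}, "D": {8, 9}}
example_cliques = find_cliques(example_chains)
```

Under this reading, chains A, B, and C end up in one Clique even though A and C never appear on the same line.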

Even the grid (or chess-board) in the main screen can be analyzed as a graph (lattice). Rezonator maps simple and complex graphs (lines, units, chains, etc.) onto the lattice graph.

Graph-related processes are also relevant to understanding the complex dynamics of dialogic resonance. Relevant processes include:

  • graph optimization
  • graph representation learning
  • machine learning on graphs

Rezonator seeks to make the graph structure of both syntax and resonance in discourse explicit, in order to support the application of new machine learning tools for representing and quantifying dialogic resonance.

Graph theory equivalents used in Rezonator
Rezonator categories of Word, Link, and Chunk are functionally equivalent to (and can be translated into) the graph theory categories of Node, Edge, and Plate, respectively.
Word = Node (Vertex)
Link = Edge
Chunk = Plate
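
The translation can be illustrated with a small sketch. The class fields (word_id, source, target, word_ids) and the output layout are hypothetical, chosen only to show the mapping; they do not describe Rezonator's actual data format.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Word:
    word_id: int
    text: str


@dataclass(frozen=True)
class Link:
    source: int  # word_id of the first word
    target: int  # word_id of the second word


@dataclass(frozen=True)
class Chunk:
    label: str
    word_ids: Tuple[int, ...]  # words grouped together by the chunk


def to_graph(words, links, chunks):
    """Translate Words, Links, and Chunks into nodes, edges, and plates."""
    return {
        "nodes": {w.word_id: w.text for w in words},
        "edges": [(link.source, link.target) for link in links],
        "plates": {c.label: list(c.word_ids) for c in chunks},
    }


# Hypothetical data: two words linked to each other and grouped in one chunk.
words = [Word(1, "good"), Word(2, "morning"), Word(3, "everyone")]
links = [Link(1, 2)]
chunks = [Chunk("greeting", (1, 2))]
graph = to_graph(words, links, chunks)
```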

Graph stats

Plate notation

As one alternative notation, consider plate notation. While the graph theory concepts of Node and Edge are familiar, Plate is less so. For a useful introduction, see the Wikipedia article on plate notation, and this blog on implementing plate notation in Python.

Parameters
For coding a visualization of each relevant Rezonator concept, it is useful to consider the parameters/variables used in the following:
Word = Node
Link = Edge
Box = Plate
For the larger context for these ideas (about notation for words, links, plates), see daft-pgm, especially their API and the PGM object.

For a dissenting view, see Stop using Plate Notation. (Apart from the cautions about readability, however, these objections do not necessarily apply to Rezonator's use of Plate concepts as a visualization tool.)

Text Annotation Graphs

To represent the full complexity of syntax in spontaneous discourse, something more is needed than the standard representation of one isolated sentence at a time. To fully represent the syntax in dialogic syntax, we anticipate using the Text Annotation Graph notation developed by Forbes et al. (2018a). The Text Annotation Graph notation is especially interesting for its capacity to represent relationships between relationships. For more information, see:

Text Annotation Graphs (paper)
Text Annotation Graphs (on GitHub)
Sample data in JSON (for Text Annotation Graphs)
Demo code (on GitHub)
Creative Coding Lab (Forbes)
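
The idea of "relationships between relationships" mentioned above can be sketched with a relation type whose endpoints may be either tokens or other relations. This is an illustrative data structure only; it is not the Text Annotation Graphs file format, and all names and sample data are hypothetical.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Token:
    index: int
    text: str


@dataclass(frozen=True)
class Relation:
    label: str
    # An endpoint may be a token or another relation.
    source: Union[Token, "Relation"]
    target: Union[Token, "Relation"]


def depth(item):
    """Nesting depth: 0 for a token, 1 + the deepest endpoint for a relation."""
    if isinstance(item, Token):
        return 0
    return 1 + max(depth(item.source), depth(item.target))


# Two hypothetical parallel utterances: "I like it" / "I love it".
like, it_1 = Token(1, "like"), Token(2, "it")
love, it_2 = Token(4, "love"), Token(5, "it")

obj_1 = Relation("obj", like, it_1)  # dependency in the first utterance
obj_2 = Relation("obj", love, it_2)  # dependency in the second utterance

# A relation whose endpoints are themselves relations.
resonance = Relation("resonance", obj_1, obj_2)
```

Here depth(obj_1) is 1 and depth(resonance) is 2, which is one way to distinguish a relation between words from a relation between relations.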

References

Forbes, Angus G., Kristine Lee, Gus Hahn-Powell, Marco A. Valenzuela-Escárcega, and Mihai Surdeanu. (2018a) "Text Annotation Graphs: Annotating Complex Natural Language Phenomena." Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC'18). May 7-12, 2018, Miyazaki, Japan. Ed. by Sara Goggi and Hélène Mazo. European Language Resources Association (ELRA). arXiv preprint arXiv:1711.00529 (2017).
(The TAG software is available at https://github.com/CreativeCodingLab/TextAnnotationGraphs)

Forbes, Angus G., Andrew Burks, Kristine Lee, Xing Li, Pierre Boutillier, Jean Krivine, and Walter Fontana. (2018b) "Dynamic Influence Networks for Rule-based Models." IEEE Transactions on Visualization & Computer Graphics 24(1): 184-194. January, 2018. doi: 10.1109/TVCG.2017.2745280 https://arxiv.org/pdf/1711.00967.pdf

Dynamic Influence Networks

Taking it to the next level, we seek to model the dynamics of dialogic resonance in conversational interaction. One way to do this is through analysis of the causal influence of rules on rules, as when the application of a resonance rule in dialogic syntax influences the application of a rule of linear syntax (e.g. a construction in construction grammar), and vice versa. Another example is the dynamic interaction of lexical priming and syntactic priming. For such interactions, a promising model is the visualization of Dynamic Influence Networks (Forbes et al. 2018b), which has been successfully applied to complex biological systems.

Skip-thoughts

We are also interested in the potential for integrating skip-thought vectors into Rezonator.