2. Fundamentals
Data Structures
Word
Unit
Prosodic Sentence
Side
Discourse
Link
Chain
Tier
Gap
Move
State
Syntax
Dialogic syntax
Linear syntax
Graph
Graph stats
Text Annotation Graphs
Rezonator is founded on several basic principles, the most important of which is resonance. Resonance is a relationship between elements, not a property of the elements themselves. As such it must be analyzed as it arises, in the context of language use. Resonance can be static or dynamic; predictable from the properties of the language or context, or improvised ad hoc in the dialogic moment. Rezonator seeks to represent all kinds of resonance: whatever is perceivable as resonance by the participants themselves. Resonance relationships may arise along any dimension of language; indeed, along multiple dimensions at once. Analytical tools with maximum flexibility are needed for representing the full complexity of the array of resonant relationships. Among the most fundamental tools for creating a markup of resonance in Rezonator are the link and the chain, described in the following sections.
The following data structures are intended for importing spoken discourse data into Rezonator (and for export as well). They are for processing spoken discourse materials transcribed according to the conventions of Discourse Functional Transcription (DFT), such as the Santa Barbara Corpus. Input data may come from XML files, plain text, etc. The data structures are intended to be stored in tabular formats for use with Rezonator (such as CSV files, Pandas dataframes, etc.).
A key feature of the prosody-grammar analysis used here is the "place value" of each event in the discourse. This measures the location of an event (e.g. word, laugh, vocalism, pause) relative to the prosodic landmarks defined by the boundaries of the intonation unit. Thus for a given word, it may prove useful to know its location relative to the beginning of the intonation unit. By the same token, it may be equally valuable to know its location relative to the end of the intonation unit.
This table is designed to represent the key features of each word in the discourse. This table is intended for importing and exporting data, so the same structure (more or less) should be used for files of the type word.csv.
Note that the so-called "word" table actually includes all tokens. As used here, "token" means roughly anything in the transcription that is bounded by whitespace. Thus, the so-called "word" field includes not only all real words, but also pause, laughter, breath, grunts (= vocalisms), transcriber comments, endNote (prosodic closure, marked at the end of the intonation unit by comma, period, question mark, etc.), and so on.
In addition to the usual focus on identifying the grammatical features of a word, such as its "class" (=part of speech), there is detailed attention to timing and prosody. Information about word timing will be incorporated in due course; for the present, the only information available is the start time and end time for each intonation unit (as measured in seconds, from the beginning of the conversation). (For pauses, lenSec is the length of the pause in seconds.)
For spoken discourse, the nature of events and their timing are both critical. Indeed, events and time are intertwined. An event is anything that takes place in time: it has a start time, an end time, and a (non-zero) duration. From a prosodic perspective it is important to pay attention to all vocal and gestural events, including words, laughs, breaths, pauses, and so on. One way to evaluate the timing of words (and other discourse events) is to measure where they occur relative to a critical landmark: in this case, relative to the start and end boundaries of the intonation unit. We can call this the "place" value of a word or an event. There are at least three ways of representing place value for events in the intonation unit.
- Place. The first method counts words, assigning an integer value (1,2,3...) to each word as its "Place" value. That is, starting from the first word of the intonation unit (the "left edge"), an integer value is assigned for each word. The "place value" of the first word of the Unit is "1", the place value of the second word is "2", etc.
- Back. The second method treats the end of the intonation unit (the "right edge") as the landmark, and measures distance from there. It counts off the place value for each word relative to the last word of the intonation unit, in negative integers: "-1" for the last word of the intonation unit, "-2" for the second to last word, and so on.
- Order. The most general method counts all events, assigning an integer value (1,2,3...) to each token as its "Order" value. That is, starting from the beginning of the intonation unit (the "left edge"), an integer value is assigned for each token (whether a word, breath, laugh, pause, etc), beginning with "1".
When combined with the two index values for the discourse and the intonation unit, "Order" yields a uniquely identifiable index for each token in the corpus, including each word. This index is a three-part combination of the fields "discoID", "uID", and "Order".
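The three place values can be computed in a single pass over the tokens of one intonation unit. The following is a minimal sketch, not Rezonator's own code: the function name `place_values` is hypothetical, and it assumes that each token carries a `kind` value as in the Word table and that only tokens of kind "word" receive `place` and `back` values.

```python
from typing import Dict, List


def place_values(kinds: List[str]) -> List[Dict[str, object]]:
    """Assign order, place, and back to the tokens of one intonation unit.

    `kinds` holds the `kind` value of each token, in transcription order.
    """
    word_total = sum(1 for kind in kinds if kind == "word")
    rows = []
    words_seen = 0
    for order, kind in enumerate(kinds, start=1):  # order counts every token
        if kind == "word":
            words_seen += 1
            place = words_seen                   # 1, 2, 3... from the left edge
            back = words_seen - word_total - 1   # ...-3, -2, -1 to the right edge
        else:
            place = None  # assumption: non-word tokens get no place/back value
            back = None
        rows.append({"kind": kind, "order": order, "place": place, "back": back})
    return rows


# Invented unit: a pause, two words, a laugh, a final word, and the endnote.
rows = place_values(["pause", "word", "word", "laugh", "word", "endnote"])
for row in rows:
    print(row)
```

Combining each row's `order` with the `discoID` and `uID` of its unit gives the three-part token index described above.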
BILUO scheme. An alternative notation for annotating the relation of tokens to units is the BILUO scheme (Ratinov and Roth 2009). This shows where the token appears within the unit currently being coded. It is useful for machine learning, and is used by many NLP tools, including spaCy. It can be used to annotate the Word table for several different levels of units or features (e.g. intonation unit, phrase, prosodic sentence, overlap, vox, etc.). BILUO values are written in CAPS for greater distinctiveness and legibility (when reading down a column of BILUO entries).
Tag | Meaning | Description |
---|---|---|
B | begin | The first token of a multi-token unit |
I | in | An inner token of a multi-token unit |
L | last | The final token of a multi-token unit |
U | unit | A single-token unit |
O | out | A token that is not a unit of the type being annotated |
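As an illustration of the table above, BILUO tags can be derived from the token spans of the units being annotated. This is a minimal sketch under stated assumptions: the function `biluo_tags` is hypothetical, spans are given as 0-based inclusive token indices, and spans do not overlap.

```python
from typing import List, Tuple


def biluo_tags(n_tokens: int, spans: List[Tuple[int, int]]) -> List[str]:
    """Return one BILUO tag per token.

    `spans` lists the (start, end) token indices of each unit of the type
    being annotated (0-based, inclusive, assumed not to overlap).
    """
    tags = ["O"] * n_tokens          # tokens outside every span stay "O"
    for start, end in spans:
        if start == end:
            tags[start] = "U"        # single-token unit
        else:
            tags[start] = "B"        # first token of a multi-token unit
            tags[end] = "L"          # final token of a multi-token unit
            for i in range(start + 1, end):
                tags[i] = "I"        # inner tokens
    return tags


# Seven tokens, with one three-token unit and one single-token unit.
print(biluo_tags(7, [(1, 3), (5, 5)]))
```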
Sort order. The default sort order for the Word table is based on 3 fields: discoID, uID, wID. (A more or less equivalent sort is discoID, unitStartTime, order; in place of order, the place or the startTime of the word can also be used.) But to identify the prosodic sentences (e.g. pSentID), it is necessary to use a special sort, sorting by "Side": discoID, pID, uID, wID. (Or: discoID, pID, unitStartTime, place.)
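The two sort orders can be expressed as tuple sort keys. The rows below are invented for illustration and carry only the fields needed for sorting; this is a sketch, not a prescribed implementation.

```python
# Invented Word-table rows (only the fields needed for sorting are shown).
words = [
    {"discoID": "SBC001", "pID": 1, "uID": 3, "wID": 4, "word": "right"},
    {"discoID": "SBC001", "pID": 2, "uID": 2, "wID": 3, "word": "yeah"},
    {"discoID": "SBC001", "pID": 1, "uID": 1, "wID": 2, "word": "anyway"},
    {"discoID": "SBC001", "pID": 1, "uID": 1, "wID": 1, "word": "so"},
]

# Default sort: tokens in the order they occur in the conversation.
default_sort = sorted(words, key=lambda r: (r["discoID"], r["uID"], r["wID"]))

# Side sort: all tokens of one participant first, then the next participant.
side_sort = sorted(
    words, key=lambda r: (r["discoID"], r["pID"], r["uID"], r["wID"])
)

print([r["word"] for r in default_sort])
print([r["word"] for r in side_sort])
```

In the side sort, the units of a single speaker become adjacent, which is what makes it possible to concatenate them into prosodic sentences.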
Word grid structure
Field | Description |
---|---|
discoID | Index value for each discourse; a unique value for each conversation in the corpus (e.g. SBC001, SBC002, SBC003... in the Santa Barbara Corpus) |
uID | Index value of the intonation unit that the current word belongs to. (Intonation units are numbered 1,2,3... per each conversation.) (NOTE: This is a corpus value, with numbering restarting at 1 for each conversation. This is different from the unitID generated internally by Rezonator.) |
wID | Index of the current token. (Tokens, including words, breaths, pauses, etc., are numbered 1,2,3... per each conversation.) (NOTE: This is a corpus value, with numbering of tokens restarting at 1 for each conversation. This is different from the wordID generated internally by Rezonator.) |
pID | 1, 2, 3... (participantID) |
speaker | name of speaker, participant, or agent |
word | word form ("clean" spelling, as shown in the "w" field of the SBC XML file) |
text | detailed transcription of the word (as shown in the "dt" field of the SBC XML file) |
lemma | lemma of the word |
kind | {word, pause, breath, laugh, grunt, action, endnote, prosody, tag, comment, unknown, other} |
pos | part of speech {noun, verb, adjective, adverb, conjunction, interjection} |
tag | {NN [noun], JJ [adjective], ...} (using tags as in spaCy [cf. Penn TreeBank]) |
clitic | FALSE (TRUE if the word is a contraction) |
morph | morpheme analysis: prefix, suffix |
gloss | translation of word into contact language, or gloss according to Leipzig Glossing Conventions |
phones | phonemic transcription (automatic) in the International Phonetic Alphabet (IPA, Unicode) |
break | break word (truncated word, cut-off word) |
closure | {open, close, break} (Based on endNoteChar: open = "," close = "." close = "?" break = "--" ) |
pause | Is this token a pause? (1 if it's a pause, 0 if it's anything else) |
microPause | Is this token a microPause? (1 if it's a microPause, 0 if it's anything else) |
overlap | BILUO scheme (to determine BILUO, check if text contains "[" or "]" or "|" etc.) |
quality | BILUO scheme (applies to quality/manner notation with etc.) |
qualityOffsets | count off the characters in the word |
startTime | start time for the word (or other token) in seconds |
endTime | end time for the word (or other token) in seconds |
order | Index value for each token in the current intonation unit, counting up from the first token (1, 2, 3...) |
place | Index value for each word in the current intonation unit, counting up from the first word (1, 2, 3...), and ignoring anything that is not a word. |
back | Reverse index value for each word in the current intonation unit, counting down (backwards) from the last word of the current intonation unit (-1,-2,-3), and ignoring anything that is not a word. |
pSent | prosodic sentence ID: integer identifier (for each conversation, restart numbering at 1) |
sent | syntactic sentence ID: integer identifier (for each conversation, restart numbering at 1) |
turn | turn ID: integer identifier (for each conversation, restart numbering at 1) |
lenSec | length of the word (or other token) in seconds |
wordTempo | ratio of token duration to average for this word type |
unitStartTime | start time for the intonation unit |
unitEndTime | end time for the intonation unit |
unitLenWords | count of total words (not tokens) in the current unit |
This table is designed to represent the key features of each unit (e.g. intonation unit) in the discourse. This table is intended for importing and exporting data, so it should match fairly closely to the structure of the file unit.csv.
The pSentID represents an integer index value that is generated in order to identify which prosodic sentence the current intonation unit belongs to (for details, see the discussion of the Prosodic Sentence below.)
Field | Description |
---|---|
discoID | Index value for each discourse; a unique value for each conversation in the corpus (e.g. SBC001, SBC002, SBC003... in the Santa Barbara Corpus) |
uID | Index value of the current intonation unit. (Intonation units are numbered 1,2,3... per each conversation.) (NOTE: This is a corpus value, with numbering restarting at 1 for each conversation. This is different from the unitID generated internally by Rezonator.) |
pID | 1, 2, 3... (participantID) |
speaker | name of speaker, participant, or agent |
startTime | start time for the unit in seconds |
endTime | end time for the unit in seconds |
startNote | na |
endNoteChar | {"," , "." , "--" , "?" } |
closure | {open, close, break} [open = "," close = "." close = "?" break = "--" ] |
lenSec | length of the unit in seconds |
lenWords | length of the unit in words |
lenTokens | length of the unit in tokens |
pauseCount | count of pauses in this unit (sum of values in Word table) |
microPauseCount | count of micropauses in this unit (sum of values in Word table) |
pSentID | integer identifier indicating which prosodic sentence the current intonation unit belongs to |
gapSec | gap between this unit and the immediate previous unit, uttered by any speaker, in seconds |
pauseSec | pause (silence) that is internal to the current unit, i.e. silent pauses (sum in seconds) |
words | "clean" version of the text of the intonation unit, containing the words in sequence, but not prosody, manner, etc. (Word truncation and inaudible words should appear, however.) |
text | text of the intonation unit as it appears in the original transcription (including words, pauses, lag, overlap, prosody, etc.) |
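Several Unit fields are aggregates over the Word table. The following sketch shows how the counts could be derived for one intonation unit; the function `summarize_unit` and the sample rows are hypothetical, and it assumes that `pause` and `microPause` are stored as 0/1 values as described in the Word table.

```python
from typing import Dict, Iterable, List


def summarize_unit(word_rows: Iterable[Dict]) -> Dict[str, int]:
    """Derive the count fields of one Unit row from its Word-table rows."""
    tokens: List[Dict] = list(word_rows)
    return {
        "lenTokens": len(tokens),
        "lenWords": sum(1 for t in tokens if t["kind"] == "word"),
        "pauseCount": sum(t["pause"] for t in tokens),
        "microPauseCount": sum(t["microPause"] for t in tokens),
    }


# Invented rows for a single intonation unit.
unit_rows = [
    {"kind": "pause", "pause": 1, "microPause": 0},
    {"kind": "word", "pause": 0, "microPause": 0},
    {"kind": "word", "pause": 0, "microPause": 0},
    {"kind": "endnote", "pause": 0, "microPause": 0},
]
print(summarize_unit(unit_rows))
```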
This table is designed to represent the key features of each prosodic sentence in the discourse. This table is designed for importing and exporting data, so it should match the structure of the file pSent.csv. The prosodic sentence identifier ("pSentID") represents an integer value that is generated in order to uniquely identify each prosodic sentence.
Implicit in the organization of this data structure (and of the unit structure as well) is a strategy for identifying prosodic sentences. Prosodic sentences are constructed by concatenating one or more intonation units, based on their EndNote (open, close, etc. as denoted by comma, period, etc.). One reason for identifying prosodic sentences, beyond their own intrinsic interest, is to improve the tagging and parsing of the data by providing a larger context beyond just the intonation unit (which often represents just a fragment of a clause).
Prosodic sentences are derived from the existing DFT transcriptions, based on the closure status of the endnote (cf. transitional continuity or endtone) of each intonation unit. Intonation units that are "open" (marked by comma) are concatenated, until a "close" intonation unit is reached; this becomes the final intonation unit in the prosodic sentence. The intonation units that make up a given prosodic sentence will be assigned the same pSentID. (That is, the pSentID number is continued for one or several intonation units, as long as each unit's endNote value is "open", and until a unit is encountered with an endNote value of "close".)
The most effective way to construct prosodic sentences automatically (i.e. barring a return to the audio data) is to process the Word table, which contains the relevant values for EndNote (a.k.a. "end tone" or "transitional continuity" [TC]). An algorithm that processes the EndNote value allows the assignment of each intonation unit (in the Unit table) to a unique prosodic sentence. Each time a prosodic sentence is identified, an integer ID value (pSentID) is generated for it. This pSentID is written first into the Unit table. Starting from the enriched Unit table, we use this to construct the prosodic sentence table. We can then select all the elements (words, etc.) that belong to a given prosodic sentence.
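The assignment of pSentID values can be sketched as follows. This is one possible reading of the procedure, not the actual Rezonator algorithm: the function `assign_psent_ids` is hypothetical, it works from unit-level rows that already carry `endNoteChar`, and it tracks one unfinished prosodic sentence per participant so that interleaved speakers do not interrupt each other's sentences. How a "break" unit should be treated is not specified above, so the set of closures that end a prosodic sentence is left as a parameter (default: only "close").

```python
from typing import Dict, FrozenSet, Iterable, List

# Mapping from endNoteChar to closure, as given in the Unit table.
CLOSURE_BY_END_NOTE = {",": "open", ".": "close", "?": "close", "--": "break"}


def assign_psent_ids(
    units: Iterable[Dict],
    ending: FrozenSet[str] = frozenset({"close"}),
) -> List[Dict]:
    """Add `closure` and `pSentID` to the units of ONE conversation."""
    next_id = 1                           # numbering restarts per conversation
    open_sentence: Dict[int, int] = {}    # pID -> pSentID still in progress
    result = []
    for unit in sorted(units, key=lambda u: u["uID"]):
        pid = unit["pID"]
        if pid not in open_sentence:      # this speaker starts a new sentence
            open_sentence[pid] = next_id
            next_id += 1
        psent_id = open_sentence[pid]
        closure = CLOSURE_BY_END_NOTE.get(unit["endNoteChar"])
        if closure in ending:             # sentence is complete
            del open_sentence[pid]
        result.append({**unit, "closure": closure, "pSentID": psent_id})
    return result


# Invented units from two interleaved speakers.
units = [
    {"uID": 1, "pID": 1, "endNoteChar": ","},
    {"uID": 2, "pID": 2, "endNoteChar": "."},
    {"uID": 3, "pID": 1, "endNoteChar": "."},
    {"uID": 4, "pID": 1, "endNoteChar": "?"},
    {"uID": 5, "pID": 2, "endNoteChar": ","},
    {"uID": 6, "pID": 2, "endNoteChar": "--"},
]
enriched = assign_psent_ids(units)
print([u["pSentID"] for u in enriched])
```

Grouping the enriched rows by `pSentID` then yields the rows of the prosodic sentence table.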
To provide evidence supporting the location (and strength) of each prosodic sentence boundary, it is also useful to collect information about the gaps (if any) that precede and follow a given prosodic sentence. The Gap concerns the "empty" space between same-side prosodic sentences: that is, the gap between the current prosodic sentence and the next prosodic sentence in the same side (by the same speaker). The gap between one same-side prosodic sentence and the next may consist of a few seconds of silence, or words spoken by a different speaker. Either way, it can be measured, whether in seconds or words. Collecting data on gaps has the potential to yield important insights about the production and processing of intonation units, prosodic sentences, and turns.
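A same-side gap in seconds can be measured by remembering, for each participant, when their previous prosodic sentence ended. The sketch below is illustrative only: the function `same_side_gaps` and the output field `sameSideGapSec` are hypothetical names (chosen to keep this measure distinct from any gap measured against the previous sentence by any speaker), and the times are invented.

```python
from typing import Dict, Iterable, List


def same_side_gaps(psents: Iterable[Dict]) -> List[Dict]:
    """Add the gap (in seconds) since the same speaker's previous sentence."""
    last_end: Dict[int, float] = {}   # pID -> endTime of previous sentence
    out = []
    for ps in sorted(psents, key=lambda p: p["startTime"]):
        prev_end = last_end.get(ps["pID"])
        # The first sentence of a side has no preceding same-side sentence.
        gap = None if prev_end is None else ps["startTime"] - prev_end
        out.append({**ps, "sameSideGapSec": gap})
        last_end[ps["pID"]] = ps["endTime"]
    return out


# Invented prosodic sentences from two speakers.
psents = [
    {"pSentID": 1, "pID": 1, "startTime": 0.0, "endTime": 2.5},
    {"pSentID": 2, "pID": 2, "startTime": 2.75, "endTime": 4.0},
    {"pSentID": 3, "pID": 1, "startTime": 4.5, "endTime": 6.0},
    {"pSentID": 4, "pID": 2, "startTime": 6.25, "endTime": 7.0},
]
for row in same_side_gaps(psents):
    print(row["pSentID"], row["sameSideGapSec"])
```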
The default sort order for the prosodic sentence table should be (1) discoID (2) startTime. Sorting by the start time will put the prosodic sentences in the temporal sequence in which they were uttered in the conversation. (Another useful sort is by participant or side, i.e. by pID.)
Field | Description |
---|---|
discoID | Index value for each discourse; a unique value for each conversation in the corpus (e.g. SBC001, SBC002, SBC003... in the Santa Barbara Corpus) |
pID | participantID |
speaker | name of speaker, participant, or agent |
pSentID | prosodic sentence ID: integer identifier (for each conversation, restart numbering at 1) |
startTime | start time of prosodic sentence |
endTime | end time of prosodic sentence |
lastEndNote | endNote (comma, period, question mark, double hyphen) of the last intonation unit in the prosodic sentence |
lastClosure | closure (open, close, break) of the last intonation unit in the prosodic sentence |
pauseCount | count of pauses in this prosodic sentence (sum of values in Word or Unit table) |
microPauseCount | count of micropauses in this prosodic sentence (sum of values in Word or Unit table) |
lenSec | length of the prosodic sentence in seconds |
lenWords | length of the prosodic sentence in words |
lenUnits | length of the prosodic sentence in units (e.g. intonation units) |
gapSec | gap between the current and immediate prior prosodic sentence by any speaker, in seconds |
gapWords | gap between the current and immediate prior prosodic sentence by any speaker, in words (spoken by other speakers) |
pauseSec | total pauses (silence) that are internal to the current prosodic sentence, i.e. silent pauses (sum in seconds) |
words | "clean" version of the text of the prosodic sentence, containing the words from each intonation unit in the prosodic sentence, plus the unit-final prosody for each intonation unit (that is, the "endNote", e.g. comma, period, question mark, or double hyphen [for IU truncation]). (Do not include "not applicable" [n/a or nan, etc.] unless there are no actual words in the whole prosodic sentence.) |
This table is designed to represent the key features of each side in the conversation (that is, the set of all utterances produced by a given speaker, but only by that speaker, and only in that conversation).
Field | Description |
---|---|
discoID | Index value for each discourse; a unique value for each conversation in the corpus (e.g. SBC001, SBC002, SBC003... in the Santa Barbara Corpus) |
sideID | unique ID for each side of the conversation (for each conversation, restart numbering at 1) |
pID | participantID |
speaker | name of speaker, participant, or agent for this side |
lenSec | length of the side in seconds |
lenWords | length of the side in words |
This table is designed for annotating important details about the discourse (e.g. conversation, business meeting, lecture, etc.)
Field | Description |
---|---|
discoID | Index value for each discourse; a unique value for each conversation in the corpus (e.g. SBC001, SBC002, SBC003... in the Santa Barbara Corpus) |
title | [Descriptive title of this discourse] |
corpus | "Santa Barbara Corpus of Spoken American English" |
corpusShort | "SBC" |
[misc] | (Various details about the speech event and recording, in several fields) |
lenSec | length of the discourse in seconds |
lenWords | length of the discourse in words |
The concept of a link represents a powerful tool for showing relationships of all kinds in Rezonator. Links include traditional relationships between two words, such as:
- coreference between a referent and its antecedent
- resonance between a word in one sentence (target) and a prior word (base) in a parallel sentence
- syntactic dependency between a head and its dependent (that is, a head and its modifier)
- linear sequence between the first word and the second word of a clause
The key to the Link grid is the relationship between source (first node), goal (second node), and role (the quality of the relationship between source and goal, which labels the link or edge).
Crucially, Links inherit their Tier from the Chain they belong to (along with their Plane and SubTier; see details under Chain).
Name | Description |
---|---|
linkID | unique integer identifier for each link in the database (independent of the type of link, chain, tier, etc.) |
chainID | unique integer identifier specifying which chain this link belongs to (unique for each chain in the database) |
source | node from which the link begins (e.g. head word) (by default, the first word the user clicks on) |
goal | node to which the link connects (e.g. dependent word, modifier) (by default, the second word the user clicks on) |
role | relation between source and goal (e.g. syntactic dependency type, thematic role, etc.) |
tier | type of link |
tags | list of features or qualities, which are attributes of the link and/or relation |
arrow | the link is directed from source (first) to goal (second) (default) |
invert | arrow direction is the reverse of the default, from goal (second) to source (first) (default: invert = FALSE) |
dead | current state of the link, dead or live (default: dead = FALSE). Dead links are created when the user deletes a link (but Rezonator retains a record of it, in order to keep clickTime information, etc.) |
auto | current state of the link, automatic or user-generated (default: auto = FALSE). (auto = link generated by Rezonator, as part of re-sorting of links in a chain) |
sourceClickTime | timestamp when user began marking the link (e.g. mouseDown on source word) |
goalClickTime | timestamp when user completed marking the link (e.g. mouseUp on goal word) |
roleTime | timestamp when user specified role/relation |
A chain is built up from a set of links, which may be directed ("arrow") or undirected. Each chain is identified by a unique chainID.
Several pairs of links can be combined to form a series. For example, word 7 links to word 11, then 11 links to 14, 14 links to 23, and so on. When several such links are made in succession, the result is a chain. Each chain gets its own unique identifier (chainID, an index value). For the most part, the chain takes its character from the links that make it up, that is, from the linked elements it contains. That is, the raison d'etre of a chain is the words (or other elements) that are linked together to form the chain. In addition, the chain may have some properties of its own (such as chainType or Tier). Chains can be sorted according to various properties of the elements in the chain (e.g. the startTime of the words in the chain).
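The example above (7 to 11, 11 to 14, 14 to 23) can be sketched as a grouping of Link rows by chainID. This is a minimal illustration, assuming that links are processed in linkID order and that dead links are skipped; the function `chain_members` is hypothetical, and the link rows are invented.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def chain_members(links: Iterable[Dict]) -> Dict[int, List[int]]:
    """Collect the linked nodes of each chain, in the order they were linked."""
    members: Dict[int, List[int]] = defaultdict(list)
    for link in sorted(links, key=lambda l: l["linkID"]):
        if link.get("dead", False):   # deleted links are kept only as records
            continue
        for node in (link["source"], link["goal"]):
            if node not in members[link["chainID"]]:
                members[link["chainID"]].append(node)
    return dict(members)


links = [
    {"linkID": 1, "chainID": 1, "source": 7, "goal": 11},
    {"linkID": 2, "chainID": 1, "source": 11, "goal": 14},
    {"linkID": 3, "chainID": 1, "source": 14, "goal": 23},
    {"linkID": 4, "chainID": 2, "source": 9, "goal": 16},
    {"linkID": 5, "chainID": 2, "source": 16, "goal": 30, "dead": True},
]
print(chain_members(links))
```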
Links and chains in Rezonator can be thought of in terms of graph theory concepts.
- Links are edges (or arcs), which specify the relationship between two nodes (or vertices, or points, representing e.g. words).
- Links can define an "arrow" which is "directed" (pointing from source to goal) or "undirected" (symmetric or bidirectional).
For more information, see Graphs.
Chains can be sorted (up or down) according to the properties of the elements they contain. For example, the words in a chain may be sorted by the startTime for each word, by alphabetical order, by length (in words, letters, or phonemes), by the user's clickTime, and so on. In addition, the links in a chain may be sorted according to qualities of the links themselves (link type, role).
The Chain grid contains relatively little information of its own, since most information is contained in the Links, or in the linked elements themselves (e.g. words).
Name | Description |
---|---|
chainID | unique integer identifier for the current chain (sort by chainID, then linkID) |
tier | subcategory of planes: semantic rez, pragmatic rez, phonetic rez, etc. (Tiers of the Rez plane) |
subTier | subcategory of tiers: rhyme, alliteration (subTiers of the Phonetic tier) |
plane | general type of chain: rez, track, stack... |
tags | list of tags representing features or qualities of the chain |
dead | current state of the chain, dead or live (default: dead = FALSE). Dead chains are created when the user deletes all the links (but Rezonator retains a record of it, in order to keep clickTime information, etc.) |
sortOrder | norm, down, up |
Chains may be assigned to Tiers. Different Chain types (since Chains may be of any kind) may be interpreted hierarchically or non-hierarchically. For example, one version of a hierarchical structure would recognize Planes (Rez, Track, Stack...) as the highest level. Planes can in turn be subcategorized into Tiers and subTiers. Alternatively, the relationship between tiers can be more ad hoc.
This section needs to be developed.
For every link, there is (potentially) a gap, which has interesting properties of its own. For example, when a resonance link is recognized between a word in line 7 and a prior word in line 4, there is a gap: a set of words that occurs between them. This set of words may have properties which shed light on the link itself (e.g. more words in the gap may imply greater demands on memory, and hence lower accessibility for the second word of the link). In Rezonator, the concept of gap is always tied to an explicit link: we only speak of a gap in relation to a link that has already been specified. Because every gap implies a unique link, we can identify any gap using its corresponding linkID.
Gaps occur between the two elements in a link. The gap is defined by its link, and thus inherits the linkID. The gap may have interesting properties, such as how long it is (measured in words, seconds, intonation units, etc.), and what words are included within it (e.g. pronouns of the same gender as the linked pronouns, which may compete with them for anaphora).
Name | Description |
---|---|
linkID | Each entry in the gapGrid corresponds to a unique link, with its own linkID |
gapListWords | list of words (if any) occurring in the gap between source and goal (denoted by wordID) |
gapListTokens | same, but for all tokens (words, pause, laugh, grunt...) (denoted by wordID) |
gapWordCount | number of words in the gap between source and goal (integer value) |
gapTokenCount | number of tokens (words, pause...) in gap between source and goal (integer value) |
gapUnitCount | number of intonation unit boundaries in the gap (integer value) |
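The gap fields can be computed from the tokens that lie strictly between the two ends of a link. The sketch below makes several assumptions: the function `gap_for_link` is hypothetical, the corpus `wID` stands in for Rezonator's internal wordID, `wID` values follow discourse order, and the number of unit boundaries is taken as the difference between the `uID` values of the two linked tokens.

```python
from typing import Dict, List


def gap_for_link(tokens: List[Dict], source_wid: int, goal_wid: int) -> Dict:
    """Compute the gap fields for one link between two tokens."""
    by_wid = {t["wID"]: t for t in tokens}
    first, last = sorted((source_wid, goal_wid))
    # Tokens strictly between the two linked tokens, in discourse order.
    between = sorted(
        (t for t in tokens if first < t["wID"] < last), key=lambda t: t["wID"]
    )
    words = [t["wID"] for t in between if t["kind"] == "word"]
    return {
        "gapListWords": words,
        "gapListTokens": [t["wID"] for t in between],
        "gapWordCount": len(words),
        "gapTokenCount": len(between),
        "gapUnitCount": abs(by_wid[last]["uID"] - by_wid[first]["uID"]),
    }


# Invented tokens spanning three intonation units.
tokens = [
    {"wID": 1, "uID": 1, "kind": "word"},
    {"wID": 2, "uID": 1, "kind": "word"},
    {"wID": 3, "uID": 1, "kind": "endnote"},
    {"wID": 4, "uID": 2, "kind": "pause"},
    {"wID": 5, "uID": 2, "kind": "word"},
    {"wID": 6, "uID": 2, "kind": "endnote"},
    {"wID": 7, "uID": 3, "kind": "word"},
    {"wID": 8, "uID": 3, "kind": "word"},
]
print(gap_for_link(tokens, source_wid=2, goal_wid=7))
```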
The functions that are needed by Rezonator users (players, researchers, analysts, and others) may be thought of in terms of Moves, Modes, States, and Tiers.
- A move is an action performed by a user (player, researcher, analyst).
- A mode is a state of the Rezonator, corresponding to an activity that is relevant to the user's current situation, relative to the ongoing game or research activity, etc.
- A state is a value of an object (e.g. cursor, word, link, chain, pane, etc.) that reflects the currently relevant status of the object and its associated activity or context. As such, a key role of any state is that it defines a set of currently available options for the user.
- A tier is a set of related items of information, representing a certain layer of information, and containing related types of values. Tiers are of two basic types: entity tiers, and link tiers.
State is a tuple that defines the currently activated state of a given word, line, stack, or other object, along several key dimensions. The state of an object changes dynamically based on user actions, and is frequently updated in the display (for example, by highlighting a word that has been selected by the user). States are sensitive to user input (such as hovering vs. clicking on an object) and to the current mode, environment, user activity, or game state.
It is useful to think of each state variable as defining a distinct set of alternatives (a paradigm to choose from). For example, at any given moment, a word object will have exactly one value for its state of Focus: it is Focused, Inactive, etc. In addition, the same object will have a value for Play, saying whether it should currently be played, muted, etc. Other State parameters include Shape (the shape of the object as displayed on the screen) and Place (where a word appears, relative to its usual location in the sequence of words).
State | Description |
---|---|
Focus | Activation of a word, unit, or line of text in the current context |
plain | Standard format: visible, normal, neutral appearance |
hover | Responds to proximity of user's cursor, when hovering over item |
select | unique focus of user's current attention (as when user clicks on a word to select it) |
focus | secondary focus, for elements that accompany the Select element |
inert | Inactive, grayed out |
--- | --- |
Shape | Shape of WordMask for a word, unit, or line of text |
text | size and shape of the word form, as written using the current font |
block | rectangle of standardized size, sized to fit in current grid |
pill | oblong shape |
ball | circle (standardized size; or, variable, big enough to hold the word) |
puzzle | block with connectors (like puzzle piece, Lego) |
brick | nested shape (like WordBricks) |
bio | living, pulsating, amoeba-like form |
free | variable, ad hoc shape |
--- | --- |
Margin | |
edge | margin around the word's text shape, measured in pixels or centimeters |
fixed | size of the word shape is fixed, not responsive to the number of letters |
--- | --- |
Home | Where a word appears, relative to its expected location |
home | The word appears in its expected place, in the sequence of words ("norm") |
free | Word is free to be moved away from its home/normal location. |
roam | Word actively roams around in the environment |
book | The word lives in a book (dictionary/thesaurus), in an alphabetical list |
--- | --- |
Play | |
play | Play audio/video now [spacebar] |
pause | Pause audio/video now [spacebar] |
mute | No sound will be heard, even if object is playing |
n/a | Not applicable (the object has no associated audio/video; e.g., a comment) |
--- | --- |
Star | |
star | marked as prominent, memorable, retrievable; more than 1 object may be starred (possibly marked for future action) (default = FALSE) |
--- | --- |
Dead | |
dead | Doesn’t respond to user or other input, play a role in the game, get displayed on screen, etc. (default = FALSE) |
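Since a state is described as a tuple with exactly one value per dimension, it can be sketched as a named tuple of enumerations. The class names, the subset of dimensions shown, and the defaults for `focus`, `home`, and `play` are assumptions made for illustration; only the `star` and `dead` defaults come from the table above.

```python
from enum import Enum
from typing import NamedTuple


class Focus(Enum):
    PLAIN = "plain"
    HOVER = "hover"
    SELECT = "select"
    FOCUS = "focus"
    INERT = "inert"


class Home(Enum):
    HOME = "home"
    FREE = "free"
    ROAM = "roam"
    BOOK = "book"


class Play(Enum):
    PLAY = "play"
    PAUSE = "pause"
    MUTE = "mute"
    NA = "n/a"


class ObjectState(NamedTuple):
    """One value per state dimension (only some dimensions are shown)."""
    focus: Focus = Focus.PLAIN   # assumed default
    home: Home = Home.HOME       # assumed default
    play: Play = Play.NA         # assumed default
    star: bool = False           # default given in the table
    dead: bool = False           # default given in the table


word_state = ObjectState()
# A user action yields a new tuple; the other dimensions are unchanged.
hovered = word_state._replace(focus=Focus.HOVER)
print(hovered)
```

Because each dimension is an enumeration, an object can hold only one value per dimension at a time, which matches the idea of a paradigm to choose from.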
WordMask = masking, hiding, encrypting, or obscuring the visual form of a word, unit, or line of text. WordMask is especially relevant for game play; also for quizzes/tutorials.
There are 7 different states for WordMask.
State | Description |
---|---|
show | Word text is visible, readable, and normal in form (default) |
hide | Word text is invisible |
mask | Word text is covered by a visible place holder |
rune | Word text is transliterated into "runes" (mysterious letters in an exotic alphabet, decodable) |
crypto | Word text is encrypted or randomly substituted with "runes" (mysterious letters) |
dark | Word text is like dark matter: invisible, but it affects nearby objects (with collisions etc.) |
gone | Word text has no effect at all on display |
MaskUnit = The size of the unit to be masked: word, unit, or line of text. MaskUnit is especially relevant for game play; also for quizzes/tutorials.
There are 3 different states for MaskUnit.
State | Description |
---|---|
word | individual words are masked, preserving white space between them |
unit | each unit as a whole is masked, obscuring the spaces between words (and thus hiding the number of words) |
line | each line as a whole is masked, obscuring the spaces between words (and thus hiding the number of words) |
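The difference between masking by word and masking by unit or line can be shown with a simple text placeholder. This is only a sketch of the idea in plain text: the function `mask_text` and the "#" placeholder are hypothetical, and on screen the mask is drawn graphically rather than as characters.

```python
from typing import List


def mask_text(words: List[str], mask_unit: str = "word", placeholder: str = "#") -> str:
    """Mask a sequence of words at the given MaskUnit level."""
    if mask_unit == "word":
        # Each word is covered separately; the spaces between words remain.
        return " ".join(placeholder * len(word) for word in words)
    if mask_unit in ("unit", "line"):
        # The whole stretch is covered, so the number of words is hidden.
        return placeholder * len(" ".join(words))
    raise ValueError(f"unknown MaskUnit: {mask_unit}")


print(mask_text(["so", "anyway"], "word"))
print(mask_text(["so", "anyway"], "unit"))
```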
Syntax is a big part of what the Rezonator seeks to represent, since it is built on the theory of Dialogic Syntax (Du Bois 2014). To understand dialogic resonance in everyday interaction, Rezonator seeks to represent how speakers use syntax to create the structure of engagement. When two people coordinate their goals, ideas, and stances, they often coordinate their words and structures as well. By producing two parallel syntactic structures, they implicitly invoke a relation between their utterances (dialogic resonance).
To adequately model dialogic syntax, Rezonator must analyze linear syntax as well. (Linear syntax is what most linguists just call syntax. From the dialogic perspective, it is called linear because it typically involves a single linearized sequence of words, organized according to sentence-level syntactic structures (e.g. dependencies). Linear syntax as such does not typically represent mappings between multiple parallel structures; this is the purview of dialogic syntax.)
Among theories of linear syntax, dependency grammar promises to provide the necessary flexibility. Dependency representations offer the flexibility needed to go beyond linear syntax, to represent the dynamic flux of syntactic relations both within and across sentences.
To represent sentence-level syntactic structure (a.k.a. linear syntax) in the Santa Barbara Corpus, we anticipate using one or more of the following for tagging and dependency parsing:
The task of modeling language as it is used in naturally occurring dialogic interaction means paying attention to the dynamic structural relations that are constructed in real time by interlocutors. For this, graph theory can be used. Graphs and graph theory are important for creating meaningful representations of both linear syntax (already well developed elsewhere) and dialogic syntax (developed in Rezonator), as well as for higher-level discourse functions including stance alignment, topic structure, rhetorical structure, etc.
In Rezonator, many elements can be thought of as a simple graph. For example, the following are represented mostly by acyclic, directed, non-branching graphs:
- Unit: A sequence of words in a linguistic unit (intonation unit, clause, etc.)
- Line: A sequence of words as displayed in a horizontal row on the screen (usually one unit)
- Sequence: An ordered list of units or lines that make up a discourse
- Chain: A list of words connected vertically, for example, resonating words in parallel utterances
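The idea of an acyclic, directed, non-branching graph can be made concrete with a small Python sketch. The helper names (`sequence_to_edges`, `is_simple_path`) and the word IDs are hypothetical. IDs are used rather than word text, on the assumption that the same word form can occur more than once in a unit.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]  # (source id, target id)


def sequence_to_edges(items: List[str]) -> List[Edge]:
    """Link each item to its successor, giving a directed path."""
    return list(zip(items, items[1:]))


def is_simple_path(edges: List[Edge]) -> bool:
    """Check that the edges form one directed, acyclic, non-branching path."""
    if not edges:
        return True
    out_deg: Dict[str, int] = {}
    in_deg: Dict[str, int] = {}
    succ: Dict[str, str] = {}
    nodes = set()
    for src, dst in edges:
        out_deg[src] = out_deg.get(src, 0) + 1
        in_deg[dst] = in_deg.get(dst, 0) + 1
        succ[src] = dst
        nodes.update((src, dst))
    # Non-branching: at most one successor and one predecessor per node.
    if any(d > 1 for d in out_deg.values()) or any(d > 1 for d in in_deg.values()):
        return False
    # A path has exactly one start node (a pure cycle has none).
    starts = [n for n in nodes if n not in in_deg]
    if len(starts) != 1:
        return False
    # Walking from the start must reach every node (rules out stray pieces).
    node, seen = starts[0], 1
    while node in succ:
        node = succ[node]
        seen += 1
    return seen == len(nodes)


unit = ["w1", "w2", "w3"]  # hypothetical word IDs for one unit
print(is_simple_path(sequence_to_edges(unit)))
```

The same check would apply to a Chain, with words linked vertically rather than in unit order.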
For some purposes, it is useful to combine several simple graphs to create a larger, complex graph:
- Discourse: A Sequence of lines that make up an entire conversation, where each Line contains a Place chain, composed of Words
- Stack: A set of lines (e.g. on a single topic), where each Line contains a Place chain, composed of Words
- Clique: A set of Chains whose fate is intertwined, because they share at least one line
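As one way to picture the Clique definition, the sketch below groups chains that share at least one line. It assumes that sharing is transitive (if chain A shares a line with B, and B with C, all three belong to one clique), which the definition above does not state explicitly. The function name `find_cliques` and the input format are hypothetical.

```python
from typing import Dict, List, Set


def find_cliques(chain_lines: Dict[str, Set[int]]) -> List[Set[str]]:
    """Group chains that share a line, directly or through other chains."""
    parent = {chain: chain for chain in chain_lines}

    def find(chain: str) -> str:
        # Follow parent pointers to the group's representative.
        while parent[chain] != chain:
            parent[chain] = parent[parent[chain]]  # path halving
            chain = parent[chain]
        return chain

    first_chain_on_line: Dict[int, str] = {}
    for chain, lines in chain_lines.items():
        for line in lines:
            if line in first_chain_on_line:
                # Another chain already touches this line: merge the groups.
                parent[find(chain)] = find(first_chain_on_line[line])
            else:
                first_chain_on_line[line] = chain

    groups: Dict[str, Set[str]] = {}
    for chain in chain_lines:
        groups.setdefault(find(chain), set()).add(chain)
    return list(groups.values())


# Assumed example: chain id -> line numbers on which the chain has a word.
chains = {"A": {1, 2}, "B": {2, 3}, "C": {3, 4}, "D": {7}}
print(find_cliques(chains))
```

Here chains A, B, and C fall into one clique, while D stands alone.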
Even the grid (or chess-board) in the main screen can be analyzed as a graph (lattice). Rezonator maps simple and complex graphs (lines, units, chains, etc.) onto the lattice graph.
Graph-related processes are also relevant to understanding the complex dynamics of dialogic resonance. Relevant processes include:
- graph optimization
- graph representation learning
- machine learning on graphs
Rezonator seeks to make the graph structure of both syntax and resonance in discourse explicit, in order to support the application of new machine learning tools for representing and quantifying dialogic resonance.
Graph theory equivalents used in Rezonator
Rezonator categories of Word, Link, and Chunk are functionally equivalent to (and can be translated into) the graph theory categories of Node, Edge, and Plate, respectively.
Word = Node (Vertex)
Link = Edge
Chunk = Plate
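The translation can be sketched in a few lines of Python. This is only an illustration of the mapping, not Rezonator's internal data model: the `Graph` class, the `to_graph` function, and the input formats (word id to text, link pairs, chunk id to member word ids) are assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Graph:
    nodes: Dict[str, str] = field(default_factory=dict)          # node id -> label
    edges: List[Tuple[str, str]] = field(default_factory=list)   # (source, target)
    plates: Dict[str, List[str]] = field(default_factory=dict)   # plate id -> node ids


def to_graph(
    words: Dict[str, str],
    links: List[Tuple[str, str]],
    chunks: Dict[str, List[str]],
) -> Graph:
    """Translate Words, Links, and Chunks into Nodes, Edges, and Plates."""
    graph = Graph()
    for word_id, text in words.items():
        graph.nodes[word_id] = text                # Word -> Node
    for source, target in links:
        if source not in graph.nodes or target not in graph.nodes:
            raise KeyError(f"link ({source}, {target}) refers to an unknown word")
        graph.edges.append((source, target))       # Link -> Edge
    for chunk_id, members in chunks.items():
        graph.plates[chunk_id] = list(members)     # Chunk -> Plate
    return graph


# Assumed example data with hypothetical ids.
g = to_graph(
    words={"w1": "the", "w2": "dog", "w3": "barked"},
    links=[("w1", "w2"), ("w2", "w3")],
    chunks={"c1": ["w1", "w2"]},
)
print(g)
```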
As one alternative notation, consider plate notation. While the graph theory concepts of Node and Edge are familiar, Plate is less so. For a useful introduction, see the Wikipedia article on plate notation, and this blog on implementing plate notation in Python.
Parameters
For coding a visualization of each relevant Rezonator concept, it is useful to consider the parameters/variables used in the following:
Word = Node
Link = Edge
Box = Plate
For the larger context for these ideas (about notation for words, links, plates), see daft-pgm, especially their API and the PGM object.
For an alternative counter-voice, see Stop using Plate Notation. (But except for the cautions about readability, these objections do not necessarily apply to Rezonator's use of Plate concepts as a visualization tool.)
To represent the full complexity of syntax in spontaneous discourse, something more is needed than the standard representation of one isolated sentence at a time. To fully represent the syntax in dialogic syntax, we anticipate using the Text Annotation Graph notation developed by Forbes et al. (2018a). The Text Annotation Graph notation is especially interesting for its capacity to represent relationships between relationships. For more information, see:
Text Annotation Graphs (paper)
Text Annotation Graphs (on GitHub)
Sample data in JSON (for Text Annotation Graphs)
Demo code (on GitHub)
Creative Coding Lab (Forbes)
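To make "relationships between relationships" concrete, here is a minimal Python sketch in which a link may connect words or other links. It is not the Text Annotation Graph data format; the `Link` class, the `link_order` function, and the labels and ids in the example are hypothetical, and the links are assumed to contain no cycles.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Link:
    id: str
    label: str
    source: str  # id of a word or of another link
    target: str  # id of a word or of another link


def link_order(link_id: str, links: Dict[str, Link]) -> int:
    """Return 1 for a word-to-word link, 2 for a link between such links, etc."""
    link = links[link_id]
    # Only endpoints that are themselves links add to the order.
    inner = [
        link_order(end, links)
        for end in (link.source, link.target)
        if end in links
    ]
    return 1 + max(inner, default=0)


# Assumed example: two dependency links, one per speaker, and a resonance
# link that relates the two dependency links rather than individual words.
links = {
    "d1": Link("d1", "nsubj", "w2", "w1"),
    "d2": Link("d2", "nsubj", "w4", "w3"),
    "r1": Link("r1", "resonance", "d1", "d2"),
}
print(link_order("d1", links), link_order("r1", links))
```

A representation along these lines would let a resonance relation hold between two parallel syntactic relations, rather than only between pairs of words.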
References
Forbes, Angus G., Kristine Lee, Gus Hahn-Powell, Marco A. Valenzuela-Escárcega, and Mihai Surdeanu. (2018a) "Text Annotation Graphs: Annotating Complex Natural Language Phenomena." Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC'18). May 7-12, 2018, Miyazaki, Japan. Ed. by Sara Goggi and Hélène Mazo. European Language Resources Association (ELRA). arXiv preprint arXiv:1711.00529 (2017).
(The TAG software is available at https://github.com/CreativeCodingLab/TextAnnotationGraphs)
Forbes, Angus G., Andrew Burks, Kristine Lee, Xing Li, Pierre Boutillier, Jean Krivine, and Walter Fontana. (2018b) "Dynamic Influence Networks for Rule-based Models." IEEE Transactions on Visualization & Computer Graphics 24(1): 184-194. January, 2018. doi: 10.1109/TVCG.2017.2745280 https://arxiv.org/pdf/1711.00967.pdf
Taking it to the next level, we seek to model the dynamics of dialogic resonance in conversational interaction. One way to do this is through analysis of the causal influence of rules on rules, as when the application of a resonance rule in dialogic syntax influences the application of a rule of linear syntax (e.g. a construction in construction grammar), and vice versa. Another example: dynamic interaction of lexical priming and syntactic priming. For such interactions, a promising model is the visualization of Dynamic Influence Networks (Forbes et al. 2018b), which has been successfully applied to complex biological systems.
We are also interested in the potential for integrating skip-thought vectors in Rezonator.