This is our code for the paper Geometry-Aware Supertagging with Heterogeneous Dynamic Convolutions.
While the paper is not yet published, you can cite the arXiv preprint:
```bibtex
@misc{kogkalidis2022geometry,
  doi       = {10.48550/ARXIV.2203.12235},
  url       = {https://arxiv.org/abs/2203.12235},
  author    = {Kogkalidis, Konstantinos and Moortgat, Michael},
  keywords  = {Computation and Language (cs.CL), Machine Learning (cs.LG), FOS: Computer and information sciences},
  title     = {Geometry-Aware Supertagging with Heterogeneous Dynamic Convolutions},
  publisher = {arXiv},
  year      = {2022},
  copyright = {Creative Commons Attribution 4.0 International}
}
```
The model presents a new approach to constructive supertagging, based on an explicit graph representation that accounts for both the intra-tree interactions within a single supertag and the inter-tree interactions between (partially decoded) supertag sequences. To account for the disparity between the various modalities in the graph (i.e. sentential word order, subword contextualized vectors, tree-sequence order and intra-tree edges), we adopt a heterogeneous formulation. Decoding is performed in parallel over trees, with each temporal step corresponding to an increased tree depth. Statefulness is achieved by representing each partially decoded tree with a single state-tracking vector, which is updated twice at each step: once with feedback from its own tree's last decoded fringe, and once with feedback from surrounding trees. The result is a highly parallel yet partially autoregressive architecture with input-scaling memory complexity and near-constant decoding time that achieves new state-of-the-art scores on four datasets while retaining the ability to predict rare supertags reliably.
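To make the two-stage state update concrete, here is a minimal PyTorch sketch. It is illustrative only: the module, argument names and GRU-cell choice are our assumptions, not the repository's actual API; see `dyngraphst.neural` for the real implementation.

```python
import torch
import torch.nn as nn


class StateTracker(nn.Module):
    """Sketch of the per-tree state update described above (hypothetical API).

    Each partially decoded tree keeps one state vector. At every depth step the
    state is refreshed twice: first with a summary of its own tree's last
    decoded fringe, then with pooled messages from the surrounding trees.
    """

    def __init__(self, dim: int = 128):
        super().__init__()
        self.intra_update = nn.GRUCell(dim, dim)  # feedback from the tree's own fringe
        self.inter_update = nn.GRUCell(dim, dim)  # feedback from surrounding trees

    def forward(self, states, fringe_summary, neighbour_summary):
        # states:            (num_trees, dim) -- one vector per partially decoded tree
        # fringe_summary:    (num_trees, dim) -- pooled embedding of the last decoded fringe
        # neighbour_summary: (num_trees, dim) -- pooled messages from the other trees
        states = self.intra_update(fringe_summary, states)
        states = self.inter_update(neighbour_summary, states)
        return states


# Toy usage: 5 trees decoded in parallel, one depth step.
tracker = StateTracker(dim=128)
states = torch.zeros(5, 128)
states = tracker(states, torch.randn(5, 128), torch.randn(5, 128))
```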
Results below are averages over 6 repetitions, compared against recent results on the respective datasets (retrieved 23/03/2022). Columns report overall accuracy and accuracy binned by supertag frequency in the training data.
| Model | Accuracy | Frequent (100+) | Uncommon (10-99) | Rare (1-9) | OOV |
|---|---|---|---|---|---|
| **CCGbank** | | | | | |
| Attentive Convolutions | 96.25 | 96.64 | 71.04 | n/a | n/a |
| Ours | 96.29 | 96.61 | 72.06 | 34.45 | 4.55 |
| **CCGrebank** | | | | | |
| Recursive Tree Addressing | 94.70 | 95.11 | 68.86 | 36.76 | 4.94 |
| Ours | 95.07 | 95.45 | 71.06 | 34.45 | 4.55 |
| **French TLGbank** | | | | | |
| ELMo & LSTM | 93.20 | 95.10 | 75.19 | 25.85 | n/a |
| Ours | 95.92 | 96.40 | 81.48 | 55.37 | 7.25 |
| **Æthel** | | | | | |
| Symbol-Sequential Transformer | 83.67 | 84.55 | 64.70 | 50.58 | 24.55 |
| Ours | 93.67 | 94.83 | 73.45 | 53.83 | 15.79 |
`dyngraphst.neural` contains the model architecture, and `dyngraphst.data` contains the data preprocessing code; see the READMEs of the respective directories for more details.
While we await the next stable release of PyTorch Geometric, the code requires two distinct Python environments: Python 3.9 with PyTorch 1.10.2 and torch_geometric 2.0.3 for training/inference, and Python 3.10 for data processing and evaluation. The two will be coalesced at a later stage.
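In the meantime, a setup along the following lines should approximate the requirements. This is a sketch assuming conda; the environment names and exact install commands are our assumptions, not verified instructions.

```shell
# Environment 1: training / inference (Python 3.9)
conda create -n dyngraphst-train python=3.9
conda activate dyngraphst-train
pip install torch==1.10.2 torch-geometric==2.0.3

# Environment 2: data processing / evaluation (Python 3.10)
conda create -n dyngraphst-data python=3.10
conda activate dyngraphst-data
```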
Detailed instructions coming soon.
If you have any questions or comments, or would like a grammar- or language-specific pretrained model, feel free to get in touch.