Gaurav Kamath, Sebastian Schuster, Sowmya Vajjala and Siva Reddy. In Transactions of the Association for Computational Linguistics. [ArXiv] [TACL]
This repository contains the code and datasets for the paper "Scope Ambiguities in Large Language Models". If you have any questions about the paper, code, or data, please contact the first author at gaurav[dot]kamath[at]mail.mcgill.ca or open a GitHub issue.
Note that the dataset used in Experiment 1A of the paper is based on the corpus presented in (AnderBois et al. 2012) and is subject to the copyright restrictions of that original corpus. Accessing it therefore requires permission from the copyright holders of the 2012 corpus: the Law School Admission Council and Adrian Brasoveanu. Once you have received permission, contact the first author to obtain all related data.
All other data from the paper is included in this repository and may be used freely under the terms of the repository's MIT license.
datasets: Final datasets used for obtaining model results.
human_results: Datasets of results from the crowdsourced human experiments run as part of the study.
model_results: Datasets of results from the models tested in the study.
scripts: Scripts and utils used to run experiments and generate data.
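To get a quick overview of what each directory holds after cloning, a small inventory script along the following lines may help. This is only a sketch and is not part of the repository: the function name `inventory` and the `REPO_DIRS` constant are hypothetical, the repository root is assumed to be the current working directory, and nothing is assumed about file names or formats inside the directories.

```python
from pathlib import Path

# Top-level directories named in this README.
REPO_DIRS = ("datasets", "human_results", "model_results", "scripts")


def inventory(repo_root):
    """Map each top-level directory to the sorted relative paths of files under it.

    Hypothetical helper for illustration; directories that are missing
    map to an empty list.
    """
    root = Path(repo_root)
    listing = {}
    for name in REPO_DIRS:
        folder = root / name
        if folder.is_dir():
            # Walk the directory recursively and keep regular files only.
            listing[name] = sorted(
                p.relative_to(folder).as_posix()
                for p in folder.rglob("*")
                if p.is_file()
            )
        else:
            listing[name] = []
    return listing


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    for name, files in inventory(".").items():
        print(f"{name}: {len(files)} file(s)")
```

Run from the repository root, this prints one line per directory with its file count.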