Code for building and analysing two different Twitter corpora:
- The Reo Māori Twitter (RMT) Corpus V2, comprising 94,163 Māori-language tweets (updated 12 November 2022). The first version of the corpus (reported in our paper) contained 79,018 tweets and is still available for download.
- The Māori Loanword Twitter (MLT) Corpus, containing 4.5 million tweets that use Māori loanwords in New Zealand English (NZE).
Code relating to the Māori-English Twitter (MET) Corpus is hosted on another GitHub repo.
For further information, please visit our companion website.