RuCo-BERT

This repository contains code and data for training and inference of a new Russian-language coreference resolution model trained on the RuCoCo corpus (see https://github.com/vdobrovolskii/rucoco).

First, install the dependencies:
pip install -r requirements.txt

Although the recommended PyTorch version for AllenNLP 2.2.0 is 1.8.1, it is better to additionally run the following before training:
pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html

This ensures support for 48 GB and 80 GB GPUs, which are necessary for training with the command below, since coreference resolution is notorious for its extreme computational cost.
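After installation, it can help to confirm which versions actually ended up in the environment. The following is a minimal sketch using only the Python standard library; the `EXPECTED` mapping and both helper functions are illustrative names rather than part of this repository, and it assumes that requirements.txt installs AllenNLP 2.2.0 as the text above suggests.

```python
from importlib import metadata

# Versions taken from the install instructions; the allennlp entry is an
# assumption based on the version named in the text.
EXPECTED = {
    "torch": "1.9.0",
    "torchvision": "0.10.0",
    "torchaudio": "0.9.0",
    "allennlp": "2.2.0",
}


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_environment(expected=EXPECTED):
    """Map each package name to (installed_version, matches_expected)."""
    report = {}
    for package, wanted in expected.items():
        found = installed_version(package)
        # Drop a local build tag ("1.9.0+cu111" -> "1.9.0") before comparing.
        matches = found is not None and found.split("+")[0] == wanted
        report[package] = (found, matches)
    return report


if __name__ == "__main__":
    for package, (found, ok) in check_environment().items():
        status = "ok" if ok else "differs from expected"
        print(f"{package}: {found or 'not installed'} ({status})")
```

The check only compares version strings; it does not verify that the CUDA build matches the installed driver.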

To run inference with the model, download the weights from: https://dl.dropbox.com/s/2m0c4o220pr1rfn/RucocoAncor_rubertb_a150_s20_sw04.tar.gz?dl=0

Then run:
allennlp evaluate --include-package rucoref RucocoAncor_rubertb_a150_s20_sw04.tar.gz data\test.conll --output-file metrics_on_test.json --predictions-output-file predictions.json
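Once the evaluation finishes, the scores are in metrics_on_test.json. The sketch below prints them, assuming the file is a flat JSON object that maps metric names to numbers (the usual shape of AllenNLP's `--output-file`); the exact metric names depend on the model and are not assumed here. `load_metrics` and `format_metrics` are hypothetical helper names.

```python
import json
from pathlib import Path


def load_metrics(path):
    """Read the JSON file written by `allennlp evaluate --output-file`."""
    with Path(path).open(encoding="utf-8") as handle:
        return json.load(handle)


def format_metrics(metrics):
    """Return one 'name: value' line per numeric metric, sorted by name."""
    lines = []
    for name in sorted(metrics):
        value = metrics[name]
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            lines.append(f"{name}: {value:.4f}")
    return lines


if __name__ == "__main__":
    metrics_path = Path("metrics_on_test.json")
    if metrics_path.exists():
        print("\n".join(format_metrics(load_metrics(metrics_path))))
```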

To train the model from scratch, run:
allennlp train --include-package rucoref coref_bertbase_lstm.jsonnet -s output_models

Results

Results are computed with the default AllenNLP metric functions and tend to be slightly lower than the averaged (AVG) scores produced by the reference coreference scorers (https://github.com/conll/reference-coreference-scorers).

Set	PRECISION	RECALL	F1-SCORE
train	96.1	89.3	92.5
development	77.8	72.8	75.2
test	81.1	78.2	79.6
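As a quick sanity check on the table, the F1 column can be compared with the harmonic mean of the precision and recall columns. This is only a sketch: the two need not coincide exactly if, as in AllenNLP's default CoNLL coreference scorer, precision, recall, and F1 are each averaged separately over several metrics (MUC, B-cubed, CEAF-e). That this applies to the numbers here is an assumption, not something the README states.

```python
# Precision / recall / F1 (in %) copied from the results table.
RESULTS = {
    "train": (96.1, 89.3, 92.5),
    "development": (77.8, 72.8, 75.2),
    "test": (81.1, 78.2, 79.6),
}


def harmonic_f1(precision, recall):
    """F1 as the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


for split, (precision, recall, reported_f1) in RESULTS.items():
    recomputed = harmonic_f1(precision, recall)
    print(f"{split}: reported {reported_f1:.1f}, harmonic mean {recomputed:.1f}")
```

For the development and test rows the two values agree to one decimal place; for the train row they differ by 0.1 (92.5 reported versus 92.6 recomputed).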

Training process

Development F1 plateaus at around 75% and never exceeds it, even though training F1 keeps increasing steadily.
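The gap between training and development F1 can be tracked epoch by epoch from the files that `allennlp train` writes to the serialization directory. The sketch below assumes the usual `metrics_epoch_N.json` files and the keys `training_coref_f1` and `validation_coref_f1`; both key names are assumptions about this model's metric naming and may need adjusting. `f1_by_epoch` and `best_dev_epoch` are hypothetical helper names.

```python
import json
import re
from pathlib import Path


def f1_by_epoch(serialization_dir,
                train_key="training_coref_f1",
                dev_key="validation_coref_f1"):
    """Return [(epoch, train_f1, dev_f1), ...] sorted by epoch number.

    The file pattern and metric keys are assumptions; adjust them to the
    files actually present in the output directory.
    """
    rows = []
    for path in Path(serialization_dir).glob("metrics_epoch_*.json"):
        match = re.fullmatch(r"metrics_epoch_(\d+)\.json", path.name)
        if match is None:
            continue
        metrics = json.loads(path.read_text(encoding="utf-8"))
        if train_key in metrics and dev_key in metrics:
            rows.append((int(match.group(1)), metrics[train_key], metrics[dev_key]))
    return sorted(rows)


def best_dev_epoch(rows):
    """Return the (epoch, train_f1, dev_f1) row with the highest development F1."""
    if not rows:
        return None
    return max(rows, key=lambda row: row[2])


if __name__ == "__main__":
    rows = f1_by_epoch("output_models")
    for epoch, train_f1, dev_f1 in rows:
        gap = train_f1 - dev_f1
        print(f"epoch {epoch}: train {train_f1:.3f}, dev {dev_f1:.3f}, gap {gap:.3f}")
    print("best development epoch:", best_dev_epoch(rows))
```

A gap that keeps widening while development F1 stays flat is the pattern described above; the row returned by `best_dev_epoch` marks where further training stopped improving development F1.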
