This repository is a PyTorch implementation of a learning framework for building different models for neural abstractive text summarization and beyond. It is an extension of the NATS toolkit (Neural Abstractive Text Summarization). The goal of this framework is to make it convenient to try out new ideas in abstractive text summarization and other language generation tasks.
Live System Demo: http://dmkdt3.cs.vt.edu/leafNATS/
Requirements:
- glob
- argparse
- shutil
- spacy
- PyTorch 1.0
We tested different models in LeafNATS on the following datasets. Here, we provide the link to the CNN/Daily Mail dataset and data-processing code for the Newsroom and Bytecup2018 datasets. The preprocessed data will be made available upon request.
In the dataset, `<s>` and `</s>` are used to separate sentences, and `<sec>` is used to separate summaries from articles. We did not use the JSON format because it takes more space and is more difficult to transfer between servers.
LeafNATS is currently under development. A simple way to run the models that have already been implemented:
- Check: Go to the playground to see the models we have implemented.
- Import: In run.py, import the example you want to try (see the sketch after this list).
- Training: `python run.py`
- Validate: `python run.py --task validate`
- Test: `python run.py --task beam`
- Rouge: `python run.py --task rouge`
The toolkit is organized into the following components:

| Component  | Description |
|------------|-------------|
| Engine     | Training frameworks |
| Playground | Models, pipelines, loss functions, and data redirection |
| Modules    | Building blocks, beam search, word-copy for decoding |
| Data       | Data pre-processing and batcher |
Here is the pretrained model for our live system: https://drive.google.com/open?id=1A7ODPpermwIHeRrnqvalT5zpr4BCTBi9