This repository contains the code used for our analysis of Natural Language Processing trends, as presented in the work:
"Analyzing a Decade of Evolution: Trends in Natural Language Processing"
A `requirements.txt` file is provided with the necessary Python dependencies; install them with `pip install -r requirements.txt`.
The dataset containing the various corpora is provided on Zenodo; alternatively, users can run our pipeline to create the corpus:
- Download papers: to download the papers used in this work, we provide the `download_from_acl.sh` bash script.
- Parse PDFs: to convert the PDFs into human-readable text, we provide `pdf_to_json.py`, which converts each PDF into a parsed JSON representation (see the sketch after this list).
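The following is a minimal sketch of what the PDF-to-JSON step could look like, assuming the `pypdf` library and a `papers/` input directory; the actual `pdf_to_json.py` may use a different parser and output schema.

```python
# Minimal sketch of the PDF-to-JSON step, assuming the pypdf library;
# the actual pdf_to_json.py may use a different parser and schema.
import json
from pathlib import Path

from pypdf import PdfReader


def pdf_to_json(pdf_path: Path, out_dir: Path) -> None:
    """Extract the text of each page and dump it as a JSON file."""
    reader = PdfReader(pdf_path)
    record = {
        "filename": pdf_path.name,
        "pages": [page.extract_text() or "" for page in reader.pages],
    }
    out_path = out_dir / f"{pdf_path.stem}.json"
    out_path.write_text(json.dumps(record, ensure_ascii=False, indent=2))


if __name__ == "__main__":
    out = Path("output")
    out.mkdir(exist_ok=True)
    for pdf in Path("papers").glob("*.pdf"):  # hypothetical input directory
        pdf_to_json(pdf, out)
```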
To extract GPU information, we provide `gpu_extraction_1.py`, which uses the JSON representations described above to create some of the files present in the `output` directory.
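As an illustration of this kind of extraction, here is a regex-based sketch over the parsed JSON files; the pattern and file layout are assumptions for demonstration, not the exact logic of `gpu_extraction_1.py`.

```python
# Illustrative sketch of regex-based GPU extraction from the parsed JSON
# files; the pattern and the "pages" key are assumptions, not the exact
# logic of gpu_extraction_1.py.
import json
import re
from pathlib import Path

# Hypothetical pattern covering some common NVIDIA GPU mentions.
GPU_PATTERN = re.compile(
    r"\b(?:GTX|RTX|Tesla|Titan)?\s*(?:[KPVA]100|1080|2080|3090)\b",
    re.IGNORECASE,
)


def extract_gpus(json_dir: Path) -> dict[str, list[str]]:
    """Map each paper's JSON file to the GPU mentions found in its text."""
    mentions = {}
    for path in json_dir.glob("*.json"):
        record = json.loads(path.read_text())
        text = " ".join(record.get("pages", []))
        mentions[path.stem] = sorted(set(GPU_PATTERN.findall(text)))
    return mentions


if __name__ == "__main__":
    print(extract_gpus(Path("output")))
```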
We also provide `openai.py`, the script we used to query ChatGPPT; however, it is likely outdated due to changes in the OpenAI API.
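For reference, a minimal sketch against the current (v1.x) OpenAI Python client looks roughly as follows; the model name and prompt are placeholders, and the original `openai.py` predates this interface.

```python
# Minimal sketch of a ChatGPT query using the current (v1.x) OpenAI
# Python client; the model name and prompt are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def ask_chatgpt(prompt: str, model: str = "gpt-4o-mini") -> str:
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


if __name__ == "__main__":
    print(ask_chatgpt("Which GPU is mentioned in: 'We train on a single V100.'"))
```

Note that a local file named `openai.py` shadows the installed `openai` package, so a sketch like this would need to live under a different filename.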
Finally, we also provide code to query citation information for the papers, in both `paper_citations_from_scholar.py` and `paper_citations_from_semantic.py`.
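As one possible shape for the Semantic Scholar side, here is a sketch that fetches a citation count from the Semantic Scholar Graph API with `requests`; the actual scripts may use different endpoints, fields, or rate-limit handling, and the example paper ID is only illustrative.

```python
# Illustrative sketch of fetching a citation count from the Semantic
# Scholar Graph API; the actual scripts may differ in endpoints, fields,
# and rate-limit handling.
import requests

API_URL = "https://api.semanticscholar.org/graph/v1/paper/{paper_id}"


def citation_count(paper_id: str) -> int:
    """Return the citation count for a paper, given a supported ID
    (e.g. an ACL Anthology ID such as 'ACL:P19-1001', or a DOI)."""
    response = requests.get(
        API_URL.format(paper_id=paper_id),
        params={"fields": "citationCount"},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()["citationCount"]


if __name__ == "__main__":
    print(citation_count("ACL:P19-1001"))  # hypothetical example ID
```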