Naive-Bayes-text-categorizer

This program trains and tests a Naive Bayes classifier for text categorization from supplied training and testing data. It is written in C++ and was developed for the ECE467 Natural Language Processing course at the Cooper Union. Unigrams (individual words) serve as the features of the classifier. Uppercase letters are converted to lowercase, and all punctuation except the hyphen is discarded. To simplify the classifier, unigram probabilities are estimated from document counts (how many documents contain a word) rather than from word counts (how often the word occurs). Laplace smoothing accounts for words in a test document that were not seen in the training set for a given category, which would otherwise receive a probability of zero. Log probabilities are used so that the product of many small probabilities becomes a sum, which simplifies computation and avoids floating-point underflow.
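
The steps described above can be illustrated with a minimal sketch. It is not taken from `nlp1.cpp`: the names `tokenize`, `NaiveBayes`, `train`, and `classify` are hypothetical, and the add-one smoothing form `(count + 1) / (docs + 2)` is an assumption about how Laplace smoothing might be applied to per-document presence counts.

```cpp
#include <cassert>
#include <cctype>
#include <cmath>
#include <limits>
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Lowercase the text, drop all punctuation except the hyphen,
// then split on whitespace.
std::vector<std::string> tokenize(const std::string& text) {
    std::string cleaned;
    for (unsigned char c : text) {
        if (std::ispunct(c) && c != '-') continue;
        cleaned += static_cast<char>(std::tolower(c));
    }
    std::istringstream in(cleaned);
    std::vector<std::string> tokens;
    for (std::string tok; in >> tok;) tokens.push_back(tok);
    return tokens;
}

// Hypothetical classifier built on document counts (assumed design).
struct NaiveBayes {
    std::map<std::string, int> docsInCategory;  // category -> training docs
    // category -> word -> number of training docs containing the word
    std::map<std::string, std::map<std::string, int>> docsWithWord;
    int totalDocs = 0;

    void train(const std::string& category, const std::string& text) {
        ++docsInCategory[category];
        ++totalDocs;
        std::map<std::string, int>& counts = docsWithWord[category];
        std::vector<std::string> toks = tokenize(text);
        // Each word is counted at most once per document.
        std::set<std::string> unique(toks.begin(), toks.end());
        for (const std::string& w : unique) ++counts[w];
    }

    std::string classify(const std::string& text) const {
        assert(totalDocs > 0);  // the model must be trained first
        std::vector<std::string> toks = tokenize(text);
        std::set<std::string> unique(toks.begin(), toks.end());
        std::string best;
        double bestScore = -std::numeric_limits<double>::infinity();
        for (const auto& entry : docsInCategory) {
            const std::string& category = entry.first;
            const int n = entry.second;
            const std::map<std::string, int>& counts = docsWithWord.at(category);
            // Log prior plus a sum of log likelihoods instead of a product.
            double score = std::log(static_cast<double>(n) / totalDocs);
            for (const std::string& w : unique) {
                auto it = counts.find(w);
                const int c = (it == counts.end()) ? 0 : it->second;
                // Add-one (Laplace) smoothing: unseen words never get zero.
                score += std::log((c + 1.0) / (n + 2.0));
            }
            if (score > bestScore) {
                bestScore = score;
                best = category;
            }
        }
        return best;
    }
};
```

In this sketch, a caller would invoke `train(category, text)` once per training document and then `classify(text)` on each test document. Words absent from a category still contribute a finite log probability because of the smoothing term.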

Abi1024/Naive-Bayes-text-categorizer
