Naive Bayes Classifier agent built in Python 3
A Naive Bayes classifier is a probabilistic classifier that uses Bayes' theorem to estimate the class of a data entity (a row of data). It relies on the (naive) assumption that effects are conditionally independent of one another given a cause, which lets it estimate the probability of a cause given multiple effects; hence the name Naive Bayes classifier.
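
As a rough illustration of the independence assumption, the sketch below multiplies a class prior by per-effect likelihoods and normalizes the result. The function name `naive_bayes_posteriors` and all of the numbers are made up for this example; they are not taken from this project's code or datasets.

```python
def naive_bayes_posteriors(priors, likelihoods):
    """Return P(class | effects) for each class.

    priors:      dict mapping class -> P(class)
    likelihoods: dict mapping class -> list of P(effect_i | class)

    Assumes the effects are conditionally independent given the class,
    so their likelihoods can simply be multiplied together.
    """
    scores = {}
    for label, prior in priors.items():
        score = prior
        for likelihood in likelihoods[label]:
            score *= likelihood  # independence: multiply per-effect terms
        scores[label] = score
    total = sum(scores.values())
    # Normalize so the posteriors sum to 1
    return {label: score / total for label, score in scores.items()}


# Hypothetical numbers, for illustration only
priors = {"sick": 0.2, "healthy": 0.8}
likelihoods = {"sick": [0.9, 0.5], "healthy": [0.1, 0.25]}

posteriors = naive_bayes_posteriors(priors, likelihoods)
predicted = max(posteriors, key=posteriors.get)
```

Here the unnormalized scores are 0.2 × 0.9 × 0.5 = 0.09 for `sick` and 0.8 × 0.1 × 0.25 = 0.02 for `healthy`, so `predicted` is `"sick"`.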
NaiveBayesClassifier.py
./NaiveBayesCalculator.py
This is the "main" script that runs the program. To run the program, enter:
$ python NaiveBayesClassifier.py "./absolute/path/to/data.csv" <"./absolute/path/to/estimation_data.csv">
The second argument, shown in angle brackets, is optional.
For example, running the command:
$ python NaiveBayesClassifier.py "./data/pima-indians-diabetes.csv"
will run the Naive Bayes agent on the pima-indians-diabetes.csv dataset. While this may not appear to do anything interesting, the last line of the main log (./logs/main.log) will display the accuracy of the classifier on the passed dataset. A more interesting example, however:
$ python NaiveBayesClassifier.py "./data/primate-factors.csv" "./data/primate-factors-no-class-var.csv"
will create a new .csv file, ./data/primate-factors-predictions.csv, and a new .txt file, ./data/primate-factors-predictions-about.txt, that describe the unknown dataset with estimated classes and the accuracy of the classifier on the known dataset, respectively.

The program assumes data is in the following format:
| Attribute<sub>R,C</sub> | Attribute<sub>R,C</sub> | ... | Attribute<sub>R,C</sub> | Class Variable<sub>R</sub> |
| :------------: | :------------: | :---: | :------------: | :---------------: |
| Attribute<sub>1,1</sub> | Attribute<sub>1,2</sub> | ... | Attribute<sub>1,N</sub> | Class Variable<sub>1</sub> |
| Attribute<sub>2,1</sub> | Attribute<sub>2,2</sub> | ... | Attribute<sub>2,N</sub> | Class Variable<sub>2</sub> |
| ... | ... | ... | ... | ... |
| Attribute<sub>N,1</sub> | Attribute<sub>N,2</sub> | ... | Attribute<sub>N,N</sub> | Class Variable<sub>N</sub> |
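
A file in this layout could be read roughly as follows. This is a minimal sketch rather than the project's own loader: the helper `load_rows` is hypothetical, and it assumes there is no header row, that every attribute is numeric, and that the class variable is the last column.

```python
import csv
import io


def load_rows(csv_file):
    """Parse rows into (attributes, class_label) pairs.

    Assumptions for this sketch: no header row, numeric attributes,
    and the class variable stored in the last column.
    """
    rows = []
    for record in csv.reader(csv_file):
        if not record:
            continue  # skip blank lines
        attributes = [float(value) for value in record[:-1]]
        class_label = record[-1]
        rows.append((attributes, class_label))
    return rows


# In-memory stand-in for a data file; the values are invented.
sample = io.StringIO("6,148,72,1\n1,85,66,0\n")
rows = load_rows(sample)
```

With a real file, the same function could be called as `load_rows(open(path, newline=""))`.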
- Version 1.0, last modified 10/27/2016 (current)
  - Base implementation
  - Probability is only estimated on a Normal (Gaussian) Distribution
  - Known data is currently split at 67% training, 33% testing
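
The two behaviours noted above (a 67%/33% split and Gaussian probability estimates) can be sketched as below. The function names, the shuffling step, the `seed` parameter, and the use of the sample standard deviation are assumptions made for illustration, not a description of how this project implements them.

```python
import math
import random


def split_dataset(rows, train_ratio=0.67, seed=None):
    """Shuffle a copy of rows and split it into (training, testing)."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


def mean_and_stdev(values):
    """Return the mean and sample standard deviation (n - 1 denominator)."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
    return mean, math.sqrt(variance)


def gaussian_pdf(x, mean, stdev):
    """Density of a Normal(mean, stdev) distribution evaluated at x."""
    exponent = math.exp(-((x - mean) ** 2) / (2 * stdev ** 2))
    return exponent / (math.sqrt(2 * math.pi) * stdev)


training, testing = split_dataset(range(100), seed=0)
mean, stdev = mean_and_stdev([1.0, 2.0, 3.0, 4.0, 5.0])
density = gaussian_pdf(0.0, 0.0, 1.0)
```

In a Gaussian Naive Bayes classifier, a density like `gaussian_pdf(x, mean, stdev)` stands in for the likelihood of an attribute value given a class, with `mean` and `stdev` estimated per class and per attribute from the training rows.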