Accelerated WEKA unifies WEKA, a well-known open-source Java software package, with new technologies that leverage the GPU to shorten the execution time of ML algorithms. It offers two benefits aimed at users without expertise in system configuration and coding: easy installation and a GUI that guides the configuration and execution of ML tasks. Accelerated WEKA is a collection of packages available for WEKA (e.g., WDL4J, wekaPython, and wekaRAPIDS). It can be easily installed, and anyone can extend it to support new tools and algorithms.
Accelerated WEKA was designed to be easy to install. It relies on a conda environment, which makes it straightforward to use from the start. Once you have conda installed, Accelerated WEKA can be installed by issuing the following two commands:
$ conda create --solver=libmamba -n accelweka -c rapidsai -c conda-forge -c nvidia -c waikato weka
$ conda activate accelweka
Conda takes care of the dependencies: the required libraries are installed and configured automatically, so no manual setup is needed.
After finishing the installation and activation steps, you can start using Accelerated WEKA immediately by launching the WEKA GUI:
$ weka
The WEKA package is located in:
/path/to/conda/env/pkgs/weka
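If the `weka` command is not found after installation, the usual cause is that the environment has not been activated in the current shell. The sketch below shows one way to check this. The function name `check_weka_env` is hypothetical and not part of Accelerated WEKA; it assumes only that `conda activate` sets the `CONDA_DEFAULT_ENV` variable and puts the `weka` launcher on the `PATH`.

```shell
# Hypothetical helper (not shipped with Accelerated WEKA).
# Prints "ok" and returns 0 when the named conda environment is active
# and the weka launcher is reachable; otherwise prints the reason and returns 1.
check_weka_env() {
    env_name="${1:-accelweka}"
    # conda exports CONDA_DEFAULT_ENV with the name of the active environment
    if [ "${CONDA_DEFAULT_ENV:-}" != "$env_name" ]; then
        echo "conda environment '$env_name' is not active"
        return 1
    fi
    # command -v succeeds only if the launcher is on the PATH
    if ! command -v weka >/dev/null 2>&1; then
        echo "weka launcher not found on PATH"
        return 1
    fi
    echo "ok"
}
```

A possible use is `check_weka_env accelweka && weka`, which starts the GUI only when both checks pass.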
As with most of WEKA, Accelerated WEKA's functionality is accessible in two ways:
- Using the WEKA workbench GUI
- Via the command-line interface
Both ways are explained in the getting-started documentation.
A simple example that generates a dataset and trains a Support Vector Machine on it looks like the following:
$ weka -main weka.Run .RandomRBF -n 10000 -a 5000 > RBFa5kn10k.arff
$ weka -memory 12g -main weka.Run weka.classifiers.rapids.CuMLClassifier -split-percentage 80 -learner SVC -t $(pwd)/RBFa5kn10k.arff -py-command python
which results in:
Options: -learner SVC -py-command python
=== Classifier model (full training set) ===
SVC()
Time taken to build model: 24.93 seconds
Time taken to test model on training data: 13.3 seconds
=== Error on training data ===
Correctly Classified Instances 10000 100 %
Incorrectly Classified Instances 0 0 %
Kappa statistic 1
Mean absolute error 0
Root mean squared error 0
Relative absolute error 0 %
Root relative squared error 0 %
Total Number of Instances 10000
=== Detailed Accuracy By Class ===
                 TP Rate  FP Rate  Precision  Recall   F-Measure  MCC      ROC Area  PRC Area  Class
                 1.000    0.000    1.000      1.000    1.000      1.000    1.000     1.000     c0
                 1.000    0.000    1.000      1.000    1.000      1.000    1.000     1.000     c1
Weighted Avg.    1.000    0.000    1.000      1.000    1.000      1.000    1.000     1.000
=== Confusion Matrix ===
    a    b   <-- classified as
 5185    0 |    a = c0
    0 4815 |    b = c1
Time taken to test model on test split: 1.13 seconds
=== Error on test split ===
Correctly Classified Instances 2000 100 %
Incorrectly Classified Instances 0 0 %
Kappa statistic 1
Mean absolute error 0
Root mean squared error 0
Relative absolute error 0 %
Root relative squared error 0 %
Total Number of Instances 2000
=== Detailed Accuracy By Class ===
                 TP Rate  FP Rate  Precision  Recall   F-Measure  MCC      ROC Area  PRC Area  Class
                 1.000    0.000    1.000      1.000    1.000      1.000    1.000     1.000     c0
                 1.000    0.000    1.000      1.000    1.000      1.000    1.000     1.000     c1
Weighted Avg.    1.000    0.000    1.000      1.000    1.000      1.000    1.000     1.000
=== Confusion Matrix ===
    a    b   <-- classified as
 1041    0 |    a = c0
    0  959 |    b = c1
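When the command-line interface is used in scripts, it is often enough to keep a single number from this report. The following is a minimal sketch, not part of Accelerated WEKA, that reads the report on standard input and prints the percentage of correctly classified instances from the test-split section. The function name `test_split_accuracy` is hypothetical, and the sketch assumes the section header and the summary line are worded exactly as in the output above.

```shell
# Hypothetical helper (not shipped with Accelerated WEKA).
# Reads a WEKA evaluation report on stdin and prints the percentage from the
# "Correctly Classified Instances" line of the test-split section.
test_split_accuracy() {
    awk '
        # Start looking only once the test-split section begins
        /^=== Error on test split ===/ { in_test = 1; next }
        # The summary line ends with "<count> <percent> %", so the
        # percentage is the second-to-last field
        in_test && /^Correctly Classified Instances/ { print $(NF - 1); exit }
    '
}

# Possible use, reusing the classifier command from the example above:
#   weka -memory 12g -main weka.Run weka.classifiers.rapids.CuMLClassifier \
#       -split-percentage 80 -learner SVC -t $(pwd)/RBFa5kn10k.arff \
#       -py-command python | test_split_accuracy
```

The function prints nothing if the report has no test-split section, so a calling script can treat empty output as a failed or incomplete run.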
The full documentation, including installation instructions and getting-started guides, is available at https://waikato.github.io/acceleratedWEKA/.
Original code by Justin Liu