Simple Splitter 🍌

Presentation

simple_splitter is a handy data-splitter that circumvents the some limitations of scikit-learn's train_test_split():

simple_splitter allows for a straightforward, custom train-dev-test splitting. As a matter of fact, you can have as many splits as you want.
simple_splitter handles very tiny data, included stratified-splitting of classes with a single element. The basic idea is the following :
- If there is more or as many examples as splits, make sure each splits has at least one example.
- If there are less examples than splits, prioritise the desired splits (you can set priorities).

Besides, on the contrary to train_test_split(), simple_splitter returns a single list representing the distribution of splits along your data. A matter of choice 😉

Example

If your data is in a pandas.DataFrame, with columns of length len(df), you can easily create your split column with simple_splitter:

from simple_splitter.split import split

# Define you splits :
splits = [('train', 0.65), ('dev1', 0.10), ('dev2', 0.10), ('test', 0.15)]

split_column = split(splits=splits, data_length=len(my_df))

If you need a stratified split, just passed your stratifying column(s) :

split_column = split(data_length=len(my_df), splits=splits, strats=[my_df['my_stratitying_class_column_1'], ['my_stratitying_class_column_2']])

Install

Installation requires no external dependancies.

python -m pip install git+'https://github.com/sven-nm/simple_splitter'

Let's chat !

Any improvements, suggestions and ideas are welcome !

Name		Name	Last commit message	Last commit date
Latest commit History 11 Commits
simple_splitter		simple_splitter
tests		tests
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Simple Splitter 🍌

Presentation

Example

Install

Let's chat !

About

Releases 2

Packages

Languages

License

sven-nm/simple_splitter

Folders and files

Latest commit

History

Repository files navigation

Simple Splitter 🍌

Presentation

Example

Install

Let's chat !

About

Resources

License

Stars

Watchers

Forks

Releases 2

Packages 0

Languages

Packages