Synthetic Dataset Generator

This project is an early draft and still incomplete :) Work on it is ongoing.

How to install it

To install data-generator, open the Playground (Ctrl+OW) in your Pharo image and execute the following Metacello script (select it and press the Do-it button or Ctrl+D):

Metacello new
  baseline: 'AIDataGenerator';
  repository: 'github://pharo-ai/data-generator';
  load.

How to depend on it

If you want to add a dependency on data-generator to your project, include the following lines in your baseline method:

spec
  baseline: 'AIDataGenerator'
  with: [ spec repository: 'github://pharo-ai/data-generator' ].
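
For context, here is a minimal sketch of where those lines could sit inside a complete baseline method. The class name BaselineOfMyProject and the package name MyProject are hypothetical placeholders for your own project; only the AIDataGenerator baseline and its repository URL come from this README.

```smalltalk
BaselineOfMyProject >> baseline: spec
	<baseline>
	spec for: #common do: [
		"Declare the external dependency on data-generator."
		spec
			baseline: 'AIDataGenerator'
			with: [ spec repository: 'github://pharo-ai/data-generator' ].
		"MyProject is a hypothetical package that requires the dependency."
		spec package: 'MyProject' with: [ spec requires: #('AIDataGenerator') ] ]
```

With a declaration like this, loading your own baseline through Metacello should also fetch data-generator.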

How to use it

generator := AIDataGenerator new.
data := generator generateRows: 100 columns: 10.

Idea for the future

This is rough pseudocode, meant only to demonstrate the idea.

generator := AIDataGenerator new.

generator
    addIntegerColumnNamed: 'weight'
    inRangeBetween: 40
    and: 100
    distributedUsing: PMNormalDistribution new.

generator
    addFloatColumnNamed: 'salary'
    inRangeBetween: 1000
    and: 5000
    distributedUsing: (PMExponentialDistribution scale: 2000).

generator
    addCategoricalColumnNamed: 'gender'
    withValues: #(male female)
    distributedUsing: PMUniformDistribution new.

dataFrame := generator generateDatasetWithRows: 10000.
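
Until something like the API above exists, the general idea of ranged and categorical columns can be approximated with core Pharo classes alone. The sketch below does not use AIDataGenerator or PolyMath and only draws values uniformly, so it is an illustration of the concept under those assumptions, not the planned implementation. The variable names are made up for the example.

```smalltalk
"A sketch using only core Pharo classes; AIDataGenerator is not involved."
| rng weights genders |
rng := Random seed: 42.

"100 integers drawn uniformly from the range 40 to 100 (inclusive)."
weights := (1 to: 100) collect: [ :i | (40 to: 100) atRandom: rng ].

"100 symbols drawn uniformly from the two categories."
genders := (1 to: 100) collect: [ :i | #(male female) atRandom: rng ].
```

Seeding the random generator makes the result repeatable from run to run, which tends to be useful for synthetic test data.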