
[WIP] Add easy model fitting and comparison #77

Open · wants to merge 90 commits into base: main
Conversation

prateekdesai04

Description of changes:
This PR adds plotting functionality that compares the TabRepo configs against the AG fitted models.
The changes compared to the earlier PR (#76) are:

convert_leaderboard_to_configs() is modified and cleaned up; the earlier version renamed columns that were not present in the DataFrame.
plot_overall_rank_comparison() is added; it plots various figures for all the models in the DataFrame (fitted models + TabRepo configs).

NOTE: the earlier #76 contained a deliberate fold mismatch to test the functionality of compare_metrics(). In this PR, temp_script.py uses the same folds for both fitted models and TabRepo configs, purely to keep the folds consistent while plotting.
The plot function currently breaks when the ELO figures are plotted and the code raises an error (still a WIP), but the rest of the plots can be found in initial_experiment/output/figures; the code runs up to that point.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@@ -532,3 +536,100 @@ def _convert_binary_to_multiclass(self, predictions: np.ndarray, dataset: str) -
return np.stack([1 - predictions, predictions], axis=predictions.ndim)
else:
return predictions
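The `np.stack([1 - predictions, predictions], ...)` line above turns a single column of positive-class probabilities into the two-column form that multiclass evaluation code expects. A minimal standalone sketch of that conversion (toy probabilities, not taken from the PR):

```python
import numpy as np

# A binary predictor often returns only P(class=1) per row; stacking
# [1 - p, p] along a new last axis recovers one column per class,
# mirroring the _convert_binary_to_multiclass snippet above.
predictions = np.array([0.2, 0.9, 0.5])  # P(class=1) for three rows
multiclass = np.stack([1 - predictions, predictions], axis=predictions.ndim)
print(multiclass)  # each row is [P(class=0), P(class=1)]
```

Because `axis=predictions.ndim` adds the class axis last, the same line also works for higher-dimensional prediction arrays.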

def _convert_time_infer_s_from_sample_to_batch(self, df: pd.DataFrame):
Collaborator:

Can we move those functions to a util? They are very specific to one analysis; if we put all our utils in AbstractRepository, it will be very hard to maintain.

Collaborator:

Yeah, a lot of this code is currently WIP. Basically, we are prioritizing making it work first; once it works, we will do the cleanup.

"""
Class to Fetch Train Test Splits of context dataset
"""
def get_context_train_test_split(self, repo: EvaluationRepository, task_id: Union[int, List[int]], repeat: int = 0,
Collaborator:

Nice to get those!
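The signature above suggests a helper that returns one fold of a repeated cross-validation split. The actual EvaluationRepository logic is not shown in the diff, so this is only a hypothetical sketch of the general pattern, using scikit-learn's `StratifiedKFold` (the function name and parameters here are illustrative, not the PR's API):

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

def get_train_test_split(y, fold: int = 0, n_splits: int = 10, repeat: int = 0):
    """Return (train_idx, test_idx) for one fold of a repeated stratified CV."""
    # Seed the splitter with the repeat index so the same (repeat, fold)
    # pair always yields the same deterministic split.
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=repeat)
    splits = list(skf.split(np.zeros(len(y)), y))
    return splits[fold]

y = np.array([0, 1] * 10)  # toy labels: 20 rows, 2 balanced classes
train_idx, test_idx = get_train_test_split(y, fold=0, n_splits=5, repeat=0)
```

Keying the random seed on the repeat index is one common way to make splits reproducible across runs while still varying across repeats.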

from sklearn.utils.multiclass import unique_labels


class TabForestPFN_sklearn(BaseEstimator, ClassifierMixin):
Author:

Have not touched this class, as it still seems to be a WIP.
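The class header above follows scikit-learn's estimator conventions: inherit from `BaseEstimator` and `ClassifierMixin`, and use `unique_labels` to set the required `classes_` attribute in `fit`. As a minimal illustration of that pattern (a trivial majority-class baseline, not the actual TabForestPFN implementation):

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.utils.multiclass import unique_labels
from sklearn.utils.validation import check_X_y, check_array, check_is_fitted

class MajorityClassifier(BaseEstimator, ClassifierMixin):
    """Minimal sklearn-compatible classifier: always predicts the majority class."""

    def fit(self, X, y):
        X, y = check_X_y(X, y)
        self.classes_ = unique_labels(y)  # attribute required of classifiers
        values, counts = np.unique(y, return_counts=True)
        self.majority_ = values[np.argmax(counts)]
        return self  # fit must return self per sklearn convention

    def predict(self, X):
        check_is_fitted(self)
        X = check_array(X)
        return np.full(X.shape[0], self.majority_)

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 1, 1, 1])
clf = MajorityClassifier().fit(X, y)
```

Wrapping a model this way lets it plug into sklearn tooling such as `cross_val_score` and `Pipeline`.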

@Innixma Innixma changed the title [WIP] Adding plotting functionality [WIP] Add easy model fitting and comparison Oct 9, 2024
3 participants