Here is a first draft for `KFold`.

At initialization, `KFold` takes one or several `CapsDataset` (e.g. `KFold(dataset)` or `KFold(dataset_0, dataset_1)`).

The public methods are:

- `make_splits`: to perform the splitting operation. This method should do the job of `clinicadl kfold`;
- `read`: to get the splits from a split directory;
- `write`: after `make_splits` has been called, to write the splits in a split directory;
- `get_splits`: returns an iterator of the wanted splits (e.g. `kfold.get_splits(splits=[0, 3])`).

The splits are returned as a `Split` object that contains all the relevant information for training, including the training and validation datasets. If multiple datasets have been passed at the initialization of the `KFold` object, a tuple of `Split` objects is returned (splits are arranged according to the order of input datasets).

To build the associated training and validation dataloaders, the user must call the `build_train_loader` and `build_val_loader` methods of the `Split` objects. These methods accept common dataloader parameters (`batch_size`, `shuffle`, etc.). I chose to separate these two methods because the user may want different parameters for his/her training and validation loaders.

I also felt the need to refactor the `get_dataloader` and `generate_sampler` functions.

Beware that it is a draft and nothing is yet tested.
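To make the intended call pattern concrete, here is a rough, self-contained sketch of the interface described above. It is not the code in this PR: plain Python lists stand in for `CapsDataset`, the loaders are plain lists of batches rather than real dataloaders, `read` and `write` are left out, and the `n_splits` and `seed` parameters, the `index`, `train_dataset` and `val_dataset` attributes, the `_make_loader` helper and the interleaved fold assignment are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass
from typing import Iterator, List, Optional, Sequence, Tuple, Union


def _make_loader(data: Sequence, batch_size: int, shuffle: bool, seed: int) -> List[list]:
    """Hypothetical helper: return the data as a list of batches."""
    items = list(data)
    if shuffle:
        random.Random(seed).shuffle(items)  # seeded so the sketch is reproducible
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


@dataclass
class Split:
    """Stand-in for the `Split` object: one fold of one dataset."""
    index: int
    train_dataset: list
    val_dataset: list

    # Two separate builders, so that training and validation
    # loaders can be given different parameters.
    def build_train_loader(self, batch_size: int = 1, shuffle: bool = False, seed: int = 0):
        return _make_loader(self.train_dataset, batch_size, shuffle, seed)

    def build_val_loader(self, batch_size: int = 1, shuffle: bool = False, seed: int = 0):
        return _make_loader(self.val_dataset, batch_size, shuffle, seed)


class KFold:
    """Stand-in for `KFold`: accepts one or several datasets."""

    def __init__(self, *datasets: Sequence):
        if not datasets:
            raise ValueError("KFold needs at least one dataset")
        self.datasets = datasets
        self._splits: Optional[List[List[Split]]] = None

    def make_splits(self, n_splits: int) -> None:
        # Assumption: simple interleaved assignment (item i goes to fold i % n_splits).
        self._splits = []
        for data in self.datasets:
            items = list(data)
            folds = []
            for k in range(n_splits):
                val = items[k::n_splits]
                train = [x for i, x in enumerate(items) if i % n_splits != k]
                folds.append(Split(k, train, val))
            self._splits.append(folds)

    def get_splits(
        self, splits: Optional[Sequence[int]] = None
    ) -> Iterator[Union[Split, Tuple[Split, ...]]]:
        if self._splits is None:
            raise RuntimeError("call make_splits() first")
        indices = range(len(self._splits[0])) if splits is None else splits
        for k in indices:
            per_dataset = tuple(folds[k] for folds in self._splits)
            # One dataset: a single Split. Several: a tuple in input order.
            yield per_dataset[0] if len(per_dataset) == 1 else per_dataset


# Usage, mirroring the examples in the description
kfold = KFold(list(range(10)))
kfold.make_splits(n_splits=5)
for split in kfold.get_splits(splits=[0, 3]):
    train_loader = split.build_train_loader(batch_size=4, shuffle=True)
    val_loader = split.build_val_loader(batch_size=2)
```

With two input datasets, e.g. `KFold(dataset_0, dataset_1)`, each iteration of `get_splits` would instead yield a tuple of two `Split` objects, one per dataset, in the order the datasets were passed.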