I’m not sure whether this is the right forum for this question, so please delete it if it’s inappropriate.
I’m currently working with some experimental data that I would like to split into train, validation, and test sets.
My code looks something like this:
from deepchem.splits.splitters import RandomSplitter
# split() returns index arrays, not datasets
splitter = RandomSplitter()
train, val, test = splitter.split(dataset=balanced_dataset)
This returns three arrays of indices corresponding to the default 80/10/10 split.
How do I use these index arrays to split my DiskDataset into the corresponding train, validation, and test datasets?
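To make concrete what the index arrays represent, here is a rough plain-Python sketch with no DeepChem dependency. The helper names `random_split_indices` and `take` are hypothetical and only stand in for the splitter and for whatever selects rows from the dataset; the 20 string samples are placeholder data. In DeepChem itself, `Dataset.select(indices)` appears to play the role of `take`, and `train_valid_test_split` on the splitter seems to return the three datasets directly, but both should be checked against the installed version.

```python
import random


def random_split_indices(n, frac_train=0.8, frac_valid=0.1, seed=0):
    """Shuffle range(n) and cut it into train/valid/test index lists.

    Hypothetical stand-in for a random splitter; not a DeepChem function.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    train_end = int(frac_train * n)
    valid_end = int((frac_train + frac_valid) * n)
    return indices[:train_end], indices[train_end:valid_end], indices[valid_end:]


def take(rows, indices):
    """Return the rows at the given positions (hypothetical helper)."""
    return [rows[i] for i in indices]


# Placeholder data standing in for the real dataset.
rows = [f"sample_{i}" for i in range(20)]

train_idx, valid_idx, test_idx = random_split_indices(len(rows))

# Each index list picks out one disjoint subset of the original rows.
train_rows = take(rows, train_idx)
valid_rows = take(rows, valid_idx)
test_rows = take(rows, test_idx)

print(len(train_rows), len(valid_rows), len(test_rows))
```

The point of the sketch is only that the three arrays are positions into the original dataset, so any selection-by-index operation turns them into three subsets that together cover every sample exactly once.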