Low Data Drug Discovery status?

Hi,
I am a medicinal chemist (mainly wet-lab experiments) and have become interested in low-data drug discovery. I read this thread, which says the method is outdated (?): https://github.com/deepchem/deepchem/issues/1021

Could anyone tell me what the best up-to-date method is for analyzing a small dataset of structure-activity relationships?

I have a dataset consisting of 300-2,000 lead compound analogs with potency data against various biological targets. I already tested other methods in DeepChem, including graph convolution, and it seems to work well, but I want to compare it with one-shot learning or any newer method for low-data analysis.

Thanks!

Great question! The one-shot learning method is something we very much want to fix and re-add to DeepChem but we’ve run into some technical challenges there. I hope that we will be able to resume support for it in upcoming DeepChem releases.

At present, I think the best approach for analyzing a small dataset (beyond the usual DeepChem graphconv) would be to try newer techniques like ChemBERTa or Grover (CC @seyonec). ChemBERTa will be integrated with DeepChem as it matures, and we hope to add Grover support too. Once we fix one-shot, that would be a good technique to try as well.

Thanks, Bharath! I will check out ChemBERTa and Grover. I also look forward to trying the one-shot learning once it is fixed.
Thanks again!
