Paper Title: Predicting Volume of Distribution in Humans: Performance of In Silico Methods for a Large Set of Structurally Diverse Clinical Compounds
Summary of DeepChem Usage: DeepChem’s graph convolutional implementation was used as a baseline.
Important Contributions: This work develops ML methods for predicting volume of distribution at steady state, a key pharmacokinetic parameter
Date Published: November 3rd, 2020
DeepChem Papers and Discoveries List
Paper Title: Image based liver toxicity prediction
Summary of DeepChem Usage: Tox21 data is sourced from DeepChem for this work.
Important Contributions: The authors develop a toxicity prediction method that uses 3D conformations of input molecules.
Date Published: January 24, 2020
Paper Title: Practical Model Selection for Prospective Virtual Screening
Summary of DeepChem Usage: Used the DeepChem IRV implementation.
Important Contributions: Demonstrated accurate prospective predictions of inhibitors of a bacterial protein-protein interaction.
Date Published: November 30, 2018
Paper Title: Solving the RNA design problem with reinforcement learning
Summary of DeepChem Usage: Uses the DeepChem reinforcement learning classes.
Important Contributions: Demonstrates that RL can be effective at RNA design.
Date Published: June 21, 2018
Paper Title: Predicting Toxicity from Gene Expression with Neural Networks
Summary of DeepChem Usage: Model is built with DeepChem.
Important Contributions: Demonstrates accurate prediction of toxicity based on gene expression profiles of either cultured cells or live animals.
Date Published: January 31, 2019
Paper Title: Predicting Gene Expression Between Species with Neural Networks
Summary of DeepChem Usage: Model is built with DeepChem.
Important Contributions: Given a gene expression profile from rat cells treated with a drug, predicts the expression profile for human cells treated with the same drug.
Date Published: July 5, 2019
Paper Title: Transformer Based Molecule Encoding for Property Prediction
Summary of DeepChem Usage: DeepChem models are used as baselines for the authors’ new model on datasets from MoleculeNet.
Important Contributions: “We build a Transformer-based molecule encoder and property predictor network with novel input featurization that performs significantly better than existing methods. We adapt our model to semi-supervised learning to further perform well on the limited experimental data usually available in practice.” Notably, this paper is from SRI.
Date Published: November 5th, 2020 (arXiv)
Paper Title: Prediction of energies for reaction intermediates and transition states on catalyst surfaces using graph-based machine learning models
Summary of DeepChem Usage: DeepChem weave/graph-conv models were used to implement graph ML models for this work.
Important Contributions: “Computational studies of heterogeneous catalysis processes depend on massive electronic structure calculations to obtain the energies of intermediates and transition states. To speed up this process, several machine-learning-based methods were proposed for the prediction of surface species energies. Here we developed a new method to represent all surface species with molecular graph, a data structure which is easy to read and extendable, but seldom utilized in catalysis studies.”
Date Published: December 2020
Paper Title: Transformer-CNN: Swiss knife for QSAR modeling and interpretation
Summary of DeepChem Usage: DeepChem’s TextCNN was used as the basis for the 1D convolution filters in the authors’ architecture.
Important Contributions: The authors propose a new Transformer-CNN architecture for QSAR use cases. Code at https://github.com/bigchem/transformer-cnn
Date Published: March 18, 2020
Paper Title: 3D matters! 3D-RISM and 3D convolutional neural network for accurate bioaccumulation prediction
Summary of DeepChem Usage: DeepChem graph convolutions are used as a baseline
Important Contributions: “In this work, we present a new method for predicting complex physical-chemical properties of organic molecules. The approach utilizes 3D convolutional neural network (ActivNet4) that uses solvent spatial distributions around solutes as input. These spatial distributions are obtained by a molecular theory called three-dimensional reference interaction site model. We have shown that the method allows one to achieve a good accuracy of prediction of bioconcentration factor which is difficult to predict by direct application of methods of molecular theory or simulations.”
Date Published: July 19th, 2018
Paper Title: DeepMalaria: artificial intelligence driven discovery of potent antiplasmodials
Summary of DeepChem Usage: DeepChem’s GCN was used to perform the research described here
Important Contributions: “In this work, we introduce DeepMalaria, a deep-learning based process capable of predicting the anti-Plasmodium falciparum inhibitory properties of compounds using their SMILES. A graph-based model is trained on 13,446 publicly available antiplasmodial hit compounds from GlaxoSmithKline (GSK) dataset that are currently being used to find novel drug candidates for malaria… To validate the DeepMalaria generated hits, we used a commonly used SYBR Green I fluorescence assay based phenotypic screening. DeepMalaria was able to detect all the compounds with nanomolar activity and 87.5% of the compounds with greater than 50% inhibition.”
Date Published: January 15th, 2020
Paper Title: SMILES Transformer: Pre-trained Molecular Fingerprint for Low Data Drug Discovery
Summary of DeepChem Usage: DeepChem’s graph convolutional implementation was used as a baseline.
Important Contributions: “In this paper, we propose SMILES-Transformer, a data-driven molecular fingerprint produced by a Transformer-based seq2seq pre-trained with a huge set of unlabeled SMILES. ST fingerprints were shown to work well with any predictive model in MoleculeNet downstream tasks and is effective especially when there is not enough labeled data. When large labeled data are available, ST fingerprints work comparable to other state-of-the-art baselines such as GraphConv. We also propose DEM, a novel metric for data efficiency. In terms of DEM, the ST fingerprint is better than existing methods in 5 out of 10 downstream tasks.”
Date Published: November 12th, 2019
Paper Title: Androgen Receptor Binding Category Prediction with Deep Neural Networks and Structure-, Ligand-, and Statistically-Based Features
Summary of DeepChem Usage: DeepChem’s DNN, RF, and CNN models were built with user-specified featurizations, with other featurizations used for comparison.
Important Contributions: “In this paper, binding categories of androgen receptor chemical compounds were calculated using a variety of structure-based features on the chimp, human, and rat proteins, in addition to ECFP fingerprints, Bayesians, and other features. These performed better than the cddd featurizations, CNN, and the same features in a MLogR, as well as overtraining with RF.”
Date Published: February 26, 2021
Paper Title: Androgen Regulates SARS-CoV-2 Receptor Levels and Is Associated with Severe COVID-19 Symptoms in Men
Summary of DeepChem Usage: DeepChem is used to perform virtual high throughput screening, using ECFP fingerprints and GraphConvModel.
Important Contributions: “SARS-CoV-2 infection has led to a global health crisis, and yet our understanding of the disease and potential treatment options remains limited. The infection occurs through binding of the virus with angiotensin converting enzyme 2 (ACE2) on the cell membrane. Here, we established a screening strategy to identify drugs that reduce ACE2 levels in human embryonic stem cell (hESC)-derived cardiac cells and lung organoids.”
Date Published: November 17, 2020
Journal: Cell Stem Cell
Paper Title: CheMixNet: Mixed DNN Architectures for Predicting Chemical Properties using Multiple Molecular Representations
Summary of DeepChem Usage: DeepChem is used to implement the graph convolutional baselines.
Important Contributions: “In this work, we present CheMixNet- a set of neural networks for predicting chemical properties from a mixture of features learned from the two molecular representations - SMILES as sequences and molecular fingerprints as vector inputs.”
Date Published: November 30th, 2018
Journal: arXiv Preprint
Paper Title: An Integrated Transfer Learning and Multitask Learning Approach for Pharmacokinetic Parameter Prediction
Summary of DeepChem Usage: The PCBA, MUV, and Tox21 datasets from MoleculeNet were used to train models.
Important Contributions: This study aims to construct an integrated transfer learning and multitask learning approach for developing quantitative structure–activity relationship models to predict four human pharmacokinetic parameters.
Date Published: December 20, 2018
Journal: ACS Molecular Pharmaceutics
Paper Title: GPCR_LigandClassify.py; a rigorous machine learning classifier for GPCR targeting compounds
Summary of DeepChem Usage: Code is available at https://github.com/mmagithub/GPCR_LigandClassify. It uses DeepChem 1.x for RdkitDescriptor listings and data loading.
Date Published: May 4th, 2021
Journal: Nature Scientific Reports
Paper Title: Bioactivity descriptors for uncharacterized chemical compounds
Summary of DeepChem Usage: MoleculeNet benchmarks are used to test the proposed bioactivity descriptors
Important Contributions: This work demonstrates that bioactivity descriptors which catalogue the activity of a compound across a broad range of assays can have powerful downstream predictive power by benchmarking across MoleculeNet datasets.
Date Published: June 2021
Journal: Nature Communications
Paper Title: A general optimization protocol for molecular property prediction using a deep learning network
Summary of DeepChem Usage: MoleculeNet and GraphConvModel are used for benchmarking.
Important Contributions: The use of Bayesian optimization for improving model performance is explored.
Date Published: September 2021
Journal: Briefings in Bioinformatics
Paper Title: Machine Learning the Redox Potentials of Phenazine Derivatives: A Comparative Study on Molecular Features
Summary of DeepChem Usage: DeepChem is used to generate fingerprints used in this study.
Important Contributions: The use of DeepChem-generated fingerprints is explored for optimizing redox flow batteries.
Date Published: CSIR-National Chemical Laboratory Pune
Journal: ChemRxiv