Deepchem on M1 MacBook

ignaczgerg · April 6, 2021, 11:00am

Dear All,

I am terribly sorry if this issue has already come up, but I could not find it on the forum. I am also sorry if this is not the right place to post this, but I am stuck.

I am trying to install deepchem via Conda on a MacBook with the M1 chip (Big Sur). I am installing deepchem into a Conda environment with Python 3.7.9. The installation completes without errors, but when I run the ‘import deepchem’ command, the kernel dies.
The same ‘import deepchem’ command runs (deepchem 2.3.0 with Python 3.7.9) on my Windows PC, where I can use the package correctly, and I can also use deepchem on my old iMac.

Does anyone know what the problem could be?

Thanks,
Gergo

bharath · April 6, 2021, 7:49pm

Ah, it’s an M1! OK, that explains why the kernel might die. DeepChem depends on a number of numeric packages, and my understanding is that not everything has been ported to work on the M1 (since the M1 isn’t an Intel chip). For now, I’d suggest just using Colab or a cloud environment until the M1 numerics situation is fixed by the broader community.

CC @peastman, who might know more.
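
A quick way to see whether the Python interpreter itself is running natively on the M1 is to ask the standard library which architecture it reports. The sketch below is only illustrative (the helper name `describe_interpreter` is made up for this example); on an M1 Mac, `machine` is expected to be `arm64` for a native interpreter and `x86_64` for an Intel build running under Rosetta 2.

```python
import platform
import sys


def describe_interpreter():
    """Return basic facts about the running Python interpreter."""
    return {
        "python_version": platform.python_version(),
        # Expected: 'arm64' for a native M1 build, 'x86_64' under Rosetta 2.
        "machine": platform.machine(),
        # 'Darwin' on macOS.
        "system": platform.system(),
        # Path of the interpreter, useful to confirm which Conda env is active.
        "executable": sys.executable,
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

If `machine` does not match the architecture the installed packages were built for, that mismatch is one possible reason for a crash on import, though not the only one.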

ignaczgerg · April 27, 2021, 10:59am

I found a solution that works only with the nightly version, not with the stable one.
The actual problem was that rdkit was not supported on the M1 Mac because of pycairo. Apparently, this dependency has been removed (https://github.com/conda-forge/rdkit-feedstock/pull/72), and now rdkit (at least newer builds) should be compatible with arm64. Previously, only custom builds could run on the M1 Mac (https://github.com/conda-forge/rdkit-feedstock/issues/63).

First, we need to install Miniforge3 (https://github.com/conda-forge/miniforge#miniforge3) for the arm64 architecture. After this, we install TensorFlow following this article (https://medium.com/codex/installing-tensorflow-on-m1-macs-958767a7a4b3). If everything goes right, rdkit can be installed into the same environment as TensorFlow:

conda config --add channels conda-forge
conda config --set channel_priority strict
conda install rdkit rdkit-dev

As the final step, we install deepchem nightly:

pip install --pre deepchem
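
Because a failing import originally killed the whole kernel, it may be safer to try each import in a separate process before opening a notebook. The following is a minimal sketch using only the standard library; the helper `can_import` and the default timeout are assumptions made for this example and are not part of deepchem or rdkit.

```python
import subprocess
import sys


def can_import(module_name, timeout=300):
    """Try `import module_name` in a child process and report the result.

    Returns (ok, detail). Running the import in a child process means a
    hard crash (e.g. a segfault) cannot take down the current session.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", f"import {module_name}"],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False, "timed out"

    if result.returncode == 0:
        return True, "ok"
    if result.returncode < 0:
        # On POSIX, a negative return code means the child died from a signal.
        return False, f"killed by signal {-result.returncode}"

    # Otherwise report the last line of the traceback, if there is one.
    lines = result.stderr.strip().splitlines()
    detail = lines[-1] if lines else f"exit code {result.returncode}"
    return False, detail


if __name__ == "__main__":
    for name in ("rdkit", "tensorflow", "deepchem"):
        ok, detail = can_import(name)
        print(f"{name}: {'OK' if ok else 'FAILED'} ({detail})")
```

A "killed by signal" result would point to a crash in native code, which matches the dying-kernel symptom, while a `ModuleNotFoundError` would simply mean the package is not installed in the active environment.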

I ran a quick test using Tutorial 4 and, apart from a few warnings, it ran okay:

In [2]: import deepchem as dc
dc.__version__

Out [2]: '2.6.0.dev'

In [3]: tasks, datasets, transformers = dc.molnet.load_tox21(featurizer='ECFP')
train_dataset, valid_dataset, test_dataset = datasets
print(train_dataset)
Out [3]: RDKit WARNING: [13:26:39] WARNING: not removing hydrogen atom without neighbors
RDKit WARNING: [13:26:48] WARNING: not removing hydrogen atom without neighbors

<DiskDataset X.shape: (6264, 1024), y.shape: (6264, 12), w.shape: (6264, 12), task_names: ['NR-AR' 'NR-AR-LBD' 'NR-AhR' ... 'SR-HSE' 'SR-MMP' 'SR-p53']>

In [4]: train_dataset.w

Out [4]: array([[1.04502242, 1.03632599, 1.12502653, ..., 1.05576503, 1.17464996,
        1.05288369],
       [1.04502242, 1.03632599, 1.12502653, ..., 1.05576503, 1.17464996,
        1.05288369],
       [1.04502242, 1.03632599, 1.12502653, ..., 1.05576503, 0.        ,
        1.05288369],
       ...,
       [1.04502242, 0.        , 1.12502653, ..., 1.05576503, 6.7257384 ,
        1.05288369],
       [1.04502242, 1.03632599, 1.12502653, ..., 1.05576503, 6.7257384 ,
        1.05288369],
       [1.04502242, 1.03632599, 1.12502653, ..., 0.        , 1.17464996,
        1.05288369]])

In [5]: model = dc.models.MultitaskClassifier(n_tasks=12, n_features=1024, layer_sizes=[1000])

In [6]: import numpy as np
model.fit(train_dataset, nb_epoch=10)
metric = dc.metrics.Metric(dc.metrics.roc_auc_score)
print('training set score:', model.evaluate(train_dataset, [metric], transformers))
print('test set score:', model.evaluate(test_dataset, [metric], transformers))

Out [6]: WARNING:tensorflow:AutoGraph could not transform <function KerasModel._create_gradient_fn.<locals>.apply_gradient_for_batch at 0x13de945e0> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: module 'gast' has no attribute 'Index'
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
WARNING: AutoGraph could not transform <function KerasModel._create_gradient_fn.<locals>.apply_gradient_for_batch at 0x13de945e0> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: module 'gast' has no attribute 'Index'
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
training set score: {'roc_auc_score': 0.947299617865711}
test set score: {'roc_auc_score': 0.6867514307999948}

I will do some further testing and post an update if I find something. I hope this helps someone.

Cheers!

juliusgeo · April 29, 2022, 5:03am

Hi! I was wondering if you had any tips for the install script that I posted here, or if you could test it out on your machine: DeepChem M1 Mac Support

ignaczgerg · October 9, 2022, 12:16pm

Hi, sorry for the late response. I just checked the install script, nice work! I also tested it on my machine (M1 Pro, Monterey). The build finished with a few failed graph tests. Nonetheless, I think it is fine.