Model.evaluate() method throws an IndexError

Hi there, I am trying to evaluate the GraphConv Model using

metric = dc.metrics.Metric(dc.metrics.roc_auc_score, np.mean, mode="classification")
train_scores = model.evaluate(train_dataset, [metric])

but am getting an “IndexError: index 123 is out of bounds for axis 1 with size 2”.
I copied the code below. Does anybody have an idea how to fix that?

File "/Users/ingrid/.conda/envs/ml-compound-clustering/lib/python3.7/site-packages/deepchem/metrics/__init__.py", line 30, in to_one_hot
y_hot[np.arange(n_samples), y.astype(np.int64)] = 1
IndexError: index 123 is out of bounds for axis 1 with size 2

import numpy as np
import matplotlib.pyplot as plt
import deepchem as dc
from rdkit import Chem
from rdkit.Chem import Draw
import pandas as pd
from itertools import islice
from IPython.display import Image, display
from sklearn.model_selection import train_test_split
from deepchem.models import GraphConvModel
from import load_from_disk

# Load dataset

dataset_file = 'test.csv'


featurizer = dc.feat.ConvMolFeaturizer()

# Convert data into machine learning suitable data object

loader =["task1"], smiles_field="smiles", featurizer=featurizer)
dataset = loader.featurize(dataset_file)

# Initialize transformer

#transformer = dc.trans.NormalizationTransformer(transform_w=True, dataset=dataset)
#dataset = transformer.transform(dataset)

# Define train, validation and test data sets

splitter = dc.splits.ScaffoldSplitter('test.csv')
train_dataset, valid_dataset, test_dataset = splitter.train_valid_test_split(dataset)

model = GraphConvModel(n_tasks=1, mode='classification', batch_size=50, n_classes=125), nb_epoch=100)
metric = dc.metrics.Metric(dc.metrics.roc_auc_score, np.mean, mode="classification")
train_scores = model.evaluate(train_dataset, [metric])

Hmm, I think I’ve seen this error before. A couple of questions:

  • Is your model multiclass and not just binary? Multiclass classifiers had some errors in metric handling. I have a PR in review that fixes this, and I can push it through if this is your error.
  • Which version of deepchem are you on? If you’re on 2.3.0, I’d recommend updating to the nightly build (we’ve been refactoring a lot, and the new codebase is more stable. We need a couple of critical bugfixes before 2.4.0 though). You can do this with pip install --pre deepchem.

Thank you for your quick response! Yes, it’s multiclass and not binary, and I am on 2.3.0. I will delete the old version and install the nightly build with pip install --pre deepchem. I’ll let you know if that solves the problem.

Updating to the pre-release build of deepchem didn’t solve the problem. I’m still getting an IndexError. Is there an alternative way to do the model evaluation, for example with scikit-learn?

File "/Users/ingrid/.conda/envs/ml-compound-clustering/lib/python3.7/site-packages/deepchem/metrics/", line 30, in to_one_hot
y: np.ndarray
IndexError: index 123 is out of bounds for axis 1 with size 2

Ah yes, my apologies, this is due to our multiclass bug. Let me get that PR merged in this week and I’ll report back on this thread!
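To see where the IndexError comes from, here is a minimal NumPy-only sketch. It is not deepchem's actual implementation: the helper name `to_one_hot_sketch` is hypothetical, and the default of two classes is an assumption inferred from the traceback ("axis 1 with size 2"). Only the indexing line is taken from the traceback itself.

```python
import numpy as np


def to_one_hot_sketch(y, n_classes=2):
    """Hypothetical stand-in for a one-hot helper that assumes binary labels."""
    n_samples = y.shape[0]
    y_hot = np.zeros((n_samples, n_classes))
    # Same indexing pattern as the line shown in the traceback.
    y_hot[np.arange(n_samples), y.astype(np.int64)] = 1
    return y_hot


# Labels from a multiclass problem: 123 is a valid class among 125.
labels = np.array([0, 1, 123])

try:
    to_one_hot_sketch(labels)  # only 2 columns, so index 123 cannot fit
    error_message = None
except IndexError as err:
    error_message = str(err)
print(error_message)

# Telling the helper the real number of classes avoids the error.
one_hot = to_one_hot_sketch(labels, n_classes=125)
print(one_hot.shape)
```

If this is what happens inside the metric, the labels themselves are fine; the one-hot step is simply sized for a binary problem.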

In the meantime, you can evaluate explicitly with sklearn. The trickiest part is that sklearn's multiclass metrics expect class labels, so you first have to turn the predicted class probabilities into class predictions. Here’s a utility function pulled from the fix PR:

import numpy as np

def threshold_predictions(y, threshold=0.5):
  """Threshold predictions from classification model.

  Parameters
  ----------
  y: np.ndarray
    Must have shape `(N, n_classes)` and be class probabilities.
  threshold: float, optional (Default 0.5)
    The threshold probability for the positive class. Note that this
    threshold will only be applied for binary classifiers (where
    `n_classes==2`). If specified for multiclass problems, will be
    ignored.

  Returns
  -------
  y_out: np.ndarray
    Of shape `(N,)` with class predictions as integers ranging from 0
    to `n_classes-1`.
  """
  if not isinstance(y, np.ndarray) or not len(y.shape) == 2:
    raise ValueError("y must be a ndarray of shape (N, n_classes)")
  N = y.shape[0]
  n_classes = y.shape[1]
  if not np.allclose(np.sum(y, axis=1), np.ones(N)):
    raise ValueError(
        "y must be a class probability matrix with rows summing to 1.")
  if n_classes != 2:
    # Multiclass: pick the most probable class; the threshold is ignored.
    y_out = np.argmax(y, axis=1)
    return y_out
  else:
    # Binary: class 1 if its probability reaches the threshold, else 0.
    y_out = np.where(y[:, 1] >= threshold, np.ones(N), np.zeros(N))
    return y_out

With this in place you can do

import sklearn.metrics

y_pred = threshold_predictions(model.predict(valid_dataset))
y_true = valid_dataset.y
print(sklearn.metrics.accuracy_score(y_true, y_pred))
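One thing to watch for: the utility function only accepts a 2-D array. If your model's predictions come back with an extra task axis, i.e. shape `(N, n_tasks, n_classes)`, and the labels as `(N, n_tasks)`, you would need to select the task first. I have not verified the exact shapes for your setup, so treat the sketch below as an assumption-laden illustration. It uses synthetic arrays in place of `model.predict(...)` and the dataset's `y`, and computes accuracy with plain NumPy (the fraction of matching labels, which is what `sklearn.metrics.accuracy_score` returns by default).

```python
import numpy as np

# Synthetic stand-ins (assumptions): 4 samples, 1 task, 3 classes.
rng = np.random.default_rng(0)
raw = rng.random((4, 1, 3))
probs_3d = raw / raw.sum(axis=-1, keepdims=True)  # (N, n_tasks, n_classes)
y_true_2d = np.array([[0], [2], [1], [2]])        # (N, n_tasks)

# Select the single task so the arrays become (N, n_classes) and (N,).
task = 0
probs = probs_3d[:, task, :]
y_true = y_true_2d[:, task].astype(int)

# Multiclass case: the predicted class is the most probable one.
y_pred = np.argmax(probs, axis=1)

# Fraction of samples whose predicted class matches the true class.
accuracy = float(np.mean(y_pred == y_true))
print(y_pred.shape, accuracy)
```

With more than one task, the same slicing can be repeated per task and the scores averaged.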

I’ve just merged in the fix PR! If you install the nightly build of deepchem (pip install --pre deepchem), you should be able to use the latest version and give it a try. Let me know if you’re still running into errors. This was a big set of changes, and it’s possible there are still bugs.
