BBBP dataset offered by DeepChem (and MoleculeNet)

I have a question about the version of the blood–brain barrier penetration (BBBP) dataset offered by DeepChem (and MoleculeNet). I know it contains 2053 compounds. I also know that a more recent BBBP dataset was released three years ago, and that it is much larger (7807 compounds). My question is: why haven't the DeepChem developers replaced the old BBBP dataset with the newer one?

Thank you for the note! Yes, we should replace this dataset. We are working on ways to make it easier for the community to contribute updates to MoleculeNet. I hope to have some news to share in the next few months.

Thank you for your reply. You are welcome, and that’s good to hear!

I am just curious why it hasn't been added yet. Is it because the new dataset does not meet the quality standards required for DeepChem? Or are there other criteria it has not met to qualify for inclusion in the library? I ask because I checked it, and it seems to have some problematic data points.
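
For anyone who wants to run a similar sanity check, here is a minimal sketch of one way to do it. The two issues it looks for (blank structure entries, and the same SMILES string appearing with conflicting labels) are assumptions chosen for illustration, not findings about the actual dataset. The function name `find_problem_rows` and the toy rows are hypothetical.

```python
from collections import defaultdict


def find_problem_rows(rows):
    """Flag blank SMILES entries and SMILES strings with conflicting labels.

    rows: iterable of (smiles, label) pairs.
    Returns (blank_indices, conflicting_smiles).
    """
    blank_indices = []
    labels_by_smiles = defaultdict(set)
    for index, (smiles, label) in enumerate(rows):
        key = smiles.strip() if smiles else ""
        if not key:
            # No structure recorded for this row.
            blank_indices.append(index)
            continue
        labels_by_smiles[key].add(label)
    # A structure seen with more than one distinct label is a conflict.
    conflicting_smiles = sorted(
        s for s, labels in labels_by_smiles.items() if len(labels) > 1
    )
    return blank_indices, conflicting_smiles


# Toy data for illustration only; not taken from any real BBBP dataset.
toy_rows = [
    ("CCO", 1),
    ("CCO", 0),       # same string, different label
    ("c1ccccc1", 1),
    ("", 1),          # blank structure
    ("CCN", 0),
    ("CCN", 0),       # exact duplicate, same label, not a conflict
]

blank_indices, conflicting_smiles = find_problem_rows(toy_rows)
print("blank rows:", blank_indices)
print("conflicting SMILES:", conflicting_smiles)
```

This compares raw strings only, so two different SMILES spellings of the same molecule would not be matched. A more careful check would canonicalize the structures first, for example with RDKit.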