As the DeepChem community continues to grow, we need to make it easier to onboard new reviewers. For the last several years, I’ve personally reviewed essentially all DeepChem PRs, which isn’t a sustainable strategy, so I will try to write down the set of checks I do here:
- CI Correctness: Does the PR pass sufficient CI checks? Check other PRs to see what the current stability status of the CI is. If new tests are breaking, something is probably wrong and the PR shouldn’t be merged in.
- Style check: The code should pass lint/flake8 checks. It should match DeepChem naming conventions (CamelCase for classes and snake_style for functions). Code should be 2-spaced.
- Documentation: Documentation should be in numpydoc style. All functions must be documented and have a usage example. New functions or docs should be added to the
docs/
. Make sure doctests pass!
There is a secondary set of considerations to use when evaluating code:
- Don’t break backwards compatibility!: We have worked very hard over the last few years to bring DeepChem up to production stability. Don’t make changes which break backwards compatibility without thinking very hard. If you must, then plan a deprecation cycle which is measured in multiple major releases.
- Is the code DeepChemic?: All code in DeepChem should follow the broad DeepChemic design principles (see Writing DeepChemic Code). Follow the existing classes and use standard data structures.
- Does the code fit with our mission?: Our broad goal is to build a better tool for AI powered science (see Making DeepChem a Better Framework for AI-Driven Science). Does your change work towards this goal? Take a broad minded view here though (few PRs should be rejected since they are out of scope)
- Avoid Adding Dependencies: To the degree possible, avoid adding new dependencies to DeepChem. Each dependency increases instability of DeepChem CI by adding a new source of potential error. If you must add a dependency, make it a soft (non-required) dependency rather than a hard dependency.
- Avoid expanding namespace: Adding a top-level module to deepchem (
dc.X
) should only be done after community discussion and should have broad-buy-in from DeepChem maintainers. Adding a second-level module is mostly fine (dc.X.Y
) if it is justified in the review. - Don’t be afraid of multiple rounds of reviews: Larger features often will require multiple rounds of reviewers, possibly by multiple DeepChem maintainers. Code that touches critical DeepChem datastructures (
dc.data.DiskDataset
,dc.data.GraphData
, etc) should require extra review to avoid breaking code. The CI is a safeguard but is not perfect.