Proposal for Updating DeepChem Code Style to Follow 4-space Indents

Background

DeepChem codebase currently uses 2 spaces for indents.

Proposal

Using/Migrating to 4 spaces for indents.

Pros for using 4 space for indent

  • Using 4 spaces for an indent will make DeepChem’s code consistent with the rest of the python community. This will largely help new developers to get started easily with DeepChem’s style guide.
  • When we are using code from another source, we need not reformat it to follow a 2 space indent style.
  • PEP 8 guidelines suggest using 4 spaces for indentation.

Cons for using 4 space for indent

  • Not any I know of.

Migration Steps

Here is a brief overview of steps which can be taken to migrate the code for following 4-space indents:

  • Setup a timeline for migration ~2-3 months
  • When writing a new file, use 4-space indents for new files.
  • For modifying an existing file,
    • We can update as and when each file is modified by a developer during additions/deletions of code, performing bug fixes and other improvements to code (as was done during the yapf version update to 0.32)
    • Chart out phases of updates for large and critical pieces of code, for example files containing the base classes which are important for the code organisation of DeepChem structure.
  • Catch inconsistencies in files during CI workflow (black/yapf can be used to detect inconsistency).

Converting 2-indent to 4-indent spaces:

There are a couple of ways in which a file using 2-space indent can be modified to user 4 space indent:

  1. Yapf - Example: yapf --style='{indent_width: 4}' -i foo.py
  2. Reindent.py - Python’s source code provides a script reindent.py which can be run on files to renindent them to 4 spaces.
  3. Black - Black is an auto-formatting tool, from the Python Software Foundation. We can run black on files to reindent them.

Black is a good choice for using as it enforces strict consistency with PEP-8 guidelines. In the community, Pytorch geometric, Pytorch and Google uses Google style code-formatting implemented via yapf whereas Facebook, Dropbox, Mozilla, Quora use Black to style their code. If we wish to stick with google style formatting, we can use reindent.py/yapf otherwise for sticking to the broader python community, we can use Black for formatting.

Expected Disruption

For a short period of time (during the transition period), the code formatting in the repo across the files will not be consistent but at any time, during the migration process, we will have the style in a file consistent.

Other Thoughts

Python does not require all code in a repository/file to follow the same indentation style. For example, this is a valid piece of code:

def foo():
    print ("Foo has 4 space for indent")

def bar():
  print ("Bar has 2 space for indent")

if __name__ == "__main__":
      print ("Main has 6 spaces of indent")
      foo()
     bar()

But though there are some exceptions, like

 def perm(l):                       # error: first line indented
for i in range(len(l)):             # error: not indented
    s = l[:i] + l[i+1:]
        p = perm(l[:i] + l[i+1:])   # error: unexpected indent
        for x in p:
                r.append(l[i:i+1] + x)
            return r                # error: inconsistent dedent

(source: lexical analysis from python docs).

Indentation is also rejected as inconsistent if a source file mixes tabs and spaces. More on python indentation can be found here.

1 Like

Great proposal! One additional con I would add is:

  • While migration is happening, PRs could be very noisy (every line in a file would change because of indent). This could make reviews harder, so if we decide to proceed we may want to reindent important files early and upfront so future PRs don’t have too much noise

Let’s discuss and get feedback on the developer calls to the proposal after other folks have time to review and comment here

1 Like

When viewing diffs on Github, there’s an option to ignore whitespace changes.

2 Likes