Background
DeepChem codebase currently uses 2 spaces for indents.
Proposal
Using/Migrating to 4 spaces for indents.
Pros for using 4 space for indent
- Using 4 spaces for an indent will make DeepChem’s code consistent with the rest of the python community. This will largely help new developers to get started easily with DeepChem’s style guide.
- When we are using code from another source, we need not reformat it to follow a 2 space indent style.
- PEP 8 guidelines suggest using 4 spaces for indentation.
Cons for using 4 space for indent
- Not any I know of.
Migration Steps
Here is a brief overview of steps which can be taken to migrate the code for following 4-space indents:
- Setup a timeline for migration ~2-3 months
- When writing a new file, use 4-space indents for new files.
- For modifying an existing file,
- We can update as and when each file is modified by a developer during additions/deletions of code, performing bug fixes and other improvements to code (as was done during the yapf version update to 0.32)
- Chart out phases of updates for large and critical pieces of code, for example files containing the base classes which are important for the code organisation of DeepChem structure.
- Catch inconsistencies in files during CI workflow (black/yapf can be used to detect inconsistency).
Converting 2-indent to 4-indent spaces:
There are a couple of ways in which a file using 2-space indent can be modified to user 4 space indent:
- Yapf - Example:
yapf --style='{indent_width: 4}' -i foo.py
- Reindent.py - Python’s source code provides a script reindent.py which can be run on files to renindent them to 4 spaces.
- Black - Black is an auto-formatting tool, from the Python Software Foundation. We can run black on files to reindent them.
Black is a good choice for using as it enforces strict consistency with PEP-8 guidelines. In the community, Pytorch geometric, Pytorch and Google uses Google style code-formatting implemented via yapf
whereas Facebook, Dropbox, Mozilla, Quora use Black
to style their code. If we wish to stick with google style formatting, we can use reindent.py/yapf
otherwise for sticking to the broader python community, we can use Black
for formatting.
Expected Disruption
For a short period of time (during the transition period), the code formatting in the repo across the files will not be consistent but at any time, during the migration process, we will have the style in a file consistent.
Other Thoughts
Python does not require all code in a repository/file to follow the same indentation style. For example, this is a valid piece of code:
def foo():
print ("Foo has 4 space for indent")
def bar():
print ("Bar has 2 space for indent")
if __name__ == "__main__":
print ("Main has 6 spaces of indent")
foo()
bar()
But though there are some exceptions, like
def perm(l): # error: first line indented
for i in range(len(l)): # error: not indented
s = l[:i] + l[i+1:]
p = perm(l[:i] + l[i+1:]) # error: unexpected indent
for x in p:
r.append(l[i:i+1] + x)
return r # error: inconsistent dedent
(source: lexical analysis from python docs).
Indentation is also rejected as inconsistent if a source file mixes tabs and spaces. More on python indentation can be found here.