Prepare Protein-Ligand Complexes for Featurization

fabiooliveira-at-uib · August 6, 2020, 8:49am

Hello!
I am trying to construct a new data set of protein-ligand complexes labeled with affinity data. I see that DeepChem provides tools to featurize the structural information, but how do I prepare the complexes to be featurized? It would be great if you could share what pipelines and open-source tools you use to clean the protein complexes before featurization.

For example, how do I go from a PDB file downloaded from the PDB to a PDB file with the HET residues I am not interested in removed and without the crystallization artifacts?

Thank you!

peastman · August 6, 2020, 4:56pm

Try PDBFixer. It can deal with a lot of the issues that come up in downloaded PDB files. Some manual curation is still needed though. Software can automate a lot of the cleanup, but there’s no substitute for looking at the structure and making sure everything is ok.
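
For the specific step of keeping one ligand while dropping waters and other HET groups, a small filter over the fixed-width PDB records is sometimes enough before handing the file to a tool like PDBFixer. The following is only a minimal sketch using the Python standard library: `filter_heteroatoms` and the residue name `LIG` are hypothetical names chosen for illustration, not part of PDBFixer or DeepChem. It assumes a single-model file with standard column layout, and it will also drop modified residues stored as HETATM (for example MSE) unless you list them.

```python
def _conect_serials(line):
    """Return the atom serials named in a CONECT record (5-character fields)."""
    body = line[6:].rstrip("\n")
    fields = (body[i:i + 5].strip() for i in range(0, len(body), 5))
    return {f for f in fields if f}


def filter_heteroatoms(pdb_lines, keep_resnames=()):
    """Drop HETATM records whose residue name is not in keep_resnames.

    ANISOU records of dropped atoms, and CONECT records that mention a
    dropped atom, are removed as well. All other records are passed through
    unchanged (header records such as HET/HETNAM are NOT cleaned up).
    """
    keep = {name.upper() for name in keep_resnames}
    removed = set()  # serial numbers (as strings) of dropped atoms
    out = []
    for line in pdb_lines:
        record = line[:6].strip()
        if record == "HETATM":
            resname = line[17:20].strip().upper()  # columns 18-20
            if resname not in keep:
                removed.add(line[6:11].strip())    # columns 7-11
                continue
        elif record == "ANISOU" and line[6:11].strip() in removed:
            continue
        elif record == "CONECT" and _conect_serials(line) & removed:
            continue
        out.append(line)
    return out


# Example usage (file names are placeholders):
# with open("complex.pdb") as src:
#     cleaned = filter_heteroatoms(src.readlines(), keep_resnames=["LIG"])
# with open("complex_clean.pdb", "w") as dst:
#     dst.writelines(cleaned)
```

Filtering by an explicit keep-list rather than a list of known artifacts means an unexpected buffer or cryoprotectant molecule is dropped by default, but it is still worth opening the result in a viewer to check that the ligand and binding site look right.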

fabiooliveira-at-uib · August 7, 2020, 10:00am

Thank you for the suggestion!