Perhaps we could use GSoC as the opportunity to dedicate attention to the protein engineering side? Thinking of implementing papers like UniRep, or recent promising protein + transformer work. Just a shot-in-the-dark idea!
To add on one idea, I’d love to see better support for semiconductor design in DeepChem. This new paper uses graph neural networks to work towards designing new semiconductors:
I’ve been writing a series of articles about the semiconductor industry which might be a good source of background information if you’re interested in this topic: https://deepforest.substack.com/
I’ve discussed this with a few people, but one idea would be to create a Model Hub like the one Hugging Face provides: https://huggingface.co/models
A scientific deep learning model hub would be very powerful I think. It could be “seeded” with a few of DeepChem’s most popular models and then users could upload new models and use them for transfer learning. This could be integrated with MoleculeNet later on as well, so models could be automatically evaluated across MoleculeNet tasks. It also leverages DeepChem’s positioning across multiple sciences, since I don’t really know of any other projects that are well-suited for this kind of hub.
The official wiki is up at https://wiki.openchemistry.org/GSoC_Ideas_2021#DeepChem_Project_Ideas. I’ve ported over some of these ideas to the wiki!
This is a really cool idea but might be a little too complex to pull off within GSoC. The new GSoC program is shortened (about a month and change for students) and this project is probably too hard. I’m really interested in making this happen though!
I’m happy to announce that DeepChem, through Open Chemistry, will be supported in Google Summer of Code! This means we can indeed invite student applications to GSoC.
We also need more mentors! The more mentors we have, the more students we can support. If you’re a scientist who uses DeepChem and who is willing to help mentor a summer student, please get in touch with me. You can email me at bharath@deepforestsci (add the .com).
Adding one more idea:
- DeepChem Retrosynthesis: DeepChem currently doesn’t have good support for retrosynthesis tooling. This project would involve improving MoleculeNet support for common retrosynthesis datasets. It should also pick and implement a good retrosynthesis model within DeepChem, perhaps leveraging https://github.com/ASKCOS/ASKCOS. ASKCOS is open source but under the MPL (a copyleft license), meaning we would have to be careful about license issues if we use it.
Hello @seyonec, I’m also interested in this topic. Do you have any recommended reading on transformers for protein sequences? Thanks!
Awesome! Here is a more general intro to machine learning and computational biochemistry: https://www.notion.so/Computational-Biochemistry-e57c4194c4234a898ecf2db36bb74015
I’ll share some more specific papers as well:
UniRep - https://www.nature.com/articles/s41592-019-0598-1
TAPE (protein transfer learning benchmark) - https://www.biorxiv.org/content/10.1101/676825v1
@bharath Tagging Bharath if he has any other recommended papers!
Hi @seyonec! I agree with you. Do you think that combining this with the EBI API (via HTTP requests) could help with models for protein sequences, and maybe be an idea for GSoC?
This is an awesome repository of papers! Thanks for sharing!
DeepChem and GSoC??!?
What could be a better way to spend part of a summer
In the last dev-call the idea of expanding DeepChem’s hyperparameter tuning through Ray came up as a possible GSoC project, and I would love to participate in this as a mentor.
I have been working with Ray-Tune extensively for my day job, and it feels like a soberly spec-ed implementation may well be within the scope of a GSoC project…
… and I would also say that hyperparameters are really nice educationally, due to their wide applicability and the degree to which novel maths pops up.
One modest goal that occurs to me would be to reach parity with the current HOpt (grid + Gaussian) and then expand with something a little more modern like HyperBand. I would be very happy to communicate with anyone else interested in these topics.
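To make the grid-search-to-HyperBand jump concrete: HyperBand is built on successive halving, which adaptively reallocates training budget to promising configurations instead of fully training every point on a grid. Below is a minimal pure-Python sketch of the idea; the `evaluate` function and all names are illustrative toy stand-ins, not DeepChem HOpt or Ray Tune APIs.

```python
import random

def successive_halving(configs, evaluate, min_budget=1, eta=2):
    """Toy successive halving: repeatedly evaluate all surviving configs
    at a growing budget, then keep only the best 1/eta fraction.

    `configs` is a list of hyperparameter dicts; `evaluate(config, budget)`
    returns a score (higher is better). Both are illustrative stand-ins.
    """
    budget = min_budget
    survivors = list(configs)
    while len(survivors) > 1:
        scored = [(evaluate(cfg, budget), cfg) for cfg in survivors]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        # Keep the top 1/eta fraction (at least one config survives).
        keep = max(1, len(survivors) // eta)
        survivors = [cfg for _, cfg in scored[:keep]]
        budget *= eta  # survivors earn more training budget next round
    return survivors[0]

# Toy objective: scores learning rates near 0.3 higher, with noise that
# shrinks as the budget grows (mimicking longer, more reliable training).
def toy_evaluate(config, budget):
    noise = random.gauss(0, 1.0 / budget)
    return -abs(config["lr"] - 0.3) + noise

random.seed(0)
grid = [{"lr": lr / 10} for lr in range(1, 9)]
best = successive_halving(grid, toy_evaluate)
print(best)
```

Full HyperBand then runs several such brackets with different trade-offs between the number of configs and the starting budget; Ray Tune ships schedulers implementing this, so a GSoC project could wire DeepChem's hyperparameter search to those rather than reimplementing the algorithm.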
The idea of adding functionality for hyperparameter tuning sounds really interesting, and I would like to learn more about this project and discuss it. Could you share some resources/plan, and also your email (or any other preferred communication channel)?
I am highly interested in implementing support for PyTorch Lightning, and I would like to learn more about this project and discuss it. Could you share some resources/plan to get started?
Can you join one of the developer calls? See the “Joining the DeepChem Developer Calls” thread.
At a high level, the goal of the project is to make a LightningModel class that can be used like TorchModel, enabling the construction of DeepChem models in PyTorch Lightning, along with tests, documentation, and a tutorial. Glad to chat more on the call!
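To sketch the shape of the proposal, here is a rough pure-Python illustration of the adapter idea. Every name below (`TorchModel`, `LightningModel`, the simplified method signatures) is a hypothetical stand-in, not the actual DeepChem or PyTorch Lightning API; a real implementation would subclass `pytorch_lightning.LightningModule` and delegate to DeepChem's existing `TorchModel` machinery.

```python
class TorchModel:
    """Stand-in for a DeepChem-style wrapper around a torch.nn.Module:
    holds the model callable and a loss function."""
    def __init__(self, model, loss):
        self.model = model
        self.loss = loss

class LightningModel:
    """Hypothetical adapter exposing a TorchModel through the two hooks a
    Lightning-style trainer drives: training_step and configure_optimizers
    (signatures heavily simplified here)."""
    def __init__(self, torch_model, make_optimizer):
        self.torch_model = torch_model
        self.make_optimizer = make_optimizer

    def training_step(self, batch):
        # Forward pass through the wrapped model, then compute the loss.
        inputs, labels = batch
        outputs = self.torch_model.model(inputs)
        return self.torch_model.loss(outputs, labels)

    def configure_optimizers(self):
        return self.make_optimizer(self.torch_model.model)

# Toy usage: a "model" that doubles its input, with a squared-error loss.
tm = TorchModel(model=lambda x: 2 * x,
                loss=lambda pred, y: (pred - y) ** 2)
lm = LightningModel(tm, make_optimizer=lambda m: "optimizer-placeholder")
loss = lm.training_step((3, 5))  # (2*3 - 5)**2 = 1
print(loss)
```

The payoff of such an adapter would be that anything driving the Lightning interface (its Trainer, loggers, multi-GPU support) could then drive DeepChem models without bespoke glue code.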
Hi @bharath, I have been actively practicing deep learning in PyTorch (personal bias: pytorch >>> tf) and liked the ideas of the Lightning implementation and protein language modelling; the semiconductor modeling support is also something I can’t yet get my head around. I would like to discuss these and get some planning and a timeline sorted.
What would the purpose of that class be? TorchModel and pytorch_lightning.LightningModule both perform roughly the same role. They wrap a torch.nn.Module and provide an API for training, logging, validation, etc. If you already have one of them, what would you gain by wrapping it in the other?
Hi @bharath, I tried to build the PyTorch Lightning class for torch models, and I would like to show you my work and discuss some questions.
Great question! I think the main argument is that PyTorch Lightning appears to be emerging as a new standard for model building in the PyTorch community, so it seems useful to have a wrapper. But you’re absolutely right that there’s overlap with TorchModel. Perhaps we should discuss the proper design on this week’s developer call?
Join our developer call for the week :). We can discuss design ideas there, hopefully with @peastman too.