DeepChem GSoC 2025 Project Ideas

Google Summer of Code 2025 Proposal
Project: Improving New Drug Modality Support
Name: Ridwaan Salie
University: University of the People (Final Year)
Programming Languages: Python, SQL
Relevant Skills: AI & ML, Data Analytics

Personal Motivation & Background
I have always had a strong passion for medicine and healthcare, and I firmly believe that technology, particularly AI and data science, can shape the future of healthcare for the better. My academic journey has been deeply rooted in AI, ML, and Data Analytics, with a focus on how these fields can enhance healthcare systems.

Currently, I am in my final year at the University of the People, where I have developed strong programming and analytical skills. I also hold a Data Analytics Certificate from Explore AI and have a diploma in Computer Science. My interest in AI-driven healthcare solutions aligns perfectly with DeepChem's mission to advance computational chemistry and drug discovery through machine learning.

This project, which focuses on enhancing support for emerging drug modalities, resonates with my goal of using AI to drive innovation in medicine. By contributing to this project, I hope to expand DeepChem's capability to work with cutting-edge therapeutics, making it more accessible to researchers and biotech firms.

We are all aware that modern drug discovery is rapidly evolving with new modalities like PROTACs, antibody-drug conjugates, macrocycles, and oligonucleotides. These therapeutics offer greater precision in targeting diseases, but computational tools like DeepChem still lack adequate support for them.

This project aims to:
1.) Develop tutorials and examples showcasing how to work with these new drug modalities using DeepChem.
2.) Identify and process relevant datasets to improve DeepChem's modeling capabilities.
3.) Expand the tooling available in DeepChem for researchers and biotech startups.

By improving DeepChem's support for these next-generation therapeutics, this project will enhance AI-driven drug discovery, leading to faster and more efficient medical breakthroughs.

Why me?

First of all, I'm new to GSoC, and this would be a great learning opportunity that would broaden my skill set. I would also gain insights and mentorship from the assigned mentors. I am well-suited for this project because of my technical skills, healthcare knowledge, and strong motivation to bridge AI with medicine.

Technical Qualifications:

  1. AI & Machine Learning: Have some experience in deep learning, data preprocessing, and model development.

  2. Data Analytics: Familiar with dataset identification, cleaning, and processing for predictive modeling.

  3. Programming: Proficient in Python, with experience in libraries such as TensorFlow, PyTorch, and Pandas.

I am eager to collaborate with the DeepChem team, learn from experienced mentors, and contribute high-quality documentation and tutorials to help expand the community's understanding of emerging drug modalities.

Development Methodology
I will approach this project in the following way:

1.) Understanding New Drug Modalities & DeepChem's Current Support
1.1) Research PROTACs, antibody-drug conjugates, macrocycles, and oligonucleotides.
1.2) Identify how DeepChem currently handles molecular data and what needs improvement.
1.3) Engage with the DeepChem community and mentors to refine the project's scope.

2.) Building Tutorials & Code Examples
2.1) Develop beginner-friendly tutorials on emerging drug modalities.
2.2) Implement examples showcasing how to process and analyze these drugs using DeepChem.
2.3) Ensure all code follows best practices and is well-documented.

3.) Dataset Identification & Processing:
3.1) Find and curate datasets relevant to these drug modalities.
3.2) Preprocess data to make it compatible with DeepChem's framework.
3.3) Explore benchmarking AI models on these datasets.

4.) Testing & Documentation:
4.1) Run tests to validate tutorials and dataset integration.
4.2) Gather feedback from the DeepChem community.
4.3) Write comprehensive documentation for future users.

I am excited about the opportunity to contribute to DeepChem and help improve AI-driven drug discovery. This project perfectly aligns with my passion for medicine and technology, and I am eager to learn from the open-source community while making a meaningful impact. Thank you for considering my application. I look forward to working with the DeepChem team!

Hi, I am Jai, new to the community.
I'm interested in the "NumPy 2.0 upgrade", as I code in Python most of the time and have faced similar issues while running models on Colab. I believe I am well suited for it, and I would also learn a lot from the debugging experience.
I'm also open to working on "Improving New Drug Modality Support".
Wish to hear from you soon.

Hello. I am Krushna Jaybhaye. I am a recent graduate in Computer Engineering. I am particularly interested in applications of Machine Learning in the natural sciences.

For this year's GSoC, I am interested in "Implement a Wishlist Model". I have implemented various Deep Learning model architectures as part of personal projects and believe this to be a match for my skills. For now I am interested in implementing the "MXMNet" model (or its improved version, "PAMNet").

Other projects that I am interested in include "NumPy 2.0 upgrade" and "PyTorch Porting". I mostly code in Python and am quite comfortable solving issues related to version upgrades.

I appreciate any suggestions about approaching the proposal process.
Thank you.

Hello Dear DeepChem Team,

I am writing about my interest in contributing to DeepChem's "Conversion of Smiles to IUPAC and IUPAC to smiles" or "HuggingFace-style easy pretrained-model Load" ideas during Google Summer of Code. First, I would like to tell you a little about myself, and then explain how I can contribute. I would be thrilled to contribute to the work of the highly talented scientists and engineers there, with or without a GSoC award. I would also like to note the time commitment for each project: 175 hours for the IUPAC project and around 300 hours for the HuggingFace-style project.

However, I am applying for these projects not only to get into GSoC '25 but also to deepen my research knowledge of these topics, which I have worked on before. I would love to keep working after the GSoC timeline too, both out of interest and because I am looking for the right organisation.

I am in the final year of my B.Tech degree.

Now, about my qualifications:

  • For the "Conversion of Smiles to IUPAC and IUPAC to smiles" project: I qualified in JEE Main and JEE Advanced for undergraduate admission with a good All India Rank (AIR), and I earned an A in Physics and Chemistry and an A+ in Mathematics in my SSC (higher secondary school) exams. I have also always had a strong interest in this area of chemistry.

  • For the "HuggingFace-style easy pretrained-model Load" project: I recently built a recruiting platform where I used Hugging Face to simplify matching and integrated OpenAI for other functions in that platform, which I think will be helpful for this topic. My eagerness to learn new things and my ability to learn quickly will also be useful here and will help me pick up new technologies.

  • I have also tried to help with one of your issues, "Warning or error before saving a dataset to a directory with pre-existing files #3402", where I described an approach to solving it, as no code had been provided. I will link the issue here as well.

Plan for working on the "HuggingFace-style easy pretrained-model Load" project:

Week 1-2: Research & Metadata Design
Week 3-4: API Implementation
Week 5-6: Testing & Compatibility
Week 7-8: Optimization
Week 9-10: Documentation & Finalization

Plan for working on the "Conversion of Smiles to IUPAC and IUPAC to smiles" project:
Week 1-2: Research & Data Collection
Week 3-4: Initial Prototyping
Week 5-6: Optimization & Accuracy Improvement
Week 7-8: Final Testing & Documentation
Week 9: PR Submission & Refinements

I am very excited and eager to contribute to DeepChem, and I hope I can do quality work on this subject. I look forward to positive feedback from you.

As a new user I am not able to attach more than 2 links, so you can contact me anytime by email and I can provide all the necessary details there.

Gmail: nishantg1202@gmail.com

Yours sincerely,

Nishant Gupta

Dear DeepChem Team,

I hope you're doing well. I'm excited to contribute to DeepChem and would like to work on "Implement a Wishlist Model" and "Improve Equivariance Support."

For the Wishlist Model project, I'm particularly interested in implementing models like Hamiltonian/Lagrangian Neural Networks or Physics-Inspired Neural Operators (PINO) to strengthen DeepChem's physics capabilities.

For Equivariance Support, I'd love to explore tensor field networks and improve DeepChem's support for equivariant models in 3D molecular modeling. Given the importance of equivariance in deep learning, I believe this will be a valuable contribution.

I'd appreciate guidance on getting started, including any key resources or discussions to review.

Best regards,
Kapil

Hi DeepChem Team,

I'm excited to apply for GSoC 2025 under DeepChem, specifically for the project Improving Support for Drug Formulations. As a computer science student at Ashesi University with a strong background in machine learning, computational chemistry, and software development, I believe I can contribute meaningfully to this initiative. My experience includes working with deep learning frameworks like TensorFlow, Keras, and PyTorch, as well as cheminformatics tools like RDKit and DeepChem. I have applied ML techniques to various domains, including NLP, market basket analysis, and bioinformatics, and have a keen interest in leveraging AI for drug discovery. I'm particularly excited about developing a tutorial that introduces drug formulation concepts while integrating DeepChem-based computational approaches for formulation design. I would love to discuss how I can align my skills with the project's goals and if there are any preliminary contributions I can make before the application period. Looking forward to your guidance!

Best,
Brigidi Blay

Dear DeepChem GSoC Team,

I am writing to express my interest in your project task "Implement a Wishlist Model."

I'm a mathematician and physicist pursuing a PhD in Artificial Intelligence at IDEAS NCBR, a premier Polish research institute recognized for groundbreaking AI research, featuring top scientists with multiple ERC grants. I bring over two years of industry experience in machine learning, specializing in Fourier Neural Operators and physics-informed methods, supported by three published papers.

Relevant technical skills:

  • Python, scientific ML libraries, Git/GitHub
  • VS Code, Jupyter Notebooks, HPC Cluster Management
  • Deep Learning, Physics-Informed Neural Networks, Neural Operators, Diffusion Models

My background aligns precisely with your advanced project on Physics Inspired Neural Operators (PINO), positioning me to effectively handle its technical and computational challenges within DeepChem. I'm particularly excited about contributing to this task of implementing a Wishlist Model, enhancing DeepChem's capabilities.

Thank you for considering my application. I look forward to potentially contributing to DeepChem.

Best regards,

Hi DeepChem team,

I hope you're doing well. I'm really interested in applying for GSoC 2025 with DeepChem and have explored the project ideas. Two projects particularly stand out to me:

  1. PyTorch Porting – Given DeepChem's transition to PyTorch, I would love to contribute by porting models from TensorFlow to PyTorch. I have experience with both TensorFlow and PyTorch, and I'm confident I can help ensure compatibility and correctness while testing and debugging the models.

  2. Conversion of SMILES to IUPAC and vice versa – This project excites me because of its intersection with Python, NLP, and data processing. I'm looking forward to building tools for this conversion process, improving my skills in algorithm development and API creation.

In addition to my programming experience, I've also cracked the JEE (Joint Entrance Examination), which has given me a strong foundation in problem-solving and technical concepts. Furthermore, I have significant experience with Organic Chemistry, which I believe will be incredibly useful for this project, particularly when working with chemical data structures and understanding molecular representations.

Though I haven't contributed to DeepChem yet, I'm an active open-source contributor, primarily focusing on deep learning and machine learning frameworks. I'm eager to leverage my skills in coding, debugging, and integrating solutions to contribute effectively to your projects.

What are the key areas you would recommend I explore further before the application period? Also, are there specific aspects of the projects where you think I should focus my attention to ensure the best results?

Looking forward to hearing your thoughts!

Best,
Kunj Purvish Shah

Hi DeepChem Team,

I'm interested in applying for GSoC 2025 under DeepChem and have explored the project ideas. One idea that particularly interests me is:

Upgrading DeepChem to NumPy 2.0 – DeepChem currently runs on an older version of NumPy, and upgrading to 2.0 comes with some breaking changes. This project involves spotting and fixing compatibility issues and making sure the transition goes smoothly. Since I have experience working with Python, NumPy, and pandas, and I enjoy debugging and improving open-source projects, I'd love to take this on and contribute to the upgrade.
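
To illustrate what spotting compatibility issues could look like in practice, here is a minimal sketch (not DeepChem's actual tooling) that scans Python source for a few aliases removed in NumPy 2.0, such as `np.float_` and `np.NaN`. The helper name `find_removed_aliases`, the `REMOVED_ALIASES` table, and the sample snippet are hypothetical, and the table is only a small subset of the changes listed in the NumPy 2.0 migration guide.

```python
import ast

# Aliases removed in NumPy 2.0 and their replacements.
# Assumption: a small, non-exhaustive subset chosen for illustration.
REMOVED_ALIASES = {
    "float_": "float64",
    "complex_": "complex128",
    "NaN": "nan",
    "Inf": "inf",
    "product": "prod",
    "cumproduct": "cumprod",
    "sometrue": "any",
    "alltrue": "all",
}


def find_removed_aliases(source, numpy_names=("np", "numpy")):
    """Return sorted (line, old_name, suggested_name) tuples for removed aliases."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        # Match direct attribute access such as `np.float_` or `numpy.NaN`.
        if (
            isinstance(node, ast.Attribute)
            and isinstance(node.value, ast.Name)
            and node.value.id in numpy_names
            and node.attr in REMOVED_ALIASES
        ):
            findings.append((node.lineno, node.attr, REMOVED_ALIASES[node.attr]))
    return sorted(findings)


# Hypothetical snippet; it is only parsed, never executed,
# so NumPy does not need to be installed to run this check.
sample = """import numpy as np
x = np.array([1.0, np.NaN], dtype=np.float_)
total = np.product([2, 3])
ok = np.prod([2, 3])
"""

for line, old, new in find_removed_aliases(sample):
    print(f"line {line}: np.{old} -> np.{new}")
```

A static check like this only catches direct attribute access on the names `np` or `numpy`; running the test suite against NumPy 2.0 would still be needed to find behavioural changes.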

I am looking forward to contributing to DeepChem, and I've been working on Python projects, which has strengthened my problem-solving skills. I'm eager to dive into the codebase, learn, and deliver impactful contributions.

I'd appreciate any guidance and suggestions about the project. I've also started reviewing the project and contributing ahead of the application deadline.
Looking forward to your thoughts!

Sincerely,
Pratham

Hi, I'm really excited about the "Conversion of SMILES to IUPAC and IUPAC to SMILES" project and would love to take it on for GSoC!

I have a background in software development and AI, and I find the challenge of molecular representation super interesting. The idea of building a tool that makes these conversions seamless while contributing to the DeepChem ecosystem sounds like an awesome learning opportunity.

I'd love to hear more about any existing work in this area and any key challenges you see in this project. Looking forward to your thoughts and to working with the community!

Hi DeepChem Team,

I'm Jinfeng Huang, a chemist with a deep passion for using machine learning to make exciting advances in chemistry and biology. The project "Conversion of Smiles to IUPAC and IUPAC to smiles" caught my attention, and I believe it will benefit scientists from a broad range of fields once the APIs are launched.

I'm experienced in empowering small-molecule drug discovery with computational chemistry, developing and using computational methods/ML for structural biology, and investigating the pathological mechanisms of disease-related proteins with biophysics. I'm quite familiar with the SMILES format due to my contributions to the largest blood-brain barrier molecule dataset, B3DB (https://github.com/theochem/B3DB).

I'm really excited to have come across this opportunity and look forward to having the chance to work with your team.

With regards,

Jinfeng

Greetings DeepChem team,

I'm Assia, a computer science student with a big passion for machine learning and scientific computing. I recently discovered the beginner project "Layer Tutorials" from your GSoC 2025 idea list, and I'm truly excited about the chance to contribute!

I'm pretty comfortable with Python, have some experience using Jupyter Notebooks, and enjoy writing and explaining technical content. Right now, I'm delving deeper into deep learning and would be glad to help improve the DeepChem documentation and create simple tutorials for the community.

Could you share how I can get started? I'd really appreciate any suggestions or beginner issues I could work on while I prepare my proposal.

Looking forward to hearing from you!

Best,

E-mail: iiiassia.beniii@gmail.com
Github: https://github.com/tuba89

Dear DeepChem Team,
I'm Ashlee Chen, and I admire the work being done at DeepChem to advance open-source tools for computational biology and drug discovery. I am an undergraduate biomedical engineering student at Stevens Institute of Technology, taking part in drug development and discovery research under professor and PhD student mentorship, and I am keen on pursuing a career in this field. I am particularly drawn to two of the projects outlined above.

Having worked on drug development research where I used platforms like Schrodinger and Maestro to test protein models, I have developed a strong understanding of the role of innovative computational tools. As such, I am drawn to the Improving New Drug Modality Support project, where I can focus on emerging modalities such as PROTACs and macrocycles and contribute to creating tutorials and datasets while working with these technologies.
Moreover, I am fascinated by the Improving Support for Drug Formulations project because of the way it bridges the computational aspects of drug design with real-world, tangible patient outcomes. My research and engineering background has equipped me to tackle challenges in computational modeling and tutorial development, and I am passionate about turning these developments from theoretical models into practical, meaningful solutions that reach patients.

I am excited about the opportunity to collaborate with the DeepChem community and believe I have tremendous potential to grow and further my academic and personal pursuit in biomedical engineering and a career in drug development. Thank you for considering my application.

Best regards,
Ashlee Chen

Hi DeepChem Team,

I'm interested in applying for GSoC 2025 under DeepChem and would love to work on the SMILES ↔ IUPAC Conversion project. My background is in Electronics and Machine Learning, and I have a strong interest in cheminformatics and algorithm optimization.

I plan to leverage tools like RDKit and Open Babel, explore Transformer-based models for molecular name conversions, and ensure seamless integration within DeepChem. Additionally, I aim to implement a user-friendly API with comprehensive testing and documentation.

While I'm new to contributing to DeepChem, I have experience in ML, algorithm development, and open-source contributions. I'd love to hear your thoughts on how I can best align my skills with this project and any recommendations on where to start contributing.

Looking forward to your guidance!

Best regards,
Mohith Akshay

Dear DeepChem Team,

I hope you're doing well. My name is Abhishek IJ, a current BTech Information Science and Engineering student. I am excited about the opportunity to contribute to DeepChem through Google Summer of Code (GSoC) 2025, particularly to projects that integrate advanced AI techniques with innovative data solutions.

I have strong expertise in front-end development, C/C++, Python, AI, and Transformers. Additionally, I have completed projects on health disease prediction and adah prediction using AI, which have enhanced my practical understanding of applying machine learning to real-world challenges.

I would appreciate any guidance on the next steps or prerequisites for contributing to DeepChem. I look forward to your response!

Best regards,
Abhishek IJ

DeepChem team,

I hope you're doing well.

My name is Sunaina, and I am excited about the opportunity to contribute to DeepChem through Google Summer of Code (GSoC) 2025. With a strong background in Artificial Intelligence and Machine Learning, I am particularly interested in the "Improving New Drug Modality Support" project.

I have a solid foundation in computational chemistry, having cleared IIT JAM 2023 in Chemistry, which enables me to understand complex molecular structures and their properties. Additionally, I have worked on a disease prediction project using AI, which is available on my GitHub (https://github.com/Sunaina12-tech).
I would love the opportunity to discuss this further and align my contributions with the project's goals. Please let me know a convenient time for a discussion or any additional steps I should take to get started.

Looking forward to your guidance.

Best regards,
Sunaina

I recently came across the project "Improving Support for Drug Formulations", and I am very interested in contributing. As a Chemical Engineering student, I have a strong background in pharmaceutical formulations, computational modeling, and chemical process design.

I am particularly excited about this project because drug formulations play a crucial role in making medicines accessible and effective. Additionally, I have experience with scientific computing and data analysis, and I am eager to apply computational tools like DeepChem to enhance drug formulation research.

I would love the opportunity to discuss how my skills and enthusiasm align with the project's goals.

Hello DeepChem team,

I'm Kanak Jaiswal, a graduate student specializing in Artificial Intelligence, with a strong background in Machine Learning. I'm excited about the opportunity to contribute to the NumPy 2.0 Upgrade project for Google Summer of Code (GSoC) 2025.

The prospect of tackling the NumPy 2.0 upgrade and resolving any compatibility issues really excites me. I see it as a fantastic opportunity to deepen my debugging skills and get hands-on experience with version control and software maintenance in an open-source environment. I'm confident that this project will not only help me grow but also contribute to the long-term stability and success of DeepChem.

I'd love to discuss this project further and see how my skills might align with the goals. Please let me know if there's anything else I should do to get started.

Looking forward to hearing from you!

Hi DeepChem team,

I'm interested in applying for GSoC 2025 under DeepChem and have explored the project ideas. Two ideas particularly interest me:

  1. Implementing a Wishlist Model – I'd love to work on integrating a new model into DeepChem. Implementing such models aligns with my deep learning background and interest in applying ML to scientific computing.
  2. PyTorch Porting – Given DeepChem's shift towards PyTorch, I'm also interested in porting Chemception to PyTorch and ensuring full compatibility with DeepChem's ecosystem. I have prior experience working with deep learning frameworks like PyTorch and TensorFlow, so this would be an exciting challenge.

While I have not contributed to DeepChem yet, I have been an active open-source contributor, particularly in sktime, where I implemented the BoxCoxBiasAdjustedForecaster and am currently adding a deep learning-based forecasting model. I have experience debugging complex ML issues, writing clean and maintainable code, and ensuring proper integration of my implementations.

I'd love to hear your thoughts on how I can best align my skills with the project's needs. Additionally, do you have any recommendations on specific areas I should explore or contributions I can make before the application period?

Looking forward to your guidance!

Best,
Yashkumar Chandwani

I am Nnaemeka Anyadike, a pharmacology major, and I have also taken courses and training in Machine Learning and Artificial Intelligence (ML/AI). I have over 4 years of experience in ML/AI and I am interested in contributing to the layer tutorials.
