AI-driven biotech companies have recently made numerous announcements about the potential of extended language models (LLMs) in early drug discovery. In part two of a three-part series, Dr. Raminderpal Singh shares a use case of ChatGPT, which demonstrates how accessible LLMs have become to lab scientists.


In our previous article, we summarized the role and challenges of LLMs for early drug discovery. In this article, we provide a simple case example to download and practice with ChatGPT or other accessible LLM systems. This shows the power that LLMs offer to improve scientists’ daily tasks, despite their caveats and challenges. You can download all the source files to use them yourself1 – (see the Simple ChatGPT exercise). Thanks to Nina Truter2 for his support in creating this example. The example should work with any LLM but has been tested with ChatGPT.3
About the example
- Purpose of the example: To use measurements extracted from 10 articles on acarbose-treated mice to improve recommendations made from the results of the main study.
- Key expected outcomes of the example: Recommendation on dose, participants and measures based on the results of the main study4.5 and articles on acarbose-treated mice, with supporting data.
- Challenges encountered in implementing the example: creating prompts to accurately extract information to support recommendations, accurately describing the contents of multiple files and documents.
It is important to know that generally accessible LLM systems often share the information you provide, so it is recommended that you do not enter confidential information.
To help ChatGPT provide useful information, some “prompt engineering” is required. This is a technical term for best practices in writing prompts. As an example, the first prompt in this example is just to provide context and background for ChatGPT:
“You are a drug discovery scientist looking to make decisions about dose, participants, and measures when introducing an existing diabetes drug into the field of aging-related diseases. You have experimental results from a mouse study that show the effects of acarbose on lifespan, body weight, body composition, fat pads, glucose, grip strength, grip duration, rotarod, and pathology. You also have several relevant scientific publications with studies examining the effects of acarbose on various measures in mice. You now want to query the results of your study (which are in Excel files and images) and the publications separately for insights, and then together to come up with the best set of recommendations for your colleagues looking to conduct early clinical trials with acarbose in aging-related diseases. To do this, you will now process a series of specific user-entered ChatGPT prompts.”
The screenshot below shows the results of the last prompt. ChatGPT missed some nuances. For example, female mice don’t live as long as male mice, but their physical measurements are improved. Improved prompts will help generate more nuanced results.
Figure by Dr. Raminderpal Singh, illustrating rapid results.
Please comment below to share your findings from this example. Let us know if you managed to improve the result and if so, how?
The next article in this series, published on Monday, July 24, will address the main challenges in effectively using LLMs for early drug discovery and present some practical approaches to address them.
The references
1 Lecture. HitchhikersAI.org. Available at: https://www.hitchhikersai.org/reading
2 Nina Truter. LinkedIn. Available at: https://www.linkedin.com/in/nina-truter/
3 ChatGPT. Available on: https://chatgpt.com/
4 Alavez S, et alAcarbose improves health and lifespan in aging HET3 mice. Cellular aging. 18(2) (April 2019). Available at: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6413665/
5 Harrison DE, et al. ITP: Intervention Testing Program: Effects of various treatments on lifespan and associated phenotypes in genetically heterogeneous mice (UM-HET3) (2004-2023). Mouse Phenome Database. Available at: https://phenome.jax.org/projects/ITP1
About the Author
Dr. Raminderpal Singh
Dr. Raminderpal Singh is a recognized thought leader in the biotechnology industry. He has over 30 years of international experience leading and advising teams on building computational modeling systems that are both cost-effective and have significant intellectual property value. His passion is helping early-stage and growth-stage life sciences companies achieve novel biological breakthroughs through the effective use of computational modeling.
Raminderpal currently leads the open source community HitchhikersAI.org, which accelerates the adoption of AI technologies in early drug discovery. He is also CEO and co-founder of Incubate Bio, a biotech company providing a service to life sciences companies looking to accelerate their research and reduce their lab costs through in silico modeling.
Raminderpal has extensive experience building companies in Europe and the US. As a Business Director at IBM Research in New York, Dr Singh led the commercialization of IBM Watson Genomics Analytics. He was also Vice President and Head of the Microbiome Division at Eagle Genomics Ltd, Cambridge. Raminderpal received his PhD in semiconductor modeling in 1997. He has published several articles and two books and has twelve patents issued. In 2003, he was selected by EE Times as one of the 13 most influential people in the semiconductor industry.
For more: http://raminderpalsingh.com ; http://hitchhikersAI.org ; http://incubate.bio