
Molecular machines, a chromatin remodeler (pink and green at left) and RNA polymerase II (gray, yellow and blue at center), work together to read genomic information stored on compact DNA (white spool). Credit: Farnung Lab
For Lucas Farnung, nothing is more fascinating than how a fertilized egg cell transforms into a fully functioning human being. As a structural biologist, he studies this process at the smallest scale: the billions of atoms that must synchronize their work to make it happen.
“I don’t see a big difference between solving a 5,000-piece puzzle and the research we do in my lab,” says Farnung, an assistant professor of cell biology at the Blavatnik Institute at Harvard Medical School. “We’re trying to understand what this process looks like visually, and from there we can get a sense of how it works.”
Nearly every cell in the human body contains the same genetic material, but the type of tissue those cells become during development (for example, liver or skin) is largely determined by gene expression, that is, by which genes are turned on or off. Gene expression is regulated by a process called transcription, which is at the heart of Farnung’s work.
During transcription, molecular machines read the instructions contained in the genetic blueprint stored in DNA and create RNA, the molecule that carries out the instructions. Other molecular machines read the RNA and use this information to make proteins that power almost every activity in the body.
Farnung studies the structure and function of the molecular machines responsible for transcription.
In a conversation with Harvard Medicine News, Farnung spoke about his work and how machine learning is accelerating research in his field.
What is the central question your research seeks to answer?
I always say that we’re interested in the smallest logistical problem. The human genome is present in almost every cell, and if you stretched out the DNA that makes up the genome, it would be about six feet long. But that six-foot molecule has to fit into the nucleus of a cell, which is only a few microns across. It’s like taking a fishing line that stretches from Boston to New Haven, Connecticut, about 150 miles, and trying to fit it into a football.
To do this, our cells compact DNA into a structure called chromatin, but then the molecular machines can no longer access the genomic information on the DNA. This creates a conflict, because the DNA must be compact enough to fit into the nucleus of a cell, but the molecular machines must be able to access the genomic information on the DNA. We are particularly interested in visualizing the process by which a molecular machine called RNA polymerase II accesses the genomic information and transcribes the DNA into RNA.
What techniques do you use to visualize molecular machines?
Our general approach is to isolate molecular machines from cells and observe them using specific types of microscopes or X-ray beams. To do this, we introduce genetic material encoding a human molecular machine of interest into an insect or bacterial cell, so that the cell makes a large amount of that machine. Then we use purification techniques to separate the machine from the cell so that we can study it in isolation.
But it gets complicated, because often we’re not just looking at one molecular machine, which we also call a protein. There are thousands of proteins that interact with each other to regulate transcription, and so we have to repeat this process thousands of times to understand these protein interactions.
Artificial intelligence is beginning to make inroads into many facets of fundamental biology. Is this changing the way you do research in structural biology?
For the past 30 or 40 years, research in my field has been a tedious process. It would take a PhD student’s career to learn a little bit about a single protein, and it would take thousands of students’ careers to understand how proteins interact in a cell. However, in the last two or three years, we have increasingly turned to computational approaches to predict protein interactions.
The development of AlphaFold, a machine learning model from Google DeepMind that can predict protein folding, was a major breakthrough. Importantly, the way proteins fold determines their function and interactions. We are now using artificial intelligence to predict tens of thousands of protein interactions, many of which have never been described experimentally before. Not all of these interactions actually happen inside cells, but we can validate them through laboratory experiments.
This is very exciting because it really accelerates our science. When I look back on my PhD, the first three years were a failure: I couldn’t find any protein interactions. Now, with these computational predictions, a PhD student or postdoc in my lab can be pretty confident that a lab experiment that’s trying to validate a protein interaction is going to work. I call it molecular biology on steroids, but legal, because now we can get to the question we want to answer much faster.
Besides efficiency and speed, how is AI transforming your field?
One of the most exciting changes is that we can now, in an unbiased way, test any protein in the human body against any other protein to see if they might interact. Machine learning tools are disrupting our field much as personal computers disrupted society.
When I first started as a researcher, X-ray crystallography, a beautiful, high-resolution technique that can take many years, was used to reveal the structure of individual proteins. Then, during my PhD and postdoctoral studies, cryo-electron microscopy, or cryo-EM, came along, a technique that allows us to look at larger, more dynamic protein complexes at high resolution. Cryo-EM has led to many advances in our understanding of biology in the last 10 years and has accelerated drug development.
I thought I was lucky enough to be part of the so-called resolution revolution brought about by cryo-EM. But now I feel like machine learning for protein prediction is bringing a second revolution, which is just amazing to me and makes me wonder what other acceleration we are going to see.
I think we can do research five to ten times faster today than we could ten years ago. It will be interesting to see how machine learning transforms the way we do biological research over the next ten years. Of course, we have to be careful how we handle these tools, but I find it exciting to be able to make discoveries about problems I’ve been thinking about for a long time ten times faster.
What are the downstream applications of your work beyond the laboratory?
We are learning to understand how biology works in the human body, and understanding basic biological mechanisms can help us develop effective treatments for a variety of diseases. For example, it turns out that disruption of the DNA-chromatin structure by molecular machines is a major driver of many cancers. Once we understand the structure of these molecular machines, we can understand the effect of changing a few atoms to replicate the mutations that would lead to cancer, and we can then begin to design drugs that target proteins.
We have just started a project in collaboration with the HMS Therapeutic Initiative. This involves studying a chromatin remodeler, a protein that is highly mutated in prostate cancer. We recently obtained the structure of this protein and are performing virtual screens to see which chemicals bind to it. We hope to be able to design a compound that inhibits the protein and that has the potential to become a drug in its own right that could slow the progression of prostate cancer.
We also study proteins involved in neurodevelopmental disorders like autism. This is an area where machine learning can help us, because the tools we use to predict protein structures and protein interactions can also predict how small molecule compounds will bind to proteins.
Speaking of collaboration, how important is working across research fields and disciplines to your research?
Collaboration is very important to my research. The landscape of biology has become so complex, with so many different research niches, that it is impossible to understand everything. Collaboration allows us to bring together people with different expertise to work together on important biological problems, such as how molecular machines access the human genome.
We collaborate with other HMS researchers on many different levels. Sometimes we use our structural expertise to support the work of other labs. Other times, we have solved the structure of a certain protein, but we need to collaborate to understand the role of that protein in the broader cellular context. We also collaborate with labs using other types of molecular biology approaches. Collaboration is really fundamental to driving progress and better understanding of biology.
Citation: Q&A: How Machine Learning Is Propelling Structural Biology (2024, July 22) retrieved July 22, 2024 from https://phys.org/news/2024-07-qa-machine-propelling-biology.html