AlphaFold can accurately predict 3D models of protein structures and has the potential to accelerate research in every field of biology.
Building blocks of life
Inside every cell in your body, billions of tiny molecular machines called proteins are hard at work. They’re what allow your eyes to detect light, your neurons to fire, and the ‘instructions’ in your DNA to be read, which make you the unique person you are.
Currently, there are around 200 million known proteins, with another 30 million found every year. Each one has a unique 3D shape that determines how it works and what it does.
But figuring out the exact structure of a protein remains an expensive and often time-consuming process, meaning we only know the exact 3D structure of a tiny fraction of the proteins known to science.
Finding a way to close this rapidly expanding gap and predict the structure of millions of unknown proteins could not only help us tackle disease and more quickly find new medicines but perhaps also unlock the mysteries of how life itself works.
The protein folding problem
If you could unravel a protein you would see that it’s like a string of beads made of a sequence of different chemicals known as amino acids.
These sequences are assembled according to the genetic instructions of an organism's DNA.
Attraction and repulsion between the 20 different types of amino acids cause the string to fold in a feat of ‘spontaneous origami’, forming the intricate curls, loops, and pleats of a protein’s 3D structure.
For decades, scientists have been trying to find a method to reliably determine a protein’s structure just from its sequence of amino acids.
This grand scientific challenge is known as the protein folding problem.
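The "string of beads" picture can be made concrete with a few lines of Python. This is a minimal sketch and not part of AlphaFold: it assumes the conventional one-letter codes for the 20 standard amino acids, and the function names and the example sequence are hypothetical, chosen only for illustration. It shows the kind of input a structure predictor starts from: a plain string of letters.

```python
from typing import Dict

# One-letter codes for the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def validate_sequence(sequence: str) -> str:
    """Return the sequence upper-cased, or raise ValueError on unknown letters."""
    seq = sequence.strip().upper()
    unknown = sorted(set(seq) - set(AMINO_ACIDS))
    if unknown:
        raise ValueError(f"unknown residue codes: {unknown}")
    return seq


def composition(sequence: str) -> Dict[str, int]:
    """Count how often each amino acid occurs in the sequence."""
    seq = validate_sequence(sequence)
    return {aa: seq.count(aa) for aa in AMINO_ACIDS if aa in seq}


# Hypothetical 10-residue fragment, for illustration only.
example = "MKTAYIAKQR"
print(len(validate_sequence(example)))
print(composition(example))
```

The sequence alone carries no 3D coordinates. Getting from this string to the folded shape is exactly the gap the protein folding problem describes.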
What is AlphaFold?
We started working on this challenge in 2016 and have since created an AI system known as AlphaFold.
It was taught by showing it the sequences and structures of around 100,000 known proteins.
Our latest version can now make accurate predictions of what shape a protein will form based on its sequence of amino acids.
This is a significant breakthrough and highlights the impact AI can have on science.
Joining a global research community
In 1994, scientists interested in protein folding formed CASP (Critical Assessment of protein Structure Prediction).
CASP is a community forum that allows researchers to share progress on the protein folding problem. The community also organises a biennial challenge for research groups to test the accuracy of their predictions against real experimental data.
Teams are given a selection of amino acid sequences for proteins which have had their exact 3D shape mapped but have not yet been released into the public domain. Groups must submit their best predictions to see how close they are to the subsequently revealed structures.
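To give a feel for what "how close" can mean, the sketch below compares a predicted set of atom positions with an experimental one. It is an illustration under stated assumptions, not CASP's scoring procedure: CASP's assessment relies on more elaborate measures such as GDT_TS, and this code assumes the two structures have already been superimposed and list the same atoms in the same order. The function names and coordinates are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(predicted: Sequence[Point], experimental: Sequence[Point]) -> float:
    """Root-mean-square deviation between two pre-aligned coordinate lists."""
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = sum(math.dist(p, e) ** 2 for p, e in zip(predicted, experimental))
    return math.sqrt(total / len(predicted))


def fraction_within(
    predicted: Sequence[Point], experimental: Sequence[Point], cutoff: float
) -> float:
    """Fraction of positions whose prediction lies within `cutoff` of experiment."""
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("need two non-empty coordinate lists of equal length")
    close = sum(1 for p, e in zip(predicted, experimental) if math.dist(p, e) <= cutoff)
    return close / len(predicted)


# Toy coordinates (in angstroms), invented for illustration only.
experimental = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
predicted = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (2.0, 0.0, 2.0)]

print(round(rmsd(predicted, experimental), 3))
print(fraction_within(predicted, experimental, cutoff=1.0))
```

A lower RMSD and a higher fraction within the cutoff both indicate a prediction that sits closer to the experimentally determined structure.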
Among the teams that participated in CASP13 (2018), AlphaFold placed first in the protein structure prediction challenge. At CASP14 (2020), we presented our latest version of AlphaFold, which has now reached a level of accuracy considered to solve the protein structure prediction problem.
Our work builds upon decades of research by CASP’s organisers and the protein folding community, and we’re indebted to the countless people who have contributed protein structures over the years, making such rigorous evaluations possible.
AlphaFold: The making of a scientific breakthrough
When Covid-19 emerged, very little was known about it. But scientists around the world came together to find ways to tackle it.
SARS-CoV-2, the virus that causes Covid-19, is composed of about 30 kinds of proteins, and about ten of these were poorly understood.
Our research team used AlphaFold to predict the structures of six understudied proteins in the SARS-CoV-2 virus genome, in the hope that they might advance our understanding of the virus.
The structure of one of these proteins, known as ORF3a, was subsequently determined experimentally. And as part of CASP14, we demonstrated even more accurate predictions for ORF8, another SARS-CoV-2 protein.
These results offer a glimpse of how AI tools like AlphaFold could better prepare us for a future pandemic.
Accelerating scientific discovery
A system like AlphaFold that is able to accurately predict the structure of proteins could accelerate progress in many areas of research that are important for society.
For example, limited information on protein structures has been a major barrier to increasing our understanding of neglected tropical diseases like sleeping sickness (trypanosomiasis) and leishmaniasis, which impact the lives of millions of people and cause tens of thousands of deaths every year.
Limited structural information also holds back many fundamental research efforts. For example, it can take over $2.5bn and more than 10 years to develop a new drug. AlphaFold could contribute to better and more efficient drug discovery by identifying the structures of many human proteins involved in disease.
It could also help unlock new possibilities such as finding proteins and enzymes that break down industrial and plastic waste or efficiently capture carbon from the atmosphere.
There’s a lot more work to do before we can have a real impact in these areas and more, but the potential is enormous.
If AlphaFold may be relevant to your work, please submit a few lines about it to firstname.lastname@example.org. While our team won’t be able to respond to every enquiry, we’ll be in contact in cases where there’s scope for further exploration.
Looking to the future
Our research on AlphaFold continues, but our work so far – and the independent assessments from organisations like CASP – strengthens our hope that its predictions will soon help unlock new possibilities in biological research that will benefit society.
We’re excited about this next phase of AlphaFold’s journey and looking forward to continuing our work with the global scientific community to unlock the potential of the building blocks of life.