Google sister company DeepMind releases massive database of AI-predicted protein shapes in human body and other organisms in ‘transformative’ breakthrough
London-based artificial intelligence company DeepMind says it has used its AlphaFold technology to predict the shapes of nearly every protein in the human body, as well as many others.
The breakthrough is expected to mean significant advances in understanding diseases and designing new drugs.
The information could be used to design crops resistant to climate change or enzymes that can break down plastic.
DeepMind, which was acquired by Google in 2014 and is now owned by Google parent Alphabet, said in December of last year that its AlphaFold AI system had successfully predicted the structure of proteins.
It has now released the source code for AlphaFold, along with a database of 350,000 newly predicted protein structures.
Along with most of the human body’s proteins, the database also includes structures found in 20 of the most widely studied organisms, including yeast, fruit flies and mice.
DeepMind said it plans to predict and release the structures of over 100 million additional proteins in the next few months, covering more or less all the proteins currently known to science.
“We believe that this work represents the most significant contribution AI has made to advancing the state of scientific knowledge to date, and is a great example of the kind of benefits AI can bring to society,” said DeepMind founder and chief executive Demis Hassabis.
Proteins are fundamental building blocks of an organism, and are in turn made of ribbons of amino acids that fold into particular shapes depending on the protein’s function.
Determining a protein’s shape can take weeks or months in a lab, but AlphaFold can predict it in a day or two.
DeepMind’s database of protein shapes is intended to make things easier for researchers, who won’t have to run the software on their own systems.
The current version of AlphaFold gives its predictions a confidence score indicating how close the shape is to the real thing.
Under this system, 36 percent of AlphaFold’s human proteins were given the highest score, indicating accuracy down to the level of individual atoms. At that level, the prediction can be used for drug development, Hassabis said.
To date, only 17 percent of the human body’s proteins have had their shapes identified in laboratories. If AlphaFold’s predictions are as accurate as DeepMind believes, the software has now more than doubled the number of known human protein shapes.
For an additional group, amounting to more than half of the human body’s proteins, AlphaFold predicted the shape with enough confidence for researchers to determine the protein’s function.
The rest of the predictions either have a lower degree of confidence or apply to proteins in the body that don’t have a structure until they bind with others, DeepMind said.
The company is currently releasing its tools and predictions for free. It declined to comment on whether it plans to make money from them in the future, but didn’t rule out the possibility.
To operate the database, the firm is working with the European Molecular Biology Laboratory (EMBL), which already hosts a large database of protein information.
“This will be transformative for our understanding of how life works,” said Prof Edith Heard of EMBL. “The applications are limited only by our understanding.”