Predictions of most human protein structures made freely available
A good understanding of a protein’s structure can shed crucial light on the mechanism of certain biological processes or provide a starting point for the development of a new drug. AlphaFold, a program from British artificial intelligence firm DeepMind, has cut the time it takes to predict a protein’s structure from months to minutes, with unmatched accuracy. Now a paper published July 22 in Nature reports that a collaboration between DeepMind and the European Molecular Biology Laboratory (EMBL) has created a publicly accessible database containing more than 350,000 predicted protein structures.
“This understanding means that we can be better equipped to unravel the molecular mechanisms of life and accelerate our efforts to protect and treat human health, as well as the health of our planet, and making this tool open access will accelerate the power of discovery of research and innovation for scientists around the world,” EMBL Director General Edith Heard told The Guardian.
The human proteome, that is, all the proteins that human DNA is known to encode, consists of around 20,000 proteins. Laboratory analysis has confirmed the structures of only about 17 percent of these molecules. Before the advent of modern neural networks and computer processors, computational predictions of structures were time-consuming and often inaccurate. DeepMind reports that the new database includes structures for 98.5 percent of the human proteome, predicted with a confident or highly confident degree of accuracy. Proteins from 20 model organisms, including Caenorhabditis elegans and Drosophila melanogaster, are also included in the database, bringing the total to 350,000 structures.
See “DeepMind AI speeds up the time to determine protein structures”
Last December, AlphaFold won the biennial Critical Assessment of protein Structure Prediction (CASP) competition, becoming the first program to exceed 90 percent accuracy. The program has already been a boon for some scientists who have used it in their research.
“It’s just the speed, the fact that it took us six months per structure and now it takes a few minutes. We couldn’t really have predicted it would happen so quickly,” structural biologist John McGeehan of the University of Portsmouth told the BBC. “When we first sent our seven sequences to the DeepMind team, two of them already had experimental structures. We were therefore able to test them on their return. It was one of those times, to be honest, where the hairs stood up on the back of my neck because the structures [AlphaFold] produced were identical.”
DeepMind says it will be able to expand the database from 350,000 structures to 130 million by the end of this year.
Beyond the exploration of existing proteins, Nature reports, access to this trove could also facilitate the development of synthetic proteins, because researchers could predict more reliably how they will interact with other proteins.
AlphaFold is not the only protein-folding program. RoseTTAFold, for example, was inspired by AlphaFold and builds on that technology to calculate information in different ways. It was released to the public last week, and its creators say they expect it to benefit from the new database.
“It’s fantastic that they made this available,” David Baker, one of the architects of RoseTTAFold, told Science. “It will really increase the pace of research.”