Facebook parent company Meta Platforms Inc. has developed an artificial-intelligence technique for predicting the structures of hundreds of millions of proteins. Researchers say it could accelerate the development of new medications and deepen scientists’ understanding of biology.
Meta AI, the research division of Meta, used the new AI-based computer tool, ESMFold, to build a public database of 617 million predicted proteins. Proteins are the fundamental components of life and of many treatments, and tissues, organs, and cells depend on them to function.
Protein-based medications are used to treat heart disease, certain cancers, and HIV, among other disorders, and many pharmaceutical companies have started using artificial intelligence to develop new drugs. Using AI to predict protein structures is expected to improve the effectiveness of current medications and drug candidates and to aid the discovery of compounds that might treat ailments for which there is still no viable treatment.
With ESMFold, Meta is squaring up against AlphaFold, another protein-prediction computer model, from DeepMind Technologies, a division of Alphabet Inc., the parent company of Google. DeepMind says the AlphaFold database holds 214 million predicted proteins that could hasten the drug discovery process.
According to Meta, ESMFold is 60 times faster than AlphaFold but less accurate. The ESMFold database is bigger because its predictions were derived from previously unstudied genomic sequences.
Alexander Rives, a research scientist at Meta AI and a co-author of a paper published on Thursday in the journal Science, said that predicting a protein’s structure can help scientists understand its biological function. Meta posted the paper describing ESMFold on a preprint server in November 2022.
Proteins with similar structures frequently have related biological roles, Dr. Rives said. With a sufficiently high-resolution structure, he added, scientists can start considering what a protein’s true biological function is.
According to Meta, around a third of ESMFold’s protein predictions are made with high confidence.
For the past ten years, researchers have been attempting to predict protein structure and, from it, function. Determining protein structures has been challenging and expensive because proteins constantly fold and refold themselves before forming their final structure. The new AI algorithms are learning to predict protein shapes in hours or days, as opposed to the months or years it can take using microscopes that examine protein structures at the atomic level.
Meta researchers produced the predictions using a type of AI called a large language model, which can predict text from just a few letters or words. The same technique enables ChatGPT from OpenAI to produce human-like responses.
Meta scientists fed the ESMFold program sequences of letters representing the amino acids that make up proteins. The AI model then learned to fill in the missing or obscured parts of a sequence. Once it could complete sequences, ESMFold could learn the association between known protein sequences and structures that scientists already understand well, and use that association to predict the structures of new proteins.
According to Meta scientists, ESMFold’s power lies in how quickly it can predict protein structures, which enables researchers to search enormous genetic databases for potential applications in the fields of environment, food, health, and medicine.
Olexandr Isayev, a computational biologist at Carnegie Mellon University who was not involved in the project, said that the work is a significant accomplishment but that it relies heavily on earlier work.
One biotech executive said AlphaFold’s accuracy makes him prefer it to ESMFold. Chris Bahl, chief scientific officer and co-founder of AI Proteins, a Boston-based firm that uses artificial intelligence techniques to produce synthetic proteins, explained that compute isn’t the bottleneck and that what matters is accuracy, not speed.
According to Dr. Rives, a number of university research organizations and biotech companies already employ ESMFold.
Since its debut in 2022, the ESMFold model has been downloaded at a pace of about 250,000 times per month, with 1,000 protein structures predicted every hour, according to a Meta spokesman.
According to DeepMind, three million protein structures have been viewed in AlphaFold’s database since it was initially made available to researchers and biologists in 2021.
Protein language models like ESMFold are not yet as accurate as models like AlphaFold, a DeepMind representative said, adding that the company nevertheless expects the ESMFold database to contain accurate predictions in many instances.
Both DeepMind’s and Meta’s AI prediction models have their advantages and will lead to novel discoveries, said Andrew Ferguson, co-founder of the Chicago-based biotech Evozyne and an associate professor of molecular engineering at the University of Chicago.
Dr. Ferguson said that the Meta AI model “was a really elegant idea” and that the two models “are complementary.”
Evozyne partnered with the technology company Nvidia Corp. to create its own language model, one that can predict a protein’s biological function without first determining its structure. Evozyne later used this model to create two proteins, according to a paper posted in January on a preprint server.