Protein structure prediction is only the beginning: AI is set to bring sweeping change to the life sciences


Over more than half a century, more than 50,000 human protein structures have been resolved, and about 17% of the amino acids in the human proteome have structural information. AlphaFold2's predicted structures raise this figure from 17% to 58%.

The revolution this brings to the life sciences will emerge gradually over the next several years to more than a decade.

◎ Staff reporter Cui Shuang

Protein structure prediction is an important "holy grail" of biology, and it is also one of the hottest research topics in the application of artificial intelligence to the life sciences.

Recently, good news came from TRFold, China's independently developed deep learning platform for protein folding prediction. On the protein test set of the 14th International Protein Structure Prediction Competition (CASP14), held in 2020, it ranked second in the world, behind only AlphaFold2. This is the best result among all publicly reported protein structure prediction models in China, and it places China's performance in computational biology in the world's first echelon. In 2018, AlphaFold marked the entry of artificial intelligence into the competition, and AlphaFold2 has since reached accuracy comparable to structural biology experiments, a major advance on a long-standing prediction problem in computational biology.

What kind of change will artificial intelligence bring to the life sciences? Will protein structure prediction, one of the ultimate open problems of biology, be completely solved by artificial intelligence?

Deep learning can be widely applied in computational biology

Protein structure prediction is a long-standing problem in the life sciences, known for its difficulty, high cost, and slow progress. But this problem, which people thought would take a century of slow exploration, has seen major breakthroughs in recent years. In the 2020 CASP14 competition, AlphaFold2, developed by Google's DeepMind, achieved a very high total score (GDT) on the 100-point scale. In other words, it produced protein structure predictions with almost the accuracy of laboratory methods.
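As a rough illustration of what a GDT score measures, the Python sketch below computes a simplified GDT_TS-style value: the average, over distance cutoffs of 1, 2, 4, and 8 angstroms, of the percentage of residues whose predicted position lies within that cutoff of the experimentally determined position. It assumes the two structures are already superimposed and uses made-up coordinates. The official CASP evaluation searches over many superpositions, so this is a sketch under stated assumptions, not the competition's scoring code. The function name `gdt_ts` is hypothetical.

```python
import math


def gdt_ts(predicted, reference, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS on a 0-100 scale.

    Assumption: both coordinate lists are already superimposed and list
    one (x, y, z) point per residue, in the same residue order.
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("structures must be non-empty and of equal length")

    # Per-residue distance between predicted and reference positions.
    distances = [math.dist(p, r) for p, r in zip(predicted, reference)]

    # Fraction of residues within each cutoff, then averaged over cutoffs.
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in cutoffs
    ]
    return 100.0 * sum(fractions) / len(cutoffs)


# Made-up example: four residues displaced by 0.5, 1.5, 3.0 and 10.0 angstroms.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
predicted = [(0.0, 0.5, 0.0), (1.0, 1.5, 0.0), (2.0, 3.0, 0.0), (3.0, 10.0, 0.0)]
print(gdt_ts(predicted, reference))  # 56.25
```

A perfect prediction scores 100, which is why a score close to the top of the scale is read as near-experimental accuracy.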

This milestone left structural biologists feeling that what they had toiled over with electron microscopes worth 10 million US dollars, AlphaFold2 had actually accomplished. "In my opinion, this is the biggest contribution of artificial intelligence to science, and it is also one of the most important scientific breakthroughs of the 21st century," said Shi Yigong, a biophysicist and president of Westlake University, who was unsparing in his praise.

Why predict protein structure? Miao Hongjiang, head of the Tianyang protein folding project, explained to a Science and Technology Daily reporter: "Studying protein structure helps us understand the role of proteins, how proteins carry out their biological functions, and how proteins interact with each other and with non-protein molecules. This is very important for biology, medicine, and pharmacy."

There are three main traditional methods for determining protein structure: nuclear magnetic resonance, X-ray crystallography, and cryo-electron microscopy. But these methods often rely on extensive trial and error and expensive equipment, and each structure can take several years.

AlphaFold2, the latest result of applying artificial intelligence to protein structure prediction, can predict with high confidence, in far less time and sometimes in just a few minutes, protein structures that previously took years to obtain. "At first everyone was still joking that DeepMind had stolen the real experimental results. Only after seeing the paper and the open source code did I dare to believe that this had really happened," Miao Hongjiang said with a smile.

The remark indirectly shows how shocking AlphaFold2's prediction results were. "This opened the door to the broad application of artificial intelligence in computational biology and convinced people across the field that it can be widely used in this area. It was proven in a genuine double-blind experiment."

AI prediction results are on par with laboratory results

In 1994, John Moult launched the international protein structure prediction competition. Held every two years, it aims to attract experts from different fields, such as computer science and biophysics, to take part in deciphering the three-dimensional structure of proteins. In 2018, artificial intelligence formally joined the prediction of protein three-dimensional structure: AlphaFold made its debut and ranked first among the 98 participating teams.

Two years later, AlphaFold2 brought a real breakthrough. Using machine learning, it predicted the correct structure for almost all of the proteins, with about two-thirds of the predictions reaching the accuracy of structural biology experiments. In fact, over more than half a century, a total of more than 50,000 human protein structures have been resolved. About 17% of the amino acids in the human proteome have structural information, and AlphaFold2's predicted structures raise this figure from 17% to 58%. Because a large proportion of amino acids have no fixed structure, 58% is already close to the limit of what structure prediction can cover. The revolution this brings to the life sciences will emerge gradually over the next several years to more than a decade. Shi Yigong said that, taking the single protein as the unit, the three-dimensional structures of the individual proteins in the human proteome have basically all been predicted by AlphaFold2.

Overall, the prediction results are credible and quite accurate.

For structural biology, this is a disruptive breakthrough.

Some structures that had never been resolved before have now basically been predicted.

This will greatly improve people’s understanding of life processes.

For example, genetics has accumulated a great deal of data, but without knowing the protein structure, it is impossible to study how a mutation affects protein function. Now the specific location of each mutation in a human genetic disease can be examined in AlphaFold2's predicted structure, making it possible to infer how protein function is affected. As another example, the protein structures predicted by DeepMind include a large number of drug target proteins and key enzymes, and the predicted structures are sufficiently accurate.

This is extremely important for the pharmaceutical industry and is expected to provide a reliable foundation for drug design and drug optimization.

Single protein structure prediction is just the starting point

In July this year, DeepMind released the source code of AlphaFold2 and published a paper in Nature explaining its technical details. "This open-sourcing has made huge waves in the biology community. It means biologists are finally free of the constraints of advanced equipment. Only well-resourced institutions can afford such expensive equipment, but from now on small teams and even individual researchers can take part in protein research," said Xue Guirong, founder of Tianyang and a former associate professor of computer science at Shanghai Jiaotong University.

Miao Hongjiang believes that current single protein structure prediction is just a starting point. A series of problems remain unsolved, including more precise side-chain optimization, protein dynamics analysis, and the complexes that proteins form with their ligands (such as small molecules, DNA, RNA, polypeptides, and other proteins). The next focus will be to combine associations across the whole proteome with evolutionary analysis to build accurate pathways of protein-protein interaction.

Having an algorithm model is only the beginning, and it is still difficult to move forward. Xue Guirong was candid: "Computing power is a major constraint. AlphaFold2, for example, did a large amount of data distillation work. Its model was trained on 30% real data together with 70% distilled data, and behind that is enormous computing power."

Ample computing power can take protein structure prediction from single structures to interactions, from pairwise studies to large scale, and from microscopic structure to macroscopic systems. "There is a great deal of protein data in biology, such as gene sequences, probably several billion of them. But we only know the sequence, not the structure, and that is a huge information gap," Xue Guirong said.

"Proteins usually carry out the various functions that life requires in the form of complexes or groups. However, many protein complexes remain a mystery, and the interactions between proteins have not been identified. We need sufficient computing power to support the whole system, to carry out protein structure prediction, protein design, protein interaction research, drug research and development, and more, and to find new methods for the precise treatment of disease."

At the same time, in terms of data sources and applications, pharmaceutical companies, hospitals, and others also need to cooperate and coordinate.

"In the future, more pharmaceutical companies, institutions, and artificial intelligence companies will need to build this industry together. This is just the start," Xue Guirong said. (Editor: Wang Zhen, Chen Ji)