
DeepMind claims its AI performs better than International Mathematical Olympiad gold medalists

An AI system from Google DeepMind, Google's leading AI research lab, appears to have surpassed the average gold medalist at solving geometry problems in an international mathematics competition.

The system, AlphaGeometry2, is an improved version of AlphaGeometry, which DeepMind released last January. In a newly published study, the DeepMind researchers behind AlphaGeometry2 claim their AI can solve 84% of all geometry problems posed over the last 25 years at the International Mathematical Olympiad (IMO), a math contest for high school students.

Why does DeepMind care about a high-school-level math competition? Well, the lab thinks the key to more capable AI might lie in discovering new ways to solve challenging geometry problems, specifically Euclidean geometry problems.

Proving mathematical theorems, or logically explaining why a theorem (e.g. the Pythagorean theorem) is true, requires both reasoning and the ability to choose from a range of possible steps toward a solution. These problem-solving skills could, if DeepMind is right, turn out to be a useful component of future general-purpose AI models.

Indeed, last summer, DeepMind demoed a system that combined AlphaGeometry2 with AlphaProof, an AI model for formal math reasoning, to solve four out of six problems from the 2024 IMO. Beyond geometry problems, approaches like these could be extended to other areas of math and science, for example to aid with complex engineering calculations.

AlphaGeometry2 has several core elements, including a language model from Google's Gemini family of AI models and a “symbolic engine.” The Gemini model helps the symbolic engine, which uses mathematical rules to infer solutions to problems, arrive at feasible proofs for a given geometry theorem.

A typical geometry problem from an IMO exam. Image credits: Google

Olympiad geometry problems are based on diagrams that need “constructs” to be added before they can be solved, such as points, lines, or circles. AlphaGeometry2's Gemini model predicts which constructs might be useful to add to a diagram, which the engine references to make deductions.

Basically, AlphaGeometry2's Gemini model suggests steps and constructions in a formal mathematical language to the engine, which, following specific rules, checks these steps for logical consistency. A search algorithm allows AlphaGeometry2 to conduct multiple searches for solutions in parallel and store possibly useful findings in a common knowledge base.

AlphaGeometry2 considers a problem “solved” when it arrives at a proof that combines the Gemini model's suggestions with the symbolic engine's known principles.
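In rough outline, that propose-and-verify loop can be sketched in Python. This is a minimal illustrative sketch based only on the description above, not DeepMind's actual implementation; every function name here (`solve`, `toy_deduce`, `toy_propose`) is hypothetical.

```python
# Illustrative sketch of a neuro-symbolic proof-search loop, loosely modeled
# on the article's description of AlphaGeometry2. All names are hypothetical;
# DeepMind's real system is far more sophisticated (parallel search trees,
# a shared knowledge base, a formal geometry language, etc.).

def solve(problem, propose_constructions, symbolic_deduce, max_rounds=10):
    """Alternate between a model proposing auxiliary constructions and a
    rules-based symbolic engine deducing new facts, until the goal statement
    becomes derivable or the round budget runs out."""
    known_facts = set(problem["premises"])  # the shared knowledge base
    goal = problem["goal"]

    for _ in range(max_rounds):
        # Symbolic engine: apply deduction rules to the known facts.
        known_facts |= symbolic_deduce(known_facts)
        if goal in known_facts:
            return True  # proof found: goal follows from premises + constructions

        # Language model stand-in: suggest constructions (points, lines, circles).
        for construction in propose_constructions(known_facts, goal):
            known_facts.add(construction)
    return False


def toy_deduce(facts):
    """Toy rule set: from facts A and B together, derive C."""
    return {"C"} if {"A", "B"} <= facts else set()


def toy_propose(facts, goal):
    """Toy stand-in for the language model: always suggests construction B."""
    return ["B"]
```

With premises `{"A"}` and goal `"C"`, `solve` succeeds only because the proposed construction `"B"` unlocks the deduction, mirroring how an auxiliary point or line can make an Olympiad proof go through.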

Owing to the complexity of translating proofs into a format AI can understand, there is a dearth of usable geometry training data. So DeepMind created its own synthetic data to train AlphaGeometry2: more than 300 million theorems and proofs of varying complexity.

The DeepMind team selected 45 geometry problems from IMO competitions over the past 25 years (2000 to 2024), including linear equations and equations that require moving geometric objects around a plane. They then “translated” these into a larger set of 50 problems. (For technical reasons, some problems had to be split into two.)

According to the paper, AlphaGeometry2 solved 42 of the problems, clearing the average gold medalist score of 40.9.

Granted, there are limitations. A technical quirk prevents AlphaGeometry2 from solving problems with a variable number of points, nonlinear equations, and inequalities. And AlphaGeometry2 isn't technically the first AI system to reach gold-medal-level performance in geometry, although it is the first to do so on a problem set of this size.

AlphaGeometry2 also did worse on a set of harder IMO problems. For an added challenge, the DeepMind team selected problems, 29 in total, that had been nominated by math experts for IMO exams but that have not yet appeared in a competition. AlphaGeometry2 could only solve 20 of these.

Still, the test results are likely to fuel the debate over whether AI systems should be built on symbol manipulation, that is, manipulating symbols that represent knowledge using rules, or on the ostensibly more brain-like neural networks.

AlphaGeometry2 takes a hybrid approach: its Gemini model has a neural network architecture, while its symbolic engine is rules-based.

Proponents of neural network techniques argue that intelligent behavior, from speech recognition to image generation, can emerge from nothing more than massive amounts of data and computing. Unlike symbolic systems, which solve tasks by defining sets of symbol-manipulating rules dedicated to particular jobs, such as editing a line of text in word processor software, neural networks try to solve tasks through statistical approximation and learning from examples.

Neural networks are the cornerstone of powerful AI systems like OpenAI's o1 “reasoning” model. But, claim supporters of symbolic AI, they are not the be-all and end-all; symbolic AI might be better positioned to efficiently encode the world's knowledge, reason through complex scenarios, and “explain” how it arrived at an answer, these supporters argue.

“It is striking to see the contrast between continuing, spectacular progress on these kinds of benchmarks, while at the same time language models, including more recent ones with ‘reasoning,’ continue to struggle with some simple commonsense problems,” a university computer science professor specializing in AI told TechCrunch. “I don't think it's all smoke and mirrors, but it illustrates that we still don't really know what behavior to expect from the next system. These systems are likely to be very impactful, so we urgently need to understand them and the risks they pose much better.”

AlphaGeometry2 perhaps demonstrates that the two approaches, symbol manipulation and neural networks, are a promising path forward in the search for generalizable AI when combined. Indeed, according to the DeepMind paper, o1, which also has a neural network architecture, could not solve the IMO problems that AlphaGeometry2 was able to answer.

This may not be the case forever. In the paper, the DeepMind team said it found preliminary evidence that AlphaGeometry2's language model was capable of generating partial solutions to problems without the help of the symbolic engine.

“(The) results support ideas that large language models can be self-sufficient without depending on external tools (such as symbolic engines),” the DeepMind team wrote in the paper, “but until (model) speed is improved and hallucinations are completely resolved, the tools will stay essential for math applications.”
