
22 October 2024
OpenAI Announces a New AI Model, Code-Named Strawberry, That Solves Difficult Problems Step by Step
The new model, OpenAI o1, also exceeds human PhD-level accuracy on physics, biology, and chemistry questions in the GPQA (Graduate-Level Google-Proof Q&A) benchmark. OpenAI has released an early version, OpenAI o1-preview, through ChatGPT and to trusted API users, which makes the model available for real-world testing while the company continues to improve it.
The new model is slower than GPT-4o, and OpenAI says it does not always perform better—in part because, unlike GPT-4o, it cannot search the web and it is not multimodal, meaning it cannot parse images or audio.
Mark Chen, vice president of research at OpenAI, demonstrated the new model to WIRED, using it to solve several problems that its prior model, GPT-4o, cannot. These included an advanced chemistry question and the following mind-bending mathematical puzzle: “A princess is as old as the prince will be when the princess is twice as old as the prince was when the princess’s age was half the sum of their present age. What is the age of the prince and princess?” (The correct answer is that the prince is 30, and the princess is 40).
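The puzzle's wording can be checked mechanically. The Python sketch below is an illustration added here, not something shown in the demonstration; the function name `satisfies_puzzle` and the search range are assumptions. It translates each clause of the riddle into arithmetic on the two present ages and tests whether the statement holds.

```python
from fractions import Fraction


def satisfies_puzzle(princess, prince):
    """Return True if the present ages make the riddle's statement true."""
    princess, prince = Fraction(princess), Fraction(prince)

    # Moment A (in the past): the princess's age was half the sum
    # of their present ages.
    years_ago = princess - (princess + prince) / 2
    prince_at_a = prince - years_ago

    # Moment B (in the future): the princess is twice as old as the
    # prince was at moment A.
    years_ahead = 2 * prince_at_a - princess
    prince_at_b = prince + years_ahead

    # The riddle: the princess is now as old as the prince will be at moment B.
    return princess == prince_at_b


# Brute-force search over whole-number ages (range chosen arbitrarily).
solutions = [
    (p, q)
    for p in range(1, 101)
    for q in range(1, p)
    if satisfies_puzzle(p, q)
]

print(satisfies_puzzle(40, 30))  # True
```

Working through the same algebra by hand gives 3 × princess = 4 × prince, so the statement alone fixes only the 4:3 ratio between the ages. The quoted answer of 40 and 30 is one pair that satisfies it, and every pair the search finds has the same ratio.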
OpenAI’s Chen says that the new reasoning approach developed by the company shows that advancing AI need not cost ungodly amounts of compute power. “One of the exciting things about the paradigm is we believe that it’ll allow us to ship intelligence cheaper,” he says, “and I think that really is the core mission of our company.”
To demonstrate the advances in OpenAI o1, OpenAI tested the model on a range of benchmarks, including competitive programming contests, math exams, and science questions. On the AIME, the qualifying exam for the USA Math Olympiad, GPT-4o solved only 12% of the problems on average. OpenAI o1 averaged 74% with a single sample per problem, 83% with consensus among 64 samples, and 93% when re-ranking 1,000 samples with a learned scoring function, a score that OpenAI says places it among the top 500 math students in the U.S.
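The article does not say how consensus among samples is computed. A common reading is a majority vote: sample several answers to the same problem and keep the most frequent one. The Python sketch below illustrates that idea under this assumption. The function names and the sample answers are hypothetical, and this is not OpenAI's evaluation code.

```python
from collections import Counter


def consensus_answer(samples):
    """Return the most frequent answer among the sampled answers."""
    if not samples:
        raise ValueError("need at least one sampled answer")
    answer, _count = Counter(samples).most_common(1)[0]
    return answer


def consensus_accuracy(samples_per_problem, correct_answers):
    """Fraction of problems whose majority-vote answer is correct."""
    hits = sum(
        consensus_answer(samples) == correct
        for samples, correct in zip(samples_per_problem, correct_answers)
    )
    return hits / len(correct_answers)


# Hypothetical sampled answers for three problems (illustrative data only).
sampled = [
    ["204", "204", "113"],  # majority is right
    ["25", "52", "52"],     # majority is right
    ["7", "8", "8"],        # majority is wrong
]
truth = ["204", "52", "7"]

print(consensus_accuracy(sampled, truth))
```

Voting helps when the correct answer is the single most likely output, even if any one sample is often wrong. That is one way a consensus score can exceed the single-sample average.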
Source: Wired, Marktechpost