With o3 having reached AGI, OpenAI turns its sights toward superintelligence

OpenAI CEO Sam Altman has reinvigorated discussion of artificial general intelligence (AGI), boldly claiming that his company’s newest model has reached that milestone. In an interview with Bloomberg, he noted that OpenAI’s o3, which was announced in December and is currently being safety tested, has passed the ARC-AGI challenge, the leading benchmark for AGI. Now, Altman said, the company is setting its sights on superintelligence, which is leaps and bounds beyond AGI, just as AGI is leaps and bounds beyond today’s AI.

According to ARC-AGI, “OpenAI’s new o3 system — trained on the ARC-AGI-1 Public Training set — has scored a breakthrough 75.7% on the Semi-Private Evaluation set at our stated public leaderboard $10k compute limit. A high-compute (172x) o3 configuration scored 87.5%.” The benchmark identifies a score of 85% as a “pass” for AGI. By comparison, humans solve an average of 80% of all ARC tasks.

“We are now confident we know how to build AGI as we have traditionally understood it,” Altman wrote in a blog post, “Reflections,” posted on Sunday. “We believe that, in 2025, we may see the first AI agents ‘join the workforce’ and materially change the output of companies.”

What exactly is AGI?

One of the challenges of achieving AGI is defining it. As of yet, researchers and the broader industry do not have a concrete description of what it will be and what it will be able to do.

The general consensus, though, is that AGI will possess human-level intelligence, be autonomous, have self-understanding, and be able to “reason” and perform tasks that it was not trained to do.

Altman, for his part, loosely defined AGI as “when an AI system can do what very skilled humans in important jobs can do.” He posited that it could be “the most impactful technology in human history.”

Going beyond AGI, “superintelligence” is generally understood to be AI systems that far surpass human intelligence. “With superintelligence, we can do anything else,” Altman wrote.
“Superintelligent tools could massively accelerate scientific discovery and innovation well beyond what we are capable of doing on our own.” He added, “this sounds like science fiction right now, and somewhat crazy to even talk about it.” However, “we’re pretty confident that in the next few years, everyone will see what we see,” he said, emphasizing the need to act “with great care” while still maximizing benefit.

ARC-AGI explained

ARC-AGI is the abbreviation of “Abstraction and Reasoning Corpus for Artificial General Intelligence.” It was introduced in 2019 by renowned AI researcher François Chollet, who created the Keras deep learning framework and who defines AGI as “a system that can efficiently acquire new skills outside of its training data.”

According to the ARC-AGI website: “The intelligence of a system is a measure of its skill-acquisition efficiency over a scope of tasks, with respect to priors, experience, and generalization difficulty.” By this measure, an intelligent system is one that can adapt to new, unanticipated problems it has never seen.

The benchmark is based on the Abstraction and Reasoning Corpus and essentially presents AI with abstract grids or puzzles that require human-level understanding of concepts including objects, boundaries and spatial relationships. Each input-output task presents grids of varying heights and widths in which every square can be one of 10 colors. To solve each puzzle, an AI system needs to rely on reasoning.

For instance, a model might be presented with a 7x7 grid in which three teal blocks form an “L” pattern and another three form a reversed lowercase “r.” Based on its analysis of examples of intended outputs, it must then reason that it needs to fill out the “L” and the “r” with a bright blue block each to create two squares.
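
The puzzle described above can be sketched in a few lines of Python. This is only an illustration of the grid format and of the one rule in that example, not an ARC solver: a real entrant has to infer the rule from the demonstration pairs, whereas here it is hard-coded. The color codes, the function names `complete_squares` and `make_example`, and the exact positions of the two shapes are assumptions made for this sketch and do not come from the benchmark or the article.

```python
from typing import List

Grid = List[List[int]]  # a grid is a list of rows; each cell holds a color code

# Hypothetical color codes chosen for this sketch.
BACKGROUND, BLUE, TEAL = 0, 1, 8


def complete_squares(grid: Grid) -> Grid:
    """Fill the empty cell of every 2x2 window that holds exactly three teal cells."""
    out = [row[:] for row in grid]  # copy so the input grid is left unchanged
    for r in range(len(grid) - 1):
        for c in range(len(grid[0]) - 1):
            window = [(r, c), (r, c + 1), (r + 1, c), (r + 1, c + 1)]
            teal = [(y, x) for y, x in window if grid[y][x] == TEAL]
            empty = [(y, x) for y, x in window if grid[y][x] == BACKGROUND]
            if len(teal) == 3 and len(empty) == 1:
                y, x = empty[0]
                out[y][x] = BLUE  # the one missing corner completes the square
    return out


def make_example() -> Grid:
    """Build a 7x7 grid with an "L" and a reversed "r", each made of three teal cells."""
    grid = [[BACKGROUND] * 7 for _ in range(7)]
    for y, x in [(1, 1), (2, 1), (2, 2)]:  # "L": top-right corner missing
        grid[y][x] = TEAL
    for y, x in [(4, 4), (4, 5), (5, 5)]:  # reversed "r": bottom-left corner missing
        grid[y][x] = TEAL
    return grid


if __name__ == "__main__":
    for row in complete_squares(make_example()):
        print(" ".join(str(cell) for cell in row))
```

Running the script prints the completed grid as rows of color codes, with one blue cell added to each shape. The hard part of the benchmark is exactly what this sketch skips: working out, from a handful of examples, that "complete the square" is the rule to apply.
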

However, while ARC-AGI claims to be the “only AI benchmark that measures our progress towards general intelligence,” others have presented different measurements.

For example, in 2023 researchers in Beijing introduced the Tong test, which evaluates AGI through “dynamic embodied physical and social interactions.” This method proposes five critical characteristics to identify AGI: models must have the ability to perform non-predefined infinite tasks (many existing models can only achieve a handful); autonomously generate tasks without fine-grained instructions or prompts from humans; have a system that can learn and anticipate human needs; have causal understanding (the ability to identify cause and effect); and have “embodiment” (physical or virtual presence) that allows them to participate in human life.

Beyond ‘sparks’ of AGI

OpenAI introduced its o3 model as part of its “12 Days of OpenAI” in December, providing safety researchers early access to its o3 frontier models to complement existing testing processes including “rigorous” internal safety testing, external red teaming, and collaborations with third-party organizations and national safety institutes. The company is accepting applications for the early access program through the end of this week (January 10).

OpenAI set out to build AGI from its founding in 2015, when the concept of AGI, as Altman put it to Bloomberg, was “nonmainstream.”

“We wanted to figure out how to build it and make it broadly beneficial,” he wrote in his blog post. “At the time, very few people cared, and if they did, it was mostly because they thought we had no chance of success.”

But the company recruited talent with the lure “just come build AGI,” and in April 2023 appeared to have at least made some strides toward it: Microsoft researchers said that GPT-4 had “sparks” of AGI.
They demonstrated that “beyond its mastery of language” GPT-4 could solve “novel and difficult tasks” including math, coding, vision, medicine, law, psychology, and more, without special prompting.

As Altman noted, “there is still so much to understand, still so much we don’t know, and it’s still so early. But we know a lot more than we did when we started.”
