
OpenAI’s new LLM exposes the secrets of how AI really works


“As these AI systems get more powerful, they’re going to get integrated more and more into very important domains,” Leo Gao, a research scientist at OpenAI, told MIT Technology Review in an exclusive preview of the new work. “It’s very important to make sure they’re safe.”

This is still early research. The new model, called a weight-sparse transformer, is far smaller and far less capable than top-tier mass-market models like the firm’s GPT-5, Anthropic’s Claude, and Google DeepMind’s Gemini. At most it’s as capable as GPT-1, a model that OpenAI developed back in 2018, says Gao (though he and his colleagues haven’t done a direct comparison).    

But the aim isn’t to compete with the best in class (at least, not yet). Instead, by looking at how this experimental model works, OpenAI hopes to learn about the hidden mechanisms inside those bigger and better versions of the technology.

It’s interesting research, says Elisenda Grigsby, a mathematician at Boston College who studies how LLMs work and who was not involved in the project: “I’m sure the methods it introduces will have a significant impact.” 

Lee Sharkey, a research scientist at AI startup Goodfire, agrees. “This work aims at the right target and seems well executed,” he says.

Why models are so hard to understand

OpenAI’s work is part of a hot new field of research known as mechanistic interpretability, which is trying to map the internal mechanisms that models use when they carry out different tasks.

That’s harder than it sounds. LLMs are built from neural networks, which consist of nodes, called neurons, arranged in layers. In most networks, each neuron is connected to every other neuron in its adjacent layers. Such a network is known as a dense network.
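As a rough illustration (this is not code from OpenAI's work, and the layer sizes are arbitrary), a "dense" layer in a framework like PyTorch is simply a full weight matrix, so every output neuron mixes every input neuron:

```python
# Minimal sketch of a dense (fully connected) layer: every neuron in one layer
# is connected to every neuron in the adjacent layer, so the weight matrix is full.
import torch
import torch.nn as nn

dense_layer = nn.Linear(in_features=512, out_features=512)  # 512 x 512 = 262,144 connections

x = torch.randn(1, 512)          # one input vector
y = dense_layer(x)               # each output neuron depends on all 512 inputs
print(dense_layer.weight.shape)  # torch.Size([512, 512]) -- no connection is zeroed out
```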

Dense networks are relatively efficient to train and run, but they spread what they learn across a vast knot of connections. The result is that simple concepts or functions can be split up between neurons in different parts of a model. At the same time, specific neurons can also end up representing multiple different features, a phenomenon known as superposition (a term borrowed from quantum physics). The upshot is that you can’t relate specific parts of a model to specific concepts.
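By contrast, the idea behind a weight-sparse model is to force most of those connections to be exactly zero, so that each learned feature has fewer places to hide. The sketch below is only a crude structural illustration of that idea, built by masking a dense layer; OpenAI's actual architecture and training recipe are not described here:

```python
# Illustrative only: turn the dense layer above into a "weight-sparse" one by
# keeping roughly the largest 2% of weights and zeroing the rest. This shows the
# structural difference, not how OpenAI trains its weight-sparse transformer.
import torch
import torch.nn as nn

layer = nn.Linear(512, 512)
k = int(0.02 * layer.weight.numel())                          # keep ~2% of connections
threshold = layer.weight.abs().flatten().topk(k).values[-1]   # magnitude of the k-th largest weight
mask = (layer.weight.abs() >= threshold).float()

with torch.no_grad():
    layer.weight.mul_(mask)                                   # most connections become exactly 0

print(f"nonzero connections: {int(mask.sum())} of {mask.numel()}")
```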
