OpenAI and its rivals are exploring alternative techniques to overcome the challenges of developing more advanced AI models, from scaling limits to ethical concerns.
The global race to develop increasingly sophisticated AI has entered a new phase as leading companies face unexpected hurdles in their pursuit of ever-larger language models.
This shift comes at a time of growing concern about the environmental and economic costs of AI development, as well as debates about the societal impacts of rapidly advancing AI capabilities.
Against this backdrop, some of the world’s most prominent AI researchers are now questioning the prevailing “bigger is better” approach that has dominated the field in recent years.
For much of the past decade, AI companies have focused on scaling up their models by using more data and computing power.
This strategy led to breakthroughs like OpenAI’s ChatGPT, which sparked a surge of investment and public interest in AI.
However, researchers are now encountering limitations with this approach, prompting a search for alternative methods to advance AI capabilities.
Challenges in scaling up AI models
According to Reuters, major labs have faced delays and disappointing outcomes in their efforts to surpass the performance of OpenAI’s GPT-4 model, which is nearly two years old.
These challenges stem from several factors:
- The high cost and complexity of training runs for large models, which can require hundreds of chips operating simultaneously for months at a time.
- The increasing scarcity of easily accessible data, as AI models have consumed much of the readily available information on the internet.
- Power shortages hindering training runs, which demand vast amounts of energy.
Ilya Sutskever, co-founder of the AI labs Safe Superintelligence (SSI) and OpenAI, told Reuters that results from scaling up pre-training – the initial phase of training an AI model on vast amounts of unlabeled data – have plateaued.
“The 2010s were the age of scaling, now we’re back in the age of wonder and discovery once again. Everyone is looking for the next thing,” he said.
OpenAI o1: new approaches to AI development
In response to these challenges, researchers are exploring alternative techniques to enhance AI capabilities.
One promising approach is ‘test-time compute,’ which focuses on improving existing AI models during the inference phase – when the model is being used – rather than solely during the initial training.
This method allows models to dedicate more processing power to challenging tasks that require human-like reasoning and decision-making.
For example, AI companies such as OpenAI are developing techniques that let algorithms reason in more human-like ways: OpenAI’s ‘o1’ model can ‘think’ through problems in multiple steps, similar to human reasoning.
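To make the idea concrete, here is a minimal sketch of one well-known test-time-compute recipe, self-consistency: sample several reasoning chains at inference time and take the majority answer. This is an illustration under assumed conventions – the `generate` callable and the toy model stand in for a real LLM API – not a description of how o1 actually works:

```python
import collections
import random
from typing import Callable

def self_consistency(generate: Callable[[str], str], prompt: str, n_samples: int = 16) -> str:
    """Sample several reasoning chains and return the majority final answer.
    Extra samples spend extra inference compute, with no retraining and no
    larger model. Illustrative sketch only; not OpenAI's o1 method."""
    answers = []
    for _ in range(n_samples):
        chain = generate(prompt + "\nThink step by step.")
        # Assumed convention: the last line of each chain holds the final answer.
        answers.append(chain.strip().splitlines()[-1])
    return collections.Counter(answers).most_common(1)[0][0]

# Toy stand-in for a language model: right about 70% of the time on this question.
def toy_generate(prompt: str) -> str:
    answer = "42" if random.random() < 0.7 else str(random.randint(0, 99))
    return f"...reasoning steps...\n{answer}"

print(self_consistency(toy_generate, "What is 6 * 7?"))  # usually prints "42"
```

The design point is the one researchers describe: accuracy can improve by spending more compute on each query, rather than by training an ever-larger model.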
A dozen AI scientists, researchers and investors told Reuters they believe such techniques could reshape the AI arms race.
Noam Brown, a Researcher at OpenAI who worked on o1, highlighted the efficiency of this approach at a recent TED AI conference: “It turned out that having a bot think for just 20 seconds in a hand of poker got the same boosting performance as scaling up the model by 100,000x and training it for 100,000 times longer”.
Other major AI labs, including Anthropic, xAI and Google DeepMind, are also reportedly working on their own versions of this technique, according to Reuters.
This shift could have significant implications for the AI industry, potentially altering the competition for AI hardware and influencing investment strategies.
Sonya Huang, a partner at Sequoia Capital, a firm that invests in early-stage technology companies, notes: “This shift will move us from a world of massive pre-training clusters toward inference clouds, which are distributed, cloud-based servers for inference”.
Ethical concerns of AI development
As the AI industry continues to evolve, companies and researchers are grappling with the challenge of balancing technological advancement against ethical considerations and resource constraints.
However, Kevin Weil, Chief Product Officer at OpenAI, expressed optimism about the potential for rapid progress: “We see a lot of low-hanging fruit that we can go pluck to make these models better very quickly. By the time people do catch up, we’re going to try and be three more steps ahead”.
Yet the shift towards new AI development approaches comes at a time when the limitations of current methods are becoming increasingly apparent.
A recent study from the University of Cambridge and the University of Oslo suggests that AI systems may face inherent limitations due to a century-old mathematical paradox.
The researchers propose that instability is the ‘Achilles’ heel’ of modern AI and that there are problems where stable and accurate neural networks exist, yet no algorithm can produce such a network.
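In rough symbols, and hedged as a paraphrase rather than the paper’s formal statement, the claim separates existence from computability:

```latex
% Informal paraphrase of the Cambridge/Oslo result; the notation
% \Omega, f, N, \mathcal{A}, \epsilon is introduced here for illustration,
% not taken from the paper.
\exists\, N \ \text{such that}\ \| N(x) - f(x) \| \le \epsilon \quad \forall x \in \Omega,
\qquad \text{yet} \qquad
\nexists\ \text{algorithm } \mathcal{A} \ \text{such that}\ \mathcal{A}(\text{training data})\ \text{yields such an } N.
```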
This research highlights the need for a more nuanced understanding of AI’s capabilities and limitations.
As Professor Anders Hansen from Cambridge’s Department of Applied Mathematics and Theoretical Physics states: “If AI systems are used in areas where they can do real harm if they go wrong, trust in those systems has got to be the top priority”.
Yet despite the ethical and practical challenges of AI development, rising demand for chips still needs to be met.
Reuters reports that when Nvidia was asked about the possible impact on demand for its products, the company pointed to recent presentations on the importance of the technique behind the o1 model, perhaps to emphasise its relevance to inference tasks, which require significant computing power.
This aligns with Nvidia’s strategy to position its chips as essential for both AI training and inference, potentially driving increased demand for its products.
Nvidia’s CEO, Jensen Huang, has also discussed rising demand for the company’s chips for inference.
“We’ve now discovered a second scaling law, and this is the scaling law at a time of inference…All of these factors have led to the demand for Blackwell (Nvidia’s latest graphics processing unit) being incredibly high,” he said recently at a conference in India.