Is Meta Struggling With A ‘Significant’ AI Gap?
Despite high-profile investments in artificial intelligence research, Meta has been slow to adopt expensive AI-friendly hardware and software systems for its core business, hampering its ability to keep pace with innovation at scale even as it increasingly relies on AI to fuel its growth.
According to a corporate document dated Sept. 20, Meta CEO Mark Zuckerberg gathered his top lieutenants for a five-hour review of the company’s computing capabilities, focusing on its capacity to carry out cutting-edge artificial intelligence work.
Meta faces a difficult obstacle.
Despite its high-profile investments in AI research, the social media giant has been slow to adopt the expensive AI-friendly hardware and software systems needed for its core business, hampering its ability to keep pace with innovation at scale even as it increasingly relies on AI to fuel its growth. That is according to the memo, company statements, and interviews with twelve people familiar with the changes, who spoke on condition of anonymity to discuss internal matters.
When it comes to building AI, the memo said, Meta has a significant gap in its tooling, workflows, and processes. “We need to invest heavily here,” read the memo, authored by incoming head of infrastructure Santosh Janardhan and posted on the company’s internal message board in September, but only now coming to light.
Supporting AI work, the memo said, would require Meta to “fundamentally shift” its physical infrastructure design, its software systems, and its approach to providing a stable platform.
For over a year, the company has been engaged in a major initiative to overhaul its AI infrastructure. While the business has publicly acknowledged playing a little catch-up on AI hardware trends, details of the overhaul, including capacity constraints, leadership changes, and a cancelled AI chip project, have not previously been reported.
In response to questions about the memo and the restructuring, Meta spokesman Jon Carvill said the company has a proven track record of creating and deploying state-of-the-art infrastructure at scale, combined with deep expertise in AI research and engineering.
The company is confident in its ability to continue expanding its infrastructure to meet near-term and long-term needs as it brings new AI-powered experiences to its family of apps and consumer products, Carvill added. He declined to say whether Meta had abandoned its AI chip.
Janardhan and other executives declined interview requests made through the company.
According to corporate filings, the overhaul increased the company’s capital expenditures by around $4 billion per quarter, nearly double its 2021 spending level, and led it to pause or cancel previously planned data centre builds in four locations.
Meanwhile, after its Nov. 30 debut, Microsoft-backed OpenAI‘s ChatGPT soared to become the fastest-growing consumer application in history, sparking a race among tech titans to release products using so-called generative artificial intelligence, which, unlike other AI, creates human-like written and visual content in response to prompts.
Falling Behind.
A major source of the trouble can be traced to Meta’s late adoption of the graphics processing unit, or GPU, for AI operations.
GPUs are exceptionally well-suited to AI processing because they can perform a huge number of tasks concurrently, cutting the time required to analyse billions of pieces of data.
However, GPUs are more costly than other chips, with chipmaker Nvidia Corp. controlling 80% of the market and dominating the accompanying software.
Until last year, Meta ran its AI workloads primarily on its fleet of commodity central processing units (CPUs), the workhorse of the computing world that has filled data centres for decades but performs AI work poorly.
The company also used a custom chip it had designed in-house for inference, the process by which models trained on massive quantities of data make judgements and generate responses to prompts.
By 2021, the two-pronged approach had proven slower and less efficient than one built on GPUs, which are also more flexible at running different types of models than Meta’s chip.
The company declined to comment on the performance of its AI hardware.
According to four of the sources, as Zuckerberg steered the company towards the metaverse – a set of digital worlds enabled by augmented and virtual reality – a capacity crunch was slowing its ability to deploy AI in response to threats such as the rise of social media rival TikTok and Apple-led ad privacy changes.
The stumbles caught the attention of former Meta board member Peter Thiel, who resigned without public explanation in early 2022.
At a board meeting before leaving, Thiel told Zuckerberg and his executives that they had grown complacent about Meta’s core social network business while focusing too much on the metaverse, leaving the firm exposed to TikTok’s threat.
Catch-up.
After cancelling a large-scale launch of Meta’s own proprietary inference chip, which had been planned for 2022, executives instead reversed course and placed orders for billions of dollars’ worth of Nvidia GPUs that year.
By then, the company had already fallen behind rivals such as Google, which had begun deploying its own custom-built AI chips, known as TPUs, in 2015.
Executives also reorganised Meta’s AI departments that spring, naming two new heads of engineering in the process, including Janardhan, the author of the September memo.
According to LinkedIn profiles and a person familiar with the departures, more than a dozen executives left Meta during the months-long upheaval, amounting to an almost complete turnover of its AI infrastructure leadership.
Meta then began retooling its data centres to accommodate the incoming GPUs, which draw more power and generate more heat than CPUs and must be clustered closely together with specialised networking between them.
The facilities required 24 to 32 times the networking capacity and new liquid cooling systems to handle the clusters’ heat, necessitating a complete redesign, according to Janardhan’s memo and four people familiar with the project, details of which had not previously been reported.
As that work progressed, the company drew up internal plans to begin developing a new, more ambitious in-house chip that, like a GPU, would be capable of both training AI models and performing inference. The project, which has not previously been reported, is expected to be completed in 2025.
Carvill, the Meta spokesperson, said the data centre construction, which was halted while the company transitioned to the new architecture, will resume later this year.
Trade-offs.
While building out its GPU capacity, Meta has so far had little to show publicly, even as competitors such as Microsoft and Google tout public launches of commercial generative AI products.
In February, Chief Financial Officer Susan Li acknowledged that Meta was not devoting much of its current computing resources to generative work, stating that “basically all of the AI capacity is going towards ads, feeds, and Reels,” its TikTok-like short video format popular with younger users.
Meta did not prioritise building generative AI products until after the November release of ChatGPT. Although its research division FAIR, or Facebook AI Research, had been publishing prototypes of the technology since late 2021, the sources said, the company was not focused on turning its well-regarded research into products.
That is changing as investor interest grows. In February, Zuckerberg announced the formation of a new top-level generative AI team, saying it would “turbocharge” the company’s efforts in the field.
This month, Chief Technology Officer Andrew Bosworth said that generative AI was the area to which he and Zuckerberg were devoting the most attention, and that Meta would ship a product this year.
According to two people familiar with the new team’s work, it is still in its early phases and is focused on building a foundation model, a core program that can later be fine-tuned and adapted for different products.
Conclusion.
The company has been developing generative AI technology across several teams for more than a year, and it says the pace of that work has accelerated in the months since ChatGPT’s arrival. Whether Meta can close the gap with OpenAI and its other rivals remains to be seen.