Crypto AI Projects Would Need to Buy Chips Worth Their Entire Market Cap to Meet Ambitions

Crypto AI projects face a major hurdle in their push into AI-generated media. To deliver on their text-to-video ambitions, they would need to buy computer chips worth nearly the entire market capitalization of the AI token category. That requirement shows how much compute power mainstream adoption would demand.

OpenAI’s demo of Sora, a text-to-video generator, sparked enthusiasm in the crypto market and sent AI token values surging. Numerous crypto AI projects have since emerged, aiming to develop text-to-video and text-to-image generation capabilities. The AI token category now has a market capitalization of $25 billion, according to CoinGecko data.

However, AI-generated video relies heavily on Graphics Processing Units (GPUs) from companies like Nvidia and AMD. Making text-to-video generation mainstream would take an estimated hundreds of thousands of GPUs, more than major tech companies such as Microsoft, Meta, and Google use combined.

To put this into perspective, a research report by Factorial Funds estimates that supporting TikTok and YouTube’s creator community alone would require approximately 720,000 high-end Nvidia H100 GPUs. Training Sora, according to the report, takes around 10,500 powerful GPUs running for a month, and at inference time each GPU can generate only about five minutes of video per hour. These numbers show how much compute power AI-generated video demands.
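The figures above can be turned into a rough back-of-the-envelope calculation. The Python sketch below is illustrative only: it reads the training figure as 10,500 GPUs running continuously for one 30-day month, which is an assumption rather than something the report states in those terms, and the function names are made up for this example.

```python
# Figures quoted in the paragraph above (Factorial Funds estimates).
TRAINING_GPUS = 10_500        # GPUs used to train Sora
FLEET_GPUS = 720_000          # H100s to serve TikTok and YouTube creators
VIDEO_MIN_PER_GPU_HOUR = 5    # minutes of video one GPU generates per hour

# Assumption (not from the report): training runs nonstop for a 30-day month.
HOURS_PER_MONTH = 30 * 24


def training_gpu_hours(gpus: int, hours: int) -> int:
    """Total GPU-hours consumed by a training run."""
    return gpus * hours


def fleet_video_hours_per_hour(gpus: int, minutes_per_gpu_hour: float) -> float:
    """Hours of video a GPU fleet generates in one wall-clock hour."""
    return gpus * minutes_per_gpu_hour / 60


def gpus_for_real_time(minutes_per_gpu_hour: float) -> float:
    """GPUs needed to produce one minute of video per minute of wall-clock time."""
    return 60 / minutes_per_gpu_hour


if __name__ == "__main__":
    print(f"Training: {training_gpu_hours(TRAINING_GPUS, HOURS_PER_MONTH):,} GPU-hours")
    print(f"Fleet output: {fleet_video_hours_per_hour(FLEET_GPUS, VIDEO_MIN_PER_GPU_HOUR):,.0f} "
          "hours of video per hour")
    print(f"Real-time generation: {gpus_for_real_time(VIDEO_MIN_PER_GPU_HOUR):.0f} GPUs per stream")
```

Under these assumptions, the training run comes to about 7.56 million GPU-hours, a 720,000-GPU fleet would produce roughly 60,000 hours of video per hour, and generating video in real time would tie up 12 GPUs per stream.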

Nvidia shipped 550,000 H100 GPUs in 2023, well short of that requirement. Even the roughly 650,000 cards collectively used by its major customers would not cover it. Acquiring the full 720,000 GPUs would cost around $21.6 billion, which implies a price of about $30,000 per card and amounts to almost the entire current market capitalization of AI tokens.
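The comparison can be checked with simple arithmetic. The sketch below uses only numbers quoted in this article; the per-card price is derived from them (the $21.6 billion total divided by the 720,000 GPUs estimated earlier) rather than quoted directly, and the helper names are hypothetical.

```python
# Figures quoted in the article.
REQUIRED_GPUS = 720_000                   # H100s estimated by Factorial Funds
SHIPPED_2023 = 550_000                    # H100s Nvidia shipped in 2023
MAJOR_CUSTOMER_GPUS = 650_000             # cards used by major customers
TOTAL_COST_USD = 21_600_000_000           # estimated cost of the required GPUs
AI_TOKEN_MARKET_CAP_USD = 25_000_000_000  # AI token category, per CoinGecko


def implied_price_per_gpu(total_cost: int, gpus: int) -> float:
    """Per-card price implied by the total cost (derived, not quoted)."""
    return total_cost / gpus


def shortfall(required: int, available: int) -> int:
    """How many GPUs are still missing; zero if supply covers demand."""
    return max(required - available, 0)


def share_of_market_cap(cost: int, market_cap: int) -> float:
    """Cost as a fraction of the AI token market capitalization."""
    return cost / market_cap


if __name__ == "__main__":
    price = implied_price_per_gpu(TOTAL_COST_USD, REQUIRED_GPUS)
    print(f"Implied price per GPU: ${price:,.0f}")
    print(f"Shortfall vs. 2023 shipments: {shortfall(REQUIRED_GPUS, SHIPPED_2023):,} GPUs")
    print(f"Shortfall vs. major customers: {shortfall(REQUIRED_GPUS, MAJOR_CUSTOMER_GPUS):,} GPUs")
    print(f"Share of AI token market cap: "
          f"{share_of_market_cap(TOTAL_COST_USD, AI_TOKEN_MARKET_CAP_USD):.1%}")
```

On these numbers, the requirement exceeds 2023 shipments by 170,000 GPUs and the major customers' combined holdings by 70,000, while the purchase would consume about 86% of the AI token category's market capitalization.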

Nvidia is not the sole player in this field. Competitors like AMD offer similar products, and crypto projects like Render (RNDR) and Akash Network (AKT) provide GPU computing services. However, the GPUs available through these services are primarily retail-grade gaming cards, which are less powerful than server-grade hardware.

While text-to-video generation could reshape creative workflows, current hardware limitations will likely delay its mainstream adoption. More chips and significant technological advances are needed to bridge the gap between AI ambitions and available compute power.