Are LLMs a sustainable business?
ChatGPT went from 0 to over 100 million users in less than two months, making it the fastest growing consumer application in history.
But are ChatGPT and the broader wave of generative AI and LLM applications a sustainable business?
Estimates put the cloud compute cost alone of a single ChatGPT response at roughly $0.01–$0.10, depending on the length of the response. For scale, Morgan Stanley estimates that serving an answer through generative AI costs about 7x more than a typical search due to the additional compute required.
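As a rough sketch of that multiplier, the arithmetic looks like this. The cost of a traditional search below is an assumed, illustrative figure, not a measured one; only the ~7x multiplier comes from the estimate cited above.

```python
# Back-of-envelope: compute cost of a generative answer vs. a search.
# cost_per_search is an illustrative assumption, not a measured value.
cost_per_search = 0.003        # assumed compute cost of one search, USD
genai_multiplier = 7           # the ~7x Morgan Stanley estimate cited above

cost_per_genai_answer = cost_per_search * genai_multiplier
print(f"Assumed cost per generative answer: ${cost_per_genai_answer:.3f}")
# Longer responses generate more tokens and push this cost higher,
# which is why the per-response range above is so wide.
```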
Different Business Models to Generate ROI from Generative AI
The MLOps Community recently surveyed members about the use of Large Language Models (LLMs) in production inside their companies. The majority of respondents, spanning enterprises of all sizes and industries, cited cost, resource usage, or ROI as a main challenge or concern with LLMs in production.
In digital advertising, most digital placements range from $3–5 CPM (cost per thousand views), with a premium placement to a targeted segment commanding about $20 CPM, or about $0.02 per impression. In paid search that figure is closer to $40 CPM, though it isn't exactly an apples-to-apples comparison since advertisers are mostly paying for clicks, not views. Consumption-based pricing seeks to solve this on a per-prompt-response basis, but because responses can vary broadly in length and complexity, CFOs are reluctant to broadly allow generative AI in the enterprise until costs can be better forecast (not dissimilar to cloud, where unmonitored instances can lead to bill shock).
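To make the mismatch concrete, here is the per-impression math with the $20 CPM figure above set against an assumed mid-range compute cost per response (the $0.05 is illustrative, picked from the middle of the per-response range cited earlier):

```python
# Assumed ad revenue per impression vs. assumed compute cost per
# generative response. All numbers are illustrative.
premium_cpm = 20.0                           # $20 CPM, premium targeted placement
revenue_per_impression = premium_cpm / 1000  # CPM is per thousand views

cost_per_response = 0.05                     # assumed mid-range compute cost, USD

margin = revenue_per_impression - cost_per_response
print(f"Revenue per impression: ${revenue_per_impression:.2f}")
print(f"Margin per ad-funded response: ${margin:.2f}")
# Negative under these assumptions: even a premium ad impression
# doesn't cover the compute behind the answer it's attached to.
```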
Overall, this suggests that unless generative AI can command much higher pricing than current business models or find a way to dramatically cut costs per prompt, it will struggle to generate substantial and sustainable profits.
The P&L of ML
At Wallaroo we often talk about the P&L of ML (profit & loss of machine learning): that is, is the business value you generate from using ML greater than the costs of running it (headcount costs for deploying and managing + compute costs for running the models)? Generative AI, with its high compute costs, is a perfect example of this principle.
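The principle can be sketched as a simple monthly ledger. All inputs here are hypothetical, chosen only to show how a compute reduction can flip a program from loss to profit:

```python
# Sketch of the "P&L of ML": is the business value generated per month
# greater than headcount plus compute costs? All inputs are hypothetical.
def ml_monthly_pnl(value_generated, headcount_cost, compute_cost):
    """Return monthly profit (or loss) from an ML program, in USD."""
    return value_generated - (headcount_cost + compute_cost)

# Hypothetical program: $100k of value, $40k headcount, $70k compute.
baseline = ml_monthly_pnl(100_000, 40_000, 70_000)
print(f"Baseline monthly P&L: ${baseline:,.0f}")   # a loss

# The same program with compute cut 70% (within the 60-80% reduction
# range cited in this post) becomes profitable.
optimized = ml_monthly_pnl(100_000, 40_000, 70_000 * 0.3)
print(f"Optimized monthly P&L: ${optimized:,.0f}")
```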
For enterprises looking to profit from generative AI, what are some steps they can take to minimize the costs of scaling such a program? We go into greater detail in this post about running complex natural language transformer models, but in short, we have helped customers cut compute by 60–80% while still meeting or exceeding business SLAs.
If you are finding that compute is a blocker to achieving sustainable ROI from your AI/ML, drop us a line so we can demonstrate what we can do with your models running your data.