Imagine you’re a business leader or IT decision-maker eager to adopt generative AI. After hearing about the potential benefits, you’re ready to introduce a large language model (LLM) chatbot to your team or customers. The challenge now is how to launch it and what it will cost.
Here’s where DeepInfra comes in. Founded by former engineers from IMO Messenger, DeepInfra offers a solution to swiftly get these models running on your private servers. What’s particularly enticing is their competitive pricing at $1 per 1 million tokens for both input and output. This is in stark contrast to the $10 per 1 million tokens for OpenAI’s GPT-4 Turbo and $11.02 per 1 million tokens for Anthropic’s Claude 2.
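To put those per-token prices in perspective, here is a small back-of-the-envelope sketch using the figures quoted above. The monthly token volume is a hypothetical workload chosen for illustration, not a figure from DeepInfra or its customers:

```python
# Illustrative cost comparison using the per-token prices quoted above.
# The 500M tokens/month workload is a hypothetical assumption.

PRICE_PER_MILLION_TOKENS = {
    "DeepInfra": 1.00,     # $1 per 1M tokens, input and output
    "GPT-4 Turbo": 10.00,  # $10 per 1M tokens
    "Claude 2": 11.02,     # $11.02 per 1M tokens
}

def monthly_cost(tokens: int, price_per_million: float) -> float:
    """Dollar cost for a given monthly token volume at a per-1M-token price."""
    return tokens / 1_000_000 * price_per_million

tokens_per_month = 500_000_000  # hypothetical: 500M tokens/month

for provider, price in PRICE_PER_MILLION_TOKENS.items():
    print(f"{provider}: ${monthly_cost(tokens_per_month, price):,.2f}")
```

At that volume the gap is stark: roughly $500 a month versus $5,000 or more, which is the order-of-magnitude difference the company is betting on.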
DeepInfra, which made its debut exclusively with VentureBeat, has secured an $8 million seed round led by A.Capital and Felicis. The company plans to offer customers inference for a range of open source models, including Meta’s Llama 2 and CodeLlama, along with customized versions of these and other open source models.
Nikola Borisov, Founder and CEO of DeepInfra, highlighted their focus on offering efficient and cost-effective deployment of trained machine learning models, emphasizing the importance of the inference side of things.
DeepInfra addresses the often-overlooked challenge of efficiently running these large language models in real-world scenarios. Borisov explained that fitting numerous concurrent users onto the same hardware and model simultaneously poses a challenge, especially considering the substantial computation and memory bandwidth required for each token.
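Borisov’s point about memory bandwidth can be made concrete with a back-of-the-envelope calculation: generating each token requires streaming the model’s weights through the accelerator, so memory bandwidth caps single-user throughput, and batching concurrent users onto the same weight reads is how providers recover efficiency. The numbers below are illustrative assumptions, not DeepInfra’s actual hardware or models:

```python
# Back-of-the-envelope LLM serving throughput in the bandwidth-bound regime.
# All figures are illustrative assumptions, not DeepInfra specifics.

model_params = 70e9     # e.g. a 70B-parameter model
bytes_per_param = 2     # fp16 weights
model_bytes = model_params * bytes_per_param   # ~140 GB of weights

gpu_bandwidth = 2.0e12  # assume ~2 TB/s of HBM bandwidth

# Generating one token requires reading every weight once, so a single
# user's stream is capped at roughly bandwidth / model size:
tokens_per_sec_single = gpu_bandwidth / model_bytes   # ~14 tokens/s

# Batching concurrent users reuses the same weight reads, so aggregate
# throughput scales with batch size (until the GPU becomes compute-bound):
batch_size = 32
tokens_per_sec_batched = tokens_per_sec_single * batch_size

print(f"single stream: ~{tokens_per_sec_single:.0f} tokens/s")
print(f"batch of {batch_size}: ~{tokens_per_sec_batched:.0f} tokens/s aggregate")
```

The sketch shows why packing many concurrent users onto the same hardware matters so much: serving one user at a time leaves most of the accelerator’s potential throughput on the table.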
DeepInfra’s co-founders, drawing on their experience at IMO Messenger with 200 million users, leverage their expertise in running large server fleets globally to optimize usage and gain efficiencies in serving concurrent users.
Aydin Senkut, founder and managing partner of Felicis, praised the co-founders as “international programming Olympic gold medal winners,” emphasizing their capability to build efficient infrastructure for serving hundreds of millions of people.
DeepInfra’s low costs, attributed to their efficiency in building server infrastructure and compute resources, make them an appealing option in a market where cost is a significant concern. Senkut believes that having up to a 10x cost advantage could position DeepInfra as a major disruptor in the AI and LLM market.
Initially targeting small-to-medium-sized businesses (SMBs), DeepInfra aims to serve cost-sensitive customers seeking access to state-of-the-art open source language models and other machine learning models. The company plans to closely monitor the open source AI community, offering hosting for emerging models and staying adaptable to evolving customer needs.
DeepInfra also emphasizes its commitment to data privacy and security, assuring customers that they don’t store or use any input prompts. This focus on privacy, coupled with their cost-effective hosting services, positions DeepInfra as a compelling choice for enterprises looking to leverage LLM technology affordably in various applications.