Since we launched Theoriq’s Incentivized Testnet and watched AI Agents rise across crypto Twitter (X), interest in decentralized AI Agents has surged. AI has dominated traditional sectors this year, and now its influence is rapidly entering Web3. As this wave grows, we anticipate innovative uses for Web3-enabled AI Agents and Collectives.
Our testnet exceeded expectations, with over 23 million Agent interactions since October, averaging over 600K daily. These numbers validate demand but also reveal the complexity of building scalable, cost-efficient, multi-agent systems. Delivering accurate, functional Agents is just the start; optimizing for cost while maintaining quality has been a whole other dance. You can see our latest metrics here.
This blog dives into lessons learned on balancing performance, cost, and adaptability, which are key to sustainable growth, and explores the advancements that are shaping the future of decentralized AI.
AI Agents have rapidly taken hold of the industry, and it's clear they’re here to stay. They will soon be an integral part of daily life, as common as mobile apps. At Theoriq, we've championed the benefits and transformative role of AI Agents from the start. Now that demand is here, our focus is on optimization and managing budgets.
Theoriq’s infrastructure has enabled the development of AI Agents whose performance has exceeded expectations, with community feedback affirming their value. The growth in usage led us to review and implement best practices for cost management in AI Agent development. These included comparing the methods and costs of different AI service providers and measuring the expected latency of each.
The team explored various methods to optimize Theoriq’s Agent infrastructure. There is a delicate dance in the early stages of a fresh industry like decentralized AI: the vision is rooted in the decentralized ethos, but the early iterations of the platform still require the robustness of centralized AI solution providers. That is the case for us. For now, the team relies on centralized providers such as Anthropic and its Claude-3-5-sonnet/haiku LLMs, and is investigating innovative decentralized AI providers to integrate where possible.
The majority of Theoriq’s Agent infrastructure integrated Anthropic’s claude-3-haiku. Some smaller parts required better reasoning abilities and used OpenAI's gpt-4o-mini, and some components used Google's gemini-1.5-flash where we needed a faster and more cost-efficient solution. By combining different service providers and adjusting the mix based on real-time usage, we were able to keep our Agents working efficiently without breaking the bank.
The team collated their findings into an interactive bar chart that showcases the weighted costs per 1M tokens in USD for various AI models. Tokens, in this context, are the smallest units of text processed by AI models, such as words, subwords, or characters. They serve as input or output for neural networks, allowing models to analyze and generate text, and are integral to LLMs and NLP. The team found that a balanced, multi-model approach is the best path forward for keeping costs manageable while maximizing functionality.
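To make the idea of a weighted cost per 1M tokens concrete, here is a minimal sketch. It assumes the weighting blends input and output prices by each one's share of traffic, which is one common way to do it and not necessarily how our chart was computed. The model names and prices are placeholders, not real provider pricing.

```python
def weighted_cost_per_million(input_price, output_price, input_share=0.75):
    """Blend input and output prices (USD per 1M tokens) by traffic share.

    input_share is the assumed fraction of all tokens that are input tokens.
    """
    if not 0.0 <= input_share <= 1.0:
        raise ValueError("input_share must be between 0 and 1")
    return input_price * input_share + output_price * (1.0 - input_share)


# Placeholder (input, output) prices in USD per 1M tokens. Illustrative only.
PLACEHOLDER_PRICES = {
    "model-a": (0.25, 1.25),
    "model-b": (0.15, 0.60),
}


def rank_by_weighted_cost(prices, input_share=0.75):
    """Return (model, weighted cost) pairs sorted from cheapest to priciest."""
    ranked = [
        (name, weighted_cost_per_million(inp, out, input_share))
        for name, (inp, out) in prices.items()
    ]
    return sorted(ranked, key=lambda pair: pair[1])
```

Changing `input_share` shows how the ranking can shift for workloads that generate long outputs rather than read long prompts.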
After examining the best approach for our infrastructure needs, the team started to encounter an interconnected web of challenges that had to be assessed and addressed in real time. The main challenges were rate limits, call costs, latency, and relevancy filtering.
Rate limits
The team required an upgrade in rate limits within 24 hours of launching testnet. We were surprised by how quickly this occurred, but in hindsight it reflects the appetite that users have for testing out groundbreaking technology. The massive number of Agent requests meant that we had to implement fallbacks to handle the influx. For example, if the Anthropic limit was hit, we had the OpenAI LLM ready to process Agent requests. The team was on hand 24/7 to make sure these operated smoothly.
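The fallback pattern described above can be sketched roughly as follows. This is an illustration rather than our production code: `RateLimitError`, `call_with_fallback`, and the two stub providers are hypothetical names, and a real integration would catch each provider SDK's own rate-limit exception instead.

```python
class RateLimitError(Exception):
    """Hypothetical error a provider wrapper raises when its limit is hit."""


def call_with_fallback(prompt, providers):
    """Try each (name, callable) in order, moving on when one is rate limited.

    Returns (provider_name, response) from the first provider that succeeds.
    """
    last_error = None
    for name, call in providers:
        try:
            return name, call(prompt)
        except RateLimitError as exc:
            last_error = exc  # remember why this provider was skipped
    raise RuntimeError("all providers are rate limited") from last_error


# Stub providers standing in for real LLM clients.
def flaky_primary(prompt):
    raise RateLimitError("primary provider limit reached")


def steady_secondary(prompt):
    return f"echo: {prompt}"
```

Calling `call_with_fallback("hi", [("primary", flaky_primary), ("secondary", steady_secondary)])` skips the rate-limited primary and returns the secondary's response.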
Call Costs
Another challenge was managing the high costs that come with the high number of Agent requests. To address this, we optimized our prompts to be effective with smaller, more cost-efficient models. A common strategy involves using a large LLM to generate synthetic data that can then be utilized by smaller, cheaper models.
Latency
Another key challenge was managing latency so that our Agents deliver efficient, real-time responses. Latency measures the time from when an AI system receives an instruction to when it completes the task and provides an output. We assessed various methods and services, ultimately finding that the best approach was to use smaller models whenever possible. Additionally, we leveraged both traditional caching and prompt caching features. This combination significantly reduced response times and optimized our system's performance.
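As an illustration of the traditional caching side, here is a minimal in-memory sketch that reuses a response when the same model and prompt are seen again within a time window. The class name, the five-minute default, and the `compute` callback are assumptions made for the example. Prompt caching is a separate, provider-side feature and is not shown here.

```python
import hashlib
import time


class ResponseCache:
    """Tiny in-memory cache keyed by model and prompt, with expiry."""

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested
        self._store = {}

    @staticmethod
    def _key(model, prompt):
        # Hash so long prompts do not become long dictionary keys.
        return hashlib.sha256(f"{model}\n{prompt}".encode("utf-8")).hexdigest()

    def get_or_compute(self, model, prompt, compute):
        """Return a fresh cached response, or call compute(prompt) and store it."""
        key = self._key(model, prompt)
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        value = compute(prompt)
        self._store[key] = (now, value)
        return value
```

An exact-match cache like this only helps with repeated identical requests, so it complements smaller models and prompt caching but does not replace them.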
Relevancy Filtering
Relevancy filtering has emerged as another method for enhancing the efficiency of AI Agents on the platform. In Web3, incentivizing activity can attract users who exploit rewards through spam-like or disingenuous responses. To address this, our team has launched a ‘Relevancy Agent’ which is used by Agent Collectives to determine the relevance of a user's query. This promotes a more genuine and productive user environment, ensuring that interactions remain constructive.
Building on these experiences, the research team has been exploring several strategies that can boost AI Agent efficiency while keeping costs in check. They have been tinkering with the idea of AI Agents with memory, and examining the latest AI advancements.
Imagine an AI agent that can remember past interactions for specific users or wallet interactions, much like a human coworker. The ability to retain context significantly enhances user experience and operational efficiency. By incorporating memory and recall features, agents can avoid repetitive tasks, leading to substantial cost savings over time. For instance, if an agent recalls a specific SQL query used in a prior interaction, it can reuse that information instead of generating it from scratch, saving both time and resources.
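The SQL example can be sketched as a small per-user memory. This is a simplified illustration under stated assumptions: `AgentMemory`, `answer`, and the `generate_sql` callback are hypothetical names, and matching is done on exact normalized text, whereas a real agent would more likely use semantic similarity.

```python
class AgentMemory:
    """Remembers the SQL produced for each user's past questions."""

    def __init__(self):
        self._by_user = {}

    @staticmethod
    def _normalize(question):
        # Lowercase and collapse whitespace so trivial variations still match.
        return " ".join(question.lower().split())

    def recall(self, user_id, question):
        """Return the remembered SQL for this user and question, or None."""
        return self._by_user.get(user_id, {}).get(self._normalize(question))

    def remember(self, user_id, question, sql):
        self._by_user.setdefault(user_id, {})[self._normalize(question)] = sql


def answer(memory, user_id, question, generate_sql):
    """Return (sql, from_memory), calling generate_sql only on a miss."""
    remembered = memory.recall(user_id, question)
    if remembered is not None:
        return remembered, True
    sql = generate_sql(question)  # the expensive LLM call in a real agent
    memory.remember(user_id, question, sql)
    return sql, False
```

Keying memory by user or wallet keeps one user's context from leaking into another's answers.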
The team is also staying ahead of the curve by closely monitoring and integrating the latest advancements from leading AI innovators. For example, the team has been exploring OpenAI’s predicted output method, which helps reduce latency by pre-computing parts of model responses when the outcome is mostly predictable.
Additionally, the team is diving deep into cost management strategies, including insights from the latest research by Sayash Kapoor et al. at Princeton, titled “AI Agents That Matter,” which evaluates current benchmarks and practices in AI Agent development. Staying at the forefront of AI advancements and continuously testing new approaches remains a core priority at Theoriq.
Each of these methods is essential in building a solid foundation for efficient AI Agents functioning on Theoriq’s infrastructure. These lessons have been instrumental in the platform's growth, but the overarching vision remains the same: the future of AI Agents lies in their ability to work together as Collectives.
By developing a multi-agent protocol, we enable diverse agents to communicate, share insights, and learn from each other’s experiences. With this collaborative framework as our north star, we’ll not only promote innovative practices like those above but also continually enhance the overall efficiency and adaptability of AI systems.
Ironically enough, the team has found that combining Agents into Collectives, and holding those Agents accountable to specific constraints, should prove more cost-effective than relying on specialized AI Agents.
Do you have an AI Agent Collective that is performing well but is blowing through a marketing budget? Maybe an Agent in that Collective could be swapped out for another that gets the task done in a more cost-effective way. Forming efficient Collectives then becomes a more dynamic process.
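One way to picture that swap is a simple selection rule: among the candidate Agents for a role, pick the cheapest one that still meets a quality bar. The sketch below is illustrative only. `AgentOption`, its fields, and the quality scores are assumptions for the example and not part of the Theoriq protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentOption:
    """A candidate Agent for one role in a Collective (hypothetical fields)."""
    name: str
    cost_per_task: float  # USD per completed task
    quality: float        # score between 0 and 1 from some evaluation


def cheapest_adequate(options, min_quality):
    """Return the lowest-cost option meeting min_quality, or None if none do."""
    adequate = [option for option in options if option.quality >= min_quality]
    if not adequate:
        return None
    return min(adequate, key=lambda option: option.cost_per_task)
```

Re-running the selection as costs or evaluation scores change is what makes the formation of a Collective dynamic rather than fixed.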
Launching our incentivized testnet and achieving over 23 million interactions in a few short weeks shows the demand for AI Agents in the decentralized space is real. But it has also revealed the challenges of creating scalable, cost-effective multi-agent systems.
Optimizing for efficiency, reducing costs, and tackling latency are not just technical hurdles—they are the foundation of sustainable AI in Web3. By exploring advanced strategies like memory-enabled agents, efficient cost management, and innovative collaboration between AI Agents, we’re shaping a more adaptive and resilient future.
At Theoriq, our vision extends beyond individual AI Agents. We see a world where Agent collectives collaborate seamlessly to solve complex problems, transforming industries and unlocking new possibilities in Web3. We’re not just building for today’s challenges; we’re paving the way for an AI-driven, decentralized tomorrow.
Theoriq is committed to building a responsible, inclusive, and consensus-driven AI landscape in Web3. At the forefront of integrating AI with blockchain technology, Theoriq empowers the community to leverage cutting-edge AI Agent collectives to improve decision-making, automation, and user experiences across Web3.
Theoriq is a decentralized protocol for governing multi-agent systems built by integrating AI with blockchain technology. The platform supports a flexible and modular base layer that powers an ecosystem of dynamic AI Agent collectives that are interoperable, composable and decentralized.
By harnessing the decentralized nature of web3, Theoriq is unlocking the potential of collective AI by empowering communities, developers, researchers, and AI enthusiasts to actively shape the future of decentralized AI.
Theoriq has raised over $10.4M and is backed by Hack VC, Foresight Ventures, Inception Capital, HTX Ventures and more, and has joined start-up programs with Google Cloud and NVIDIA.