Enterprise AI at Scale — Weaviate Podcast #120 with Ben Kus and Bob van Luijt

7 min read · May 7, 2025
Weaviate Podcast #120 with Box CTO Ben Kus and Weaviate Co-Founder Bob van Luijt!

“If we were founded today, how would we use AI as part of our platform? And as an unstructured content company, of course AI would be the central piece of the whole thing.” — Ben Kus

The Trillion-File Challenge

When your company manages exabytes of unstructured data across 120,000+ enterprise customers, implementing AI isn’t just a technical challenge — it’s an existential one. In our latest Weaviate podcast episode, Box CTO Ben Kus takes us behind the scenes of how the cloud content management pioneer is navigating the intersection of massive scale, enterprise security, and cutting-edge AI.

The conversation between Weaviate Co-Founder/CEO Bob van Luijt and Ben Kus reveals how Box — a company that’s been doubling in size roughly every year for the past decade — is tackling infrastructure challenges that would make most CTOs wake up in a cold sweat. With “hundreds of billions to trillions” of documents, images, and videos under management, Box offers a fascinating case study in what happens when vector databases meet enterprise requirements.

Key Insights

The Three-Layer Infrastructure Problem

Box approaches their infrastructure challenges through three distinct but interconnected layers, each with its own set of complexities:

At the foundational level, Box deals with core infrastructure challenges — managing exabytes of data with millions of interactions per second. As Ben explains, “When you’re talking like exabyte scale or trillions of things happening, millions per second… there’s just a fundamental reliability distributed system scale” challenge that requires both proprietary solutions and leveraging cutting-edge technologies.

The second layer introduces a unique multi-tenant complexity that differentiates Box from many SaaS platforms. “One of our challenges is everybody shares with everybody,” Ben notes. Unlike systems where tenants can be cleanly separated, Box users frequently share content across organizational boundaries. “When I do a search, I need to find that content you shared with me… so you can’t just cut things off per enterprise.” This creates cascading complexity across their content, preview, search, and AI systems.

The third layer involves securely implementing AI capabilities. “We want to bring all the capabilities safely and securely of Gemini, or OpenAI, or Anthropic… while respecting permissions,” Ben emphasizes. This means ensuring AI can only access content a user has permission to see — a deceptively complex requirement when working with billions of documents across millions of users with intricate permission structures.

The Vector Database Storage Conundrum

One of the most fascinating revelations from the conversation concerns the space efficiency challenges of vector embeddings. As organizations scale their AI implementations, many are discovering that the embeddings themselves can consume more storage than the original content.

Ben shared a striking example: a few hundred bytes of text in a paragraph might require 4–6 kilobytes of vector data when stored as embeddings. “If you’re naive about it,” he explains, “that would cost you more than all customers pay us for the last three years.”
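The arithmetic behind this blowup is easy to reproduce. A minimal sketch, assuming a 1536-dimensional embedding stored as float32 (the paragraph size and model dimensions here are illustrative, not Box's actual configuration):

```python
# Back-of-envelope storage cost of paragraph-level embeddings.
# Assumed figures (illustrative, not Box's actual numbers):
#   - a paragraph of ~400 bytes of text
#   - a 1536-dimensional embedding stored as float32 (4 bytes/dim)

paragraph_bytes = 400
dims = 1536
bytes_per_float = 4

embedding_bytes = dims * bytes_per_float      # 6144 bytes, ~6 KB
blowup = embedding_bytes / paragraph_bytes    # ~15x the original text

print(f"embedding: {embedding_bytes} bytes, {blowup:.0f}x the source text")
# → embedding: 6144 bytes, 15x the source text
```

Multiply that by trillions of paragraphs and the naive "embed everything" strategy quickly dwarfs the cost of storing the content itself.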

This storage explosion becomes particularly problematic at Box’s scale, where indexing everything up front is financially infeasible. To address this, Box has developed sophisticated optimization techniques:

  • Differentiating between file-level embeddings (what the document is about) versus paragraph-level embeddings (specific information within documents)
  • Using hybrid search approaches that combine traditional methods with vector search
  • Calculating embeddings just-in-time rather than pre-computing everything
  • Implementing a multi-stage retrieval process where initial queries reduce the candidate set before applying more computationally intensive techniques
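The techniques above can be sketched as a two-stage pipeline: a cheap file-level pass narrows the candidate set, and paragraph-level scoring runs only on the survivors. This is a toy illustration with made-up vectors and a plain dot-product score, not Box's implementation (a real system would use an ANN index and compute paragraph embeddings just-in-time):

```python
# Two-stage retrieval: coarse file-level filtering, then fine-grained
# paragraph-level scoring on the reduced candidate set only.
# Toy vectors and scoring; file names are hypothetical.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Stage 1 input: one embedding per file ("what the document is about").
file_embeddings = {
    "roadmap.docx": [0.9, 0.1],
    "invoice.pdf":  [0.1, 0.9],
    "design.md":    [0.8, 0.3],
}

# Stage 2 input: per-paragraph embeddings, computed (or loaded) only
# for files that survive stage 1 -- the "just-in-time" idea.
paragraph_embeddings = {
    "roadmap.docx": {"p1": [0.95, 0.05], "p2": [0.2, 0.8]},
    "design.md":    {"p1": [0.7, 0.4]},
}

def search(query_vec, top_files=2, top_paragraphs=2):
    # Stage 1: keep only the most relevant files.
    candidates = sorted(file_embeddings,
                        key=lambda f: dot(query_vec, file_embeddings[f]),
                        reverse=True)[:top_files]
    # Stage 2: score paragraphs inside the surviving files only.
    hits = [(f, p, dot(query_vec, vec))
            for f in candidates
            for p, vec in paragraph_embeddings.get(f, {}).items()]
    hits.sort(key=lambda h: h[2], reverse=True)
    return hits[:top_paragraphs]

print(search([1.0, 0.0]))
```

The expensive work (paragraph scoring) touches only the handful of files that pass the cheap filter, which is what keeps the pattern affordable at scale.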

As Bob notes, this aligns with trends they’ve observed at Weaviate: “We literally solved this with our customers… people at some points were like ‘we’re very happy but we’re growing fast, this is getting very expensive.’” This led to innovations like Weaviate’s flat index (warm storage) approaches that combine disk-based storage with caching mechanisms.

Why RAG Remains Essential Despite Growing Context Windows

A particularly insightful moment came when Ben addressed the misconception that larger context windows might eliminate the need for Retrieval Augmented Generation (RAG):

“I saw something recently… because the context windows are so big on an AI model, RAG is almost not needed anymore. But no… whoever said that has a misunderstanding of the benefits of RAG in the enterprise.”

Ben outlined three critical reasons why RAG remains fundamental for enterprise AI:

  1. Data Currency: “You need to get the latest data and the latest data changes constantly.” No matter how large context windows become, they can’t include information that didn’t exist during training.
  2. Permission Controls: “Very critically what you have access to. And the AI should never have access to anything that you don’t have access to.” RAG provides the security layer necessary for enterprise deployments.
  3. Data Volume: Even with 100k+ token context windows, no model can ingest the volumes of data enterprises need to reference simultaneously.
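Point 2 is the one most often overlooked: retrieval is where per-user permissions get enforced, before anything reaches the model's context window. A minimal sketch of that pattern, with hypothetical document names and ACLs:

```python
# Permission-aware retrieval: filter candidates by the requesting user's
# access rights BEFORE anything reaches the LLM's context window.
# Document names, scores, and ACLs below are hypothetical.

documents = {
    "q3-board-deck.pptx": {"acl": {"alice", "bob"},          "score": 0.92},
    "public-faq.md":      {"acl": {"alice", "bob", "carol"}, "score": 0.85},
    "hr-salaries.xlsx":   {"acl": {"alice"},                 "score": 0.99},
}

def retrieve_for(user, k=5):
    """Return the top-k docs this user is allowed to see, by relevance."""
    visible = [(name, meta["score"]) for name, meta in documents.items()
               if user in meta["acl"]]
    visible.sort(key=lambda d: d[1], reverse=True)
    return [name for name, _ in visible[:k]]

# carol never sees the salary sheet, however relevant it scores:
print(retrieve_for("carol"))   # → ['public-faq.md']
```

The key design choice is that the ACL check happens inside retrieval, not as a post-hoc filter on the model's output: content the user cannot see simply never enters the prompt.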

The Rise of Enterprise AI Agents

The conversation shifts to perhaps the most forward-looking topic: how Box is implementing AI agents to transform enterprise workflows.

Ben describes an evolution in their approach to AI agents, from simple question-answering to complex workflow automation. One compelling example involves automating the RFP (Request for Proposal) response process — a traditionally labor-intensive task where teams of specialists spend hours copying and pasting content between systems.

“This is like my agents, or my agent, or the team of them… does that work for me,” Ben explains. “Never before in the history of enterprise software or any technology was it possible to do something as complex, but now agents, it’s not only possible, but sort of possible in a way that will meet some of the enterprise class ways.”

By combining document understanding, intelligent retrieval across repositories, automatic drafting, and content generation capabilities, Box is creating intelligent assistants that can handle tasks that previously required substantial human effort.

Hidden Gems

Beyond the headline-grabbing insights, the conversation revealed several less obvious but equally valuable takeaways:

The Critical Role of Embeddings in Enterprise AI: Ben calls embeddings “the unsung hero of the AI revolution, in particular for the enterprise.” While generative AI gets the spotlight, vector representations form the essential bridge between enterprise data and modern AI systems.

The ChatGPT Turning Point: When asked about the moment Box realized AI would transform their business, Ben identified late 2022 as the inflection point: “When ChatGPT came out and in particular when GPT-3 or 3.5 became available via API in a trustworthy way that was production class, that was the moment.” Prior to this, models showed “flashes of brilliance” but weren’t reliable enough for enterprise deployment.

The Founder-Led Innovation Advantage: In a fascinating tangent about organizational dynamics, Ben discusses how having a founder-CEO (Aaron Levie) who champions AI creates a unique advantage. “We try to maintain the idea of the best of both worlds of a startup-driven culture and a startup-driven mentality… while also maintaining the processes, the system” necessary for enterprise reliability.

Practical Takeaways

For Beginners

Start with a thoughtful implementation strategy rather than trying to vectorize everything. As Ben notes, “A lot of big companies… their first thing is, ‘Oh, I’ll just index all my data’… that company has 30 petabytes.” Instead, focus on high-value use cases where AI can solve specific problems, and expand from there.

For Intermediate Practitioners

Consider implementing a tiered embedding strategy that distinguishes between document-level and granular section-level embeddings. This approach allows you to balance costs with retrieval quality by first narrowing your search space using document-level embeddings before computing more detailed representations on a smaller subset.

For Advanced Users

Explore multi-step retrieval pipelines that combine traditional search methods, metadata filtering, and vector search. As your data grows, the combination of these approaches will provide better performance and cost metrics than relying solely on vector search. Consider implementing “warm storage” patterns similar to Weaviate’s flat index to manage the cost of vector storage at scale.
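The warm-storage idea can be sketched as a bounded in-memory cache in front of a disk-resident vector store, so hot vectors are served from RAM and cold ones are paged in on demand. This is a generic illustration of the pattern, not Weaviate's actual flat-index internals:

```python
# Warm storage for vectors: a small in-memory LRU cache in front of a
# slower disk-backed store. Generic illustration of the pattern only --
# not Weaviate's actual flat-index implementation.
from collections import OrderedDict

class WarmVectorStore:
    def __init__(self, disk_store, cache_size=1024):
        self.disk = disk_store            # id -> vector, "cold" tier
        self.cache = OrderedDict()        # id -> vector, "hot" tier
        self.cache_size = cache_size
        self.hits = self.misses = 0

    def get(self, vec_id):
        if vec_id in self.cache:
            self.cache.move_to_end(vec_id)   # mark as recently used
            self.hits += 1
            return self.cache[vec_id]
        self.misses += 1
        vec = self.disk[vec_id]              # stands in for a disk read
        self.cache[vec_id] = vec
        if len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)   # evict least recently used
        return vec

store = WarmVectorStore({"v1": [0.1, 0.2], "v2": [0.3, 0.4]}, cache_size=1)
store.get("v1"); store.get("v1")             # second read is a cache hit
print(store.hits, store.misses)              # → 1 1
```

The trade-off is the classic one: RAM for latency on frequently queried vectors, disk for cost on the long tail.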

Weaviate-Specific Relevance

The conversation highlights several areas where Weaviate’s architecture aligns perfectly with enterprise needs:

The discussion around “flat index and what we sometimes refer to as warm storage” directly connects to Weaviate’s storage architecture, which provides flexible options for balancing performance and cost as vector databases scale.

Box’s multi-tenant security challenges mirror the use cases Weaviate addresses with its robust RBAC (Role-Based Access Control) and multi-tenancy features, allowing organizations to implement sophisticated permission models on vector data.

The evolution from basic vector search to intelligent agents maps directly to Weaviate’s journey from vector database to the Query Agent, which understands collection structure and can orchestrate complex operations across multiple collections.

Looking Forward

As Ben notes in the closing segments, we’re entering an era where “it’ll just be normal to say things like, my team of agents are doing this work for me.” The companies that can implement these capabilities while maintaining enterprise-grade security, reliability, and cost efficiency will define the next generation of AI-powered business tools.

For organizations looking to follow Box’s lead in implementing AI at scale, the path forward requires balancing innovation with pragmatism. It means embracing cutting-edge technologies like Weaviate’s vector database while developing the architectural patterns necessary to make them viable at enterprise scale.

The conversation between these industry leaders makes one thing clear: we’re just beginning to understand what’s possible when we combine the power of vector embeddings, large language models, and thoughtfully designed systems that respect enterprise requirements.
