Hugr’s Data Mesh Architecture: Enabling Conversational AI for Scalable Enterprise Analytics
Senior Software Developer, First Line Software
Bridging the Gap Between Humans and Data
Imagine walking into your office tomorrow and asking your AI assistant:
“Show me how customer sentiment changed after our latest product update, and correlate it with churn in the past quarter.”
A few seconds later, your dashboard lights up. The AI has already connected to your internal databases, joined the relevant tables, analyzed trends, and summarized insights, all without SQL, manual queries, or BI reports. Just conversation.
This isn’t a scene from a sci-fi movie — it’s the emerging reality of “Talk to Data,” a new AI capability that allows people to interact with complex enterprise datasets as naturally as they talk to colleagues. And at the heart of this movement lies Hugr, an open-source data mesh platform that blends the rigor of data engineering with the intuition of language models.
Let’s look closely at one “Talk to Data” implementation, developed by AI engineer Vladimir Gribanov.
From Prototype to Platform
Hugr was open-sourced in May 2025 and has since gained traction among data engineers and AI developers for its practical, modular approach to enterprise analytics.
At its core, Hugr (https://hugr-lab.github.io) provides a unified GraphQL API that enables users to query and join data from multiple structured sources — databases, data lakes, and even REST APIs — without the need to manually manage connectors or transformations. Think of it as a Hasura-style API generator, but optimized for data analysis, geoanalytics, and vector search.
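To make the idea of a single GraphQL entry point more concrete, here is a minimal Python sketch of what a cross-source request could look like from the client side. The endpoint URL, the `orders` and `customer` names, and the `limit` argument are hypothetical and are not taken from Hugr's actual schema. Only the general GraphQL-over-HTTP shape (a JSON body with `query` and `variables`) is standard.

```python
import json
import urllib.request
from typing import Optional

# Hypothetical endpoint; a real deployment defines its own URL.
HUGR_URL = "https://hugr.example.internal/query"

# Hypothetical query: orders from one source, with the related customer
# resolved from another source inside the same request.
QUERY = """
query TopOrders($limit: Int!) {
  orders(limit: $limit) {
    id
    total
    customer {
      name
      region
    }
  }
}
"""


def build_request(query: str, variables: dict,
                  token: Optional[str] = None) -> urllib.request.Request:
    """Build a GraphQL-over-HTTP POST request without sending it."""
    body = json.dumps({"query": query, "variables": variables}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if token:
        # Bearer token, for example one issued by an OpenID Connect provider.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(HUGR_URL, data=body, headers=headers,
                                  method="POST")


def run_query(query: str, variables: dict, token: Optional[str] = None) -> dict:
    """Send the request and decode the JSON response."""
    with urllib.request.urlopen(build_request(query, variables, token)) as resp:
        return json.load(resp)
```

The point of the sketch is that the caller describes the shape of the result once, and the join between sources happens behind the API rather than in client code.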
The Data Mesh Approach
One of the central principles behind Hugr is data mesh architecture — an approach that decentralizes data ownership across business domains. In this model, each domain’s experts act as the “data product owners,” responsible for the quality and structure of their datasets. Hugr facilitates this by providing a modular, permission-aware architecture, allowing different teams to manage their data independently while exposing a unified query interface across the organization.
The system supports role-based access control via OpenID Connect, enabling secure and granular authorization for every query, introspection, and data operation.
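As a rough illustration of role-based authorization, the sketch below checks a request against a deny-by-default permission table. It is not Hugr's implementation: the `roles` claim name, the resource names, and the three action labels are assumptions, and the token is presumed to have been verified by an OpenID Connect library before its claims reach this code.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass(frozen=True)
class Permission:
    role: str
    resource: str  # hypothetical: a type or field exposed by the API
    action: str    # hypothetical labels: "query", "introspect", "mutate"


@dataclass
class AccessPolicy:
    granted: Set[Permission] = field(default_factory=set)

    def allow(self, role: str, resource: str, action: str) -> None:
        self.granted.add(Permission(role, resource, action))

    def is_allowed(self, claims: dict, resource: str, action: str) -> bool:
        # Deny by default: access needs an explicit grant for one of the roles.
        roles = claims.get("roles", [])
        return any(Permission(r, resource, action) in self.granted
                   for r in roles)


policy = AccessPolicy()
policy.allow("analyst", "orders", "query")

# Claims as they might look after token verification (illustrative values).
analyst_claims = {"sub": "user-1", "roles": ["analyst"]}
can_read = policy.is_allowed(analyst_claims, "orders", "query")
can_write = policy.is_allowed(analyst_claims, "orders", "mutate")
```

Keeping the check per resource and per action is what makes the authorization granular: the same role can read a dataset without being able to modify or even introspect it.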
Technical Foundation and Features
Hugr’s engine runs on DuckDB, an in-process analytical database engine that excels at handling complex analytical workloads. Its architecture supports:
- Multi-level caching — both in-memory and distributed (via Redis or Memcached)
- Integration with popular object storage services: AWS S3, Azure Blob Storage, Google Cloud Storage, Cloudflare R2, and MinIO
- Support for multiple data formats — CSV, Parquet, Excel, GeoJSON, and JSON
- Federated querying — enabling joins across heterogeneous data sources, including REST APIs
- Native PostgreSQL support, with pushdown of filters, joins, and aggregations
- Vector and semantic search — for AI-driven insights and contextual retrieval
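The multi-level caching item in the list above can be pictured as a read-through cache with two tiers. The following is a minimal sketch of that general pattern, not Hugr's code: `DictStore` is a hypothetical stand-in for a Redis or Memcached client, and the class and method names are made up for illustration.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class DictStore:
    """Hypothetical stand-in for a distributed cache client."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}

    def get(self, key: str) -> Optional[Any]:
        return self._data.get(key)

    def set(self, key: str, value: Any) -> None:
        self._data[key] = value


class TwoLevelCache:
    """Read-through cache: local memory first, then the shared store."""

    def __init__(self, remote: DictStore, ttl_seconds: float = 60.0) -> None:
        self._local: Dict[str, Tuple[float, Any]] = {}
        self._remote = remote
        self._ttl = ttl_seconds

    def get_or_load(self, key: str, loader: Callable[[], Any]) -> Any:
        now = time.monotonic()
        hit = self._local.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]                  # level 1: in-process memory
        value = self._remote.get(key)      # level 2: shared store
        if value is None:
            # Miss on both levels: run the expensive query once.
            # Note: a loader that returns None is never cached in this sketch.
            value = loader()
            self._remote.set(key, value)
        self._local[key] = (now + self._ttl, value)
        return value
```

A second process that shares the same remote store gets the result without re-running the query, which is the main benefit of adding a distributed tier behind the in-memory one.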
This blend of traditional data engineering with modern AI-oriented features positions Hugr as a bridge between enterprise data management and AI analytics.
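For readers less familiar with the AI-oriented side, vector search boils down to ranking stored embeddings by their similarity to a query embedding. The sketch below shows the idea with a brute-force cosine similarity scan in plain Python. The three-dimensional vectors and document names are toy values invented for illustration, and production engines typically rely on approximate nearest-neighbor indexes rather than a full scan.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float], index: Dict[str, Sequence[float]],
          k: int = 3) -> List[Tuple[str, float]]:
    """Return the k entries most similar to the query, best match first."""
    scored = [(doc_id, cosine_similarity(query, vec))
              for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy, hand-written embeddings; real ones come from an embedding model.
INDEX = {
    "billing_complaint": [0.9, 0.1, 0.0],
    "shipping_delay": [0.1, 0.9, 0.0],
    "feature_request": [0.0, 0.2, 0.9],
}

matches = top_k([1.0, 0.0, 0.0], INDEX, k=2)
```

Here `matches` ranks `billing_complaint` first, because its vector points in nearly the same direction as the query.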
“Talk to Data” – The Next Frontier
What truly sets Hugr apart is its “Talk to Data” concept — a conversational interface for querying structured data. Through the Model Context Protocol (MCP) server, Hugr exposes data schema introspection and query execution tools directly to LLMs. This allows AI systems like Anthropic’s Claude or OpenAI models to understand and interact with enterprise datasets via natural language.
Here’s how it works:
- The MCP server connects to a Hugr instance and loads its base schema.
- Using an LLM, it generates natural-language summaries of tables, fields, and relationships.
- When a user asks a question (e.g., “What are the top companies paying doctors in California?”), the model translates it into a GraphQL query, executes it via Hugr, and returns the analytical result.
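The three steps above can be sketched as a small pipeline. This is an illustrative outline under stated assumptions, not the MCP server's actual code: `TableInfo`, `summarize_schema`, and `answer_question` are hypothetical names, the schema summary is a plain template where the real system uses an LLM, and the `llm` and `execute` callables stand in for a model call and a Hugr query call.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class TableInfo:
    name: str
    fields: List[str]


def summarize_schema(tables: List[TableInfo]) -> str:
    """Step 2 stand-in: a template instead of an LLM-written summary."""
    return "\n".join(f"{t.name}: fields {', '.join(t.fields)}" for t in tables)


def answer_question(question: str,
                    tables: List[TableInfo],
                    llm: Callable[[str], str],
                    execute: Callable[[str], Any]) -> Dict[str, Any]:
    """Step 3: give the model schema context, then run the query it writes."""
    prompt = (
        f"Schema:\n{summarize_schema(tables)}\n\n"
        f"Question: {question}\n"
        "Reply with a single GraphQL query."
    )
    graphql = llm(prompt)
    return {"query": graphql, "result": execute(graphql)}


# Step 1 stand-in: a schema as it might look after introspection.
SCHEMA = [TableInfo("payments", ["company", "state", "amount"])]


def fake_llm(prompt: str) -> str:
    # Canned output with a hypothetical query shape; a real model call goes here.
    return '{ payments(state: "CA", limit: 3) { company amount } }'


def fake_execute(graphql: str) -> List[dict]:
    # Canned rows; a real implementation would send the query to the data API.
    return [{"company": "Example Pharma", "amount": 1000}]


answer = answer_question(
    "What are the top companies paying doctors in California?",
    SCHEMA, fake_llm, fake_execute,
)
```

Passing the model and the executor in as plain callables keeps the flow testable offline and makes it easy to swap one LLM provider for another.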
In demos, this system successfully analyzed synthetic healthcare data and open payment datasets, producing contextual summaries and visualizable results in seconds. The integration of AI-assisted querying with a real data engine illustrates the potential of natural language data exploration — a key milestone toward autonomous data intelligence.

Managed AI Services: The Future Built on Platforms Like Hugr
The rise of Managed AI Services marks the next evolution in enterprise technology — where organizations don’t just consume AI models, but operate data-aware AI ecosystems that understand and act on their own structured data.
Platforms like Hugr are the foundation layer for this transformation. By providing unified data access, governance, and vectorized search capabilities, Hugr can serve as the data backbone for managed AI systems that continuously learn, analyze, and optimize business operations.
Imagine an AI system built on Hugr that:
- Connects securely to enterprise data lakes and APIs to maintain a live, AI-ready semantic model of business data.
- Deploys specialized AI agents (sales, finance, operations) that query Hugr’s unified API through natural language.
- Monitors data usage and performance with built-in observability and access controls.
- Automatically generates insights, dashboards, and alerts — without manual prompt engineering or SQL scripting.
- Integrates with enterprise tools such as Power BI, Salesforce, or custom LLM applications.
Such a system could turn Hugr into the “data hub” for fully managed AI analytics environments — reducing friction between raw data and intelligent automation. Enterprises could deploy Hugr as a self-hosted service or consume it through a cloud-based managed offering, aligning with trends in AI governance, privacy, and compliance.
In this vision, Managed AI Services are no longer abstract — they are composable, transparent, and powered by data platforms like Hugr that bring structure and semantics to the AI layer.

Roadmap and Future Vision
Hugr is still in active development, with an ambitious roadmap that includes:
- Native Data Lake support for modern formats
- Pushdown optimization to delegate aggregations and joins to connected databases
- Streaming ingestion for real-time analytics
- Integration with BI tools via Microsoft OData REST protocols
- Enhanced session and result management for collaborative data analysis
- Multi-agent and offline modes for flexible, distributed AI workflows
The team also plans to extend the project’s ecosystem with a Python client, Spark integration, and agent-mode support for independent LLM-based analytics.
An Open Invitation
As an open-source initiative, Hugr welcomes contributors and collaborators passionate about data, AI, and distributed systems. The project is already proving how AI and data engineering can converge to create systems that “understand” enterprise data — not just store it.
“The hardest part isn’t querying data. It’s understanding how that data is structured and what it means. With Hugr, we’re teaching AI to do exactly that,” said AI engineer Vladimir Gribanov.
Key Takeaway
Hugr’s Talk-to-Data approach redefines enterprise analytics by merging GraphQL-based data federation with AI-driven natural language interaction. Combined with the emerging layer of Managed AI Services, it paints a vision of enterprises where AI doesn’t just analyze data — it lives in it.
Join the movement that’s redefining how humans and machines collaborate over data.
November 2025