DES is a property of AIM Media House.

SCHEDULE

MAY 14-15, 2026
Hotel Radisson Blu, ORR, Bengaluru

We are finalizing the schedule for 2026. Please check back soon.

More than 60 speakers are expected at DES 2026. To explore speaking opportunities with DES, write to info@aim.media

  • Day 1 | HALL 2 - Practical Insights and Best Practices


  • How L&T Finance is strengthening its data foundation to support scalable, reliable AI use cases—focusing on data quality, governance, and real-time accessibility.

  • This talk introduces hybrid search with Apache Doris as a next-generation retrieval solution for generative AI and context engineering. It addresses the semantic confusion problem by combining vector search, full-text search, and SQL, pairing exact-match intent with semantic similarity for a more accurate and cost-effective solution. Matt will then touch on how the native real-time capability that Apache Doris brings to OLAP can be extended to real-time RAG, helping organizations think about future challenges in this space.

  • RisingWave is redefining real-time data processing with its next-gen streaming database architecture. This session explores how organizations can move beyond batch systems to build low-latency, scalable pipelines that power instant insights while reducing the operational complexity of traditional stream processing stacks.

  • Traditional data lakes often struggle to deliver reliable and actionable insights at scale. This session shares how redBus evolved to a modern lakehouse architecture, built a unified knowledge layer for context and governance, and enabled AI-powered, natural language data access—transforming its data platform into a real-time, decision-centric system.

  • Traditional data pipelines were designed to move and transform data efficiently—but in today’s AI-driven world, that is no longer enough. This session explores the evolution from DataOps to AI-native pipelines, where data systems are not just enablers but active participants in intelligence generation. We will dive into how modern architectures integrate real-time data processing, feature engineering, model deployment, and feedback loops into a unified AI pipeline. Attendees will gain insights into designing scalable, reliable, and cost-efficient systems using technologies like Spark, Kafka, and cloud-native platforms.

  • In high-scale platform ecosystems, where a single backend powers diverse business verticals, the traditional boundaries between "Application Engineering" and "Data Engineering" are dissolving. When processing millions of transactions daily, data integrity cannot be a post-facto batch process; it must be an architectural first principle. This session explores the "Truth Platform" pattern, a blueprint for building near real-time (NRT) reconciliation and observability into the core of a multi-tenant business platform. We will move away from the "data as exhaust" mindset and dive into three transformative engineering patterns: Cohorted Observability, Recon as a Contract, and the Handshake Pattern.

  • Day 1 | Main Hall - Thought Leadership and Strategic Insights


  • With tool sprawl, hidden costs, and integration nightmares, is the MDS era over? The panel will debate the rise of unified platforms vs. best-of-breed, and what the next architecture era looks like.

  • As organizations rush to layer AI on top of their data warehouses, most are discovering an uncomfortable truth. AI-driven analytics is only as trustworthy as the metadata underneath it. Without lineage, ownership, and semantic context, even the best models confidently produce wrong answers, and no one can tell until a dashboard lies to an executive. In this talk, I'll share how we built a data catalog that serves as the backbone of our AI-driven analysis infrastructure. We'll walk through the pillars that make it work: end-to-end lineage from source systems to dashboards, blast-radius analysis that prevents silent breakage, documentation and accountability that AI agents can actually reason over, and a metrics layer that turns all of this into reliable, explainable answers. You'll leave with a concrete blueprint for why catalogs aren't documentation tools, they're infrastructure, and why getting this foundation right is the difference between AI that hallucinates and AI that works.

  • In this session, we’ll explore how large analytics enterprises can evolve from traditional data systems to AI-driven, modular architectures. The talk introduces five engineering pillars—Data Foundations, AI Agent Building (including persona-based “ask” capabilities), Agent Catalog & Monitoring, Internal Agent Marketplace, and Business Integration. Attendees will learn how data engineers can design semantic layers, build reusable AI agents for merchandising and customer analytics, and integrate them into live business workflows. The session highlights practical strategies for measuring impact, optimizing AI performance, and creating scalable solutions that drive measurable business outcomes.

  • Creating a single source of truth for customer data is no longer optional. This session will dive into the strategies, architectures, and governance models needed to integrate fragmented data systems into a unified ecosystem that powers seamless experiences and data-driven growth.

  • What does data governance look like when your biggest data consumer doesn't sleep, doesn't follow SOPs, and makes thousands of decisions every second? When a majority of your users bypass official AI tools entirely? When a single ungoverned prompt can leak training data, customer PII, or regulated information across your entire enterprise? These aren't hypothetical questions — they're the operational reality of GenAI in 2026. And the governance frameworks most enterprises rely on were built for an era when data consumers were human, deterministic, and slow. This talk presents DataGovOps: governance as code, dynamic access policies, AI-assisted classification, and end-to-end lineage from source through agent output. We'll explore why legacy governance fails under agentic workloads, how to architect consent and access as programmable policy, the recursion problem of governing AI with AI, and practical patterns for staying compliant under regimes like GDPR & DPDPA — without strangling the velocity that makes AI worth deploying in the first place.

  • Day 2 | HALL 2 - Practical Insights and Best Practices


  • Modern data stacks often promise a simple idea: a single query layer that can unify access across databases, data lakes, and SaaS sources. Engines like Trino make this vision technically feasible—allowing teams to query MySQL, MongoDB, S3-based lakehouses, and even spreadsheets through one interface. This talk shares a practitioner’s perspective on what it actually takes to make that promise work inside a regulated fintech environment, where data residency, auditability, and access control are not optional. Drawing from real-world experience building a federated query layer over heterogeneous systems, it explores the gap between architectural elegance and operational reality. The session will highlight challenges that emerge at scale: inconsistent semantics across sources, unpredictable query performance, governance complexities with role-based access, and the risk of turning federation into “distributed chaos.” It will also cover the practical guardrails required to make such a system usable—data contracts, curated layers, controlled abstractions, and user-focused enhancements like custom functions. Rather than presenting federation as a silver bullet, this talk reframes it as a powerful but disciplined capability. Attendees will leave with a clearer understanding of when a unified query layer accelerates data access—and when it quietly amplifies complexity.

  • In a world of always-on connectivity, telecom systems generate massive streams of real-time data every second. This session explores how modern data platforms process, analyze, and act on this data instantly, powering everything from network optimization to personalized customer experiences at scale.

  • As data ecosystems evolve, building and operating reliable, maintainable, and scalable data pipelines becomes increasingly complex. This session introduces a modern shift in data engineering: a zero-code ETL platform where users define what the pipeline should do, and data engineers define how the platform should handle its execution at scale. It abstracts pipeline complexities behind an intuitive UI and standardised configurations. We extend this architecture with an LLM-powered segmentation layer on top of the data warehouse, turning raw data into actionable insights. It converts high-level user intent into SQL queries and downstream pipelines, allowing business users to run experiments on their own, without depending on engineering teams or facing bottlenecks. Zero-code ETL: users see magic, engineers maintain the illusion.

  • As organizations transition to AI-first models, building scalable and reliable data pipelines becomes critical. This session explores how real-time data platforms enable faster decision-making, seamless data flow, and robust AI deployment at scale.

  • Most startups bolt on privacy later. This session covers designing systems where PII is isolated, tokenised, or abstracted early, including schema design, access control layers, and service boundaries.

  • DataOps gave us insights. DecisionOps makes them execute. The goal has shifted from data to decisions, and AI-native platforms are accelerating decision-making across the value chain. Hear how that is advancing clinical research, transporting life-saving medicines on time, and saving lives!

  • Day 2 | Main Hall - Thought Leadership and Strategic Insights


  • Many data platforms fail to deliver impact due to fragmented data and a lack of clear business alignment. This session focuses on what truly moves the needle: use-case-driven strategies, better data quality, and integrated ecosystems that deliver real outcomes.

  • Internal data platforms are becoming a core competency. The panel will explore how data engineering is borrowing from DevOps and platform engineering (self-serve infrastructure, developer portals, golden paths) and whether it's working in practice.

  • In the era of AI-driven decision-making, data quality has become more critical than ever. This session explores how poor-quality data can amplify errors at scale, leading to flawed insights and risky outcomes. It will highlight the importance of robust data governance, validation frameworks, and continuous monitoring to ensure AI systems are built on reliable, high-quality data.

  • What if your data pipeline could generate user intelligence before a single real user touches your product? This session introduces Synthetic Users: AI agent swarms that simulate real human behaviour at scale, and the data pipeline architecture that transforms their interactions into actionable product intelligence in hours, not months.

  • As AI rapidly reshapes how organizations build intelligence, this session explores what it truly means to go beyond the algorithm. Drawing from real enterprise experience at one of India's most iconic consumer brands, the talk traces the evolution from classical analytics and ML to multimodal GenAI and agent-driven decision systems. Using the end-to-end product lifecycle as a lens, from design and manufacturing to go-to-market and continuous intelligence, it demonstrates how AI delivers impact only when grounded in strong data foundations, business context, and governance. The session also reflects on the evolution of data scientists into AI orchestrators, highlighting why systems thinking, judgement, and business storytelling matter more than ever in the age of AI.

  • As audio data grows exponentially, building scalable storage systems becomes critical for efficient search and retrieval. This session will explore architectures and strategies to handle high-volume audio data while ensuring performance, reliability, and cost efficiency.


Our Pricing will change soon!

  • Standard Pass

    Regular passes expire on 24th Apr 2026
  • All-access, 2-day pass (workshops not included)
  • Group Discount available
  • 15000
  • VIP Pass

    Everything in Workshop, plus:
  • Dedicated WhatsApp support (before, during, and after the show)
  • VIP check-in
  • Exclusive Platinum Lounge Access - A lounge for VIP pass holders and Speakers only!
  • Priority Lunch area
  • Post-event recordings
  • Goodies bag with Exclusive Merchandise
  • 1-year digital subscription to AIM
  • 25000

1000+

Attendees

50+

Speakers

5th

Edition

Explore the Frontiers of Data Engineering.

Focused on data engineering innovation, this two-day conference gives attendees direct access to top leaders and innovators from leading tech companies, who will discuss the deployment architecture of AI systems and the latest data frameworks and solutions for business use cases.


The Finkelstein Awards for Data Engineering Excellence 2026

Secure Your Seat at the Frontier of Data + Engineering.

MAY 14-15, 2026
Bengaluru

Regular passes expire in a week!

Prices increase from 24th Apr.