
SCHEDULE

May 15-16, 2025
Hotel Radisson Blu, Bengaluru

We are in the process of finalizing the 2025 schedule. Please check this space again soon.

More than 60 speakers are expected at DES 2025. To explore speaking opportunities with DES, write to info@aimmediahouse.com.

  • Day 1 | Hall 2 - Practical Insights and Best Practices


  • As data-driven operations scale, data modeling has become increasingly complex and fast-paced, often clashing with the demands of agile delivery. This session explores how Generative AI is tackling long-standing challenges such as source data understanding and gaps in data catalogs. It highlights how GenAI is automating tasks like data profiling, metadata generation, schema design, and mapping, unlocking new levels of speed and accuracy. Real-world examples from Tiger Analytics illustrate the impact, along with a perspective on the evolving role of GenAI as a co-pilot in building scalable, business-aligned data architectures.

  • Data engineering and technical product capabilities at AMEX that empower critical business and risk decisioning.

  • Data engineers play a pivotal role in driving contextualized intelligence. They must ensure that data is enriched with the proper context to deliver meaningful insights. This tech talk will explore the practical steps and tools data engineers can use to build a contextual layer, ensuring AI models can deliver strategic insights based on relevant business contexts.

  • In this talk, I will dive into a real-time data pipeline solution I worked on for the Travel & Hospitality domain, powered by Apache Flink Stateful Functions. This solution efficiently handled over 800,000 streaming events per day, addressing critical use cases. I will share insights into the challenges faced, the strategies used to overcome them, and how we successfully scaled the pipeline to meet demanding business requirements. For the second part of the session, I will explore the hot topic: Will Low-Code ETL Replace Data Engineers? I firmly believe that Low-Code ETL is not here to replace data engineers but to accelerate the development process. However, for this to succeed, ETL platforms must empower data engineers with the freedom and control to customize and adapt the platform to their specific needs. I’ll discuss examples like Prophecy, a Low-Code ETL platform that strikes a balance by acting as an IDE, generating open-source code, and keeping implementation transparent. This approach ensures that while development is expedited, the power remains firmly in the hands of data engineers.
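
The session's solution was built on Apache Flink Stateful Functions; as a rough flavor of the keyed, stateful stream processing pattern involved, here is a minimal sketch using the PyFlink DataStream API as a stand-in. The event fields and per-user counting logic are illustrative assumptions, not the speaker's implementation.

```python
# Minimal keyed, stateful stream-processing sketch in PyFlink.
# Stand-in for the Flink Stateful Functions design described in the talk;
# event shape and logic are hypothetical.
from pyflink.common import Types
from pyflink.datastream import StreamExecutionEnvironment
from pyflink.datastream.functions import KeyedProcessFunction, RuntimeContext
from pyflink.datastream.state import ValueStateDescriptor


class EventCounter(KeyedProcessFunction):
    """Counts events per user key, keeping the count in managed state."""

    def open(self, runtime_context: RuntimeContext):
        self.count = runtime_context.get_state(
            ValueStateDescriptor("count", Types.LONG()))

    def process_element(self, value, ctx):
        current = (self.count.value() or 0) + 1
        self.count.update(current)
        yield value[0], current  # (user_id, events seen so far)


env = StreamExecutionEnvironment.get_execution_environment()
# In production the source would be Kafka or another broker, not a collection.
events = env.from_collection(
    [("u1", "search"), ("u1", "book"), ("u2", "search")],
    type_info=Types.TUPLE([Types.STRING(), Types.STRING()]))
(events
 .key_by(lambda e: e[0])
 .process(EventCounter(),
          output_type=Types.TUPLE([Types.STRING(), Types.LONG()]))
 .print())
env.execute("travel-events-sketch")
```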

  • In a world overflowing with data, the true challenge lies not in access—but in articulation. This session explores the craft of transforming raw, noisy data into compelling, credible, and impactful narratives. From choosing the right metrics to visual storytelling and context-driven publishing, we’ll unpack techniques that help data professionals elevate insight over information. Whether you're building dashboards, whitepapers, or public datasets, discover how to publish data that informs, engages, and drives action.

  • AI is reshaping the landscape of data engineering in transformative ways. One key trend is the rise of AI-powered data pipelines, where machine learning automates tasks like data cleaning, transformation, and feature engineering, significantly accelerating development cycles. Alongside this, AI-driven data observability tools are improving data reliability by detecting anomalies and quality issues in real time. Intelligent data catalogs are also evolving, using AI to automatically generate metadata, track lineage, and simplify data discovery. Natural language interfaces, such as Text-to-SQL systems powered by large language models, are making it easier for non-technical users to query databases using plain English. ETL and ELT processes are becoming smarter too, with AI automating schema mapping and transformation logic, reducing the need for manual coding. In parallel, synthetic data generation is gaining traction, enabling the creation of realistic datasets for testing, training, and privacy-sensitive use cases. Real-time data processing is becoming more intelligent as AI models integrate directly into streaming frameworks like Kafka and Spark, supporting use cases like fraud detection and personalization. Governance is also being augmented by AI, with tools now capable of automatically detecting sensitive data like PII and enforcing compliance with regulations.
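
To make the Text-to-SQL trend concrete, here is a minimal sketch of such a layer. The `complete` argument stands in for any LLM completion callable (hosted or local); the schema, prompt, and guardrail are illustrative assumptions, not a production design.

```python
# Sketch of a Text-to-SQL layer over an LLM. `complete` is a placeholder
# for any completion callable; schema and guardrail are hypothetical.
SCHEMA = """
CREATE TABLE orders (
    order_id   BIGINT,
    customer   TEXT,
    amount_usd NUMERIC,
    placed_at  TIMESTAMP
);
"""

PROMPT = (
    "You translate questions into SQL for the schema below.\n"
    "Return a single read-only SELECT statement and nothing else.\n"
    f"Schema:\n{SCHEMA}\n"
    "Question: {question}\nSQL:"
)


def text_to_sql(question: str, complete) -> str:
    sql = complete(PROMPT.format(question=question)).strip()
    # Cheap guardrail: refuse anything that is not a plain SELECT.
    if not sql.lower().startswith("select"):
        raise ValueError(f"refusing non-SELECT statement: {sql!r}")
    return sql

# Usage: text_to_sql("total order value last week?", complete=my_llm_client)
```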

  • In his session, Koushik will reveal how organizations can transform from passive data collectors to strategic data orchestrators, unlocking unprecedented business growth and competitive advantage. Drawing on proven frameworks adopted by industry-leading financial institutions and companies, he will outline the foundational pillars of a high-impact data strategy—starting with organizational alignment, where data teams are integrated into business units and held accountable for P&L outcomes. Koushik will also highlight the role of experimentation, scalable infrastructure, and predictive analytics in optimizing every stage of the customer journey, as well as common pitfalls like misaligned KPIs, siloed teams, and resistance to change—offering practical solutions to overcome them. The session's overarching message: winning with data is not about tools alone; it's about aligning people, processes, and purpose.

  • The Unified Data platform is designed to address key limitations in data management for machine learning (ML) model development, training, and testing. Its primary focus is to improve turnaround time (TAT), enhance experimentation velocity, and provide a centralized solution for data needs.

  • Discover the engineering innovations behind KnowBe4's agentic platform, which utilizes Data and Generative AI to empower cybersecurity defense. Learn how KnowBe4 developed AIDA (Artificial Intelligence Defense Agents) to significantly reduce administrative overhead for customers by automating the configuration of KnowBe4 products. AIDA intelligently understands specific customer requirements based on user behavior, leading to automated risk reduction. This session will delve into the cutting-edge technologies that power this feature, including the latest advancements in Artificial Intelligence, Datamesh architecture, and big data engineering tools. We will explore how these elements are integrated to deliver current GenAI capabilities and lay the foundation for future intelligent features aimed at proactively mitigating human-related cyber risks.

  • Building a future-forward health insurance data platform focuses on leveraging emerging technologies like AI, machine learning, and blockchain to revolutionize data management in the insurance sector. By integrating diverse data sources such as medical records, wearable devices, and claims data, the platform enables personalized policies, faster claims processing, and enhanced risk management. This approach not only increases operational efficiency but also improves customer satisfaction by providing tailored coverage and real-time insights.

  • Day 1 | Main Hall - Thought Leadership and Strategic Insights


  • Over the last couple of years, the capabilities of large language models (LLMs) and Generative AI have evolved at a breakneck pace, with enterprise adoption accelerating rapidly. These technologies are not only driving automation but are fundamentally challenging traditional approaches to how data is engineered, managed, and consumed. We are entering a new paradigm—one where emerging ways of working are redefining what’s possible across the enterprise data ecosystem. Join us for this Keynote as we explore what this transformation means for data engineers and business leaders. We'll examine real-world use cases, the evolving skill sets needed to thrive, and how teams can adapt to stay ahead in a rapidly shifting landscape.

  • In an era where financial decisions hinge on millisecond precision, delivering real-time price streams to end-users is no longer a luxury — it's a necessity. This talk explores modern best practices in building low-latency, highly reliable real-time data pipelines using technologies like WebSockets and NATS. We’ll dive into the architectural patterns that enable microservices to stream live pricing data seamlessly, and how tracking end-to-end latencies — from client browsers to backend systems — is both crucial and challenging. Learn how these real-time streams not only power live charts and dashboards but also feed AI systems that detect anomalies, forecast trends, and trigger intelligent actions. If you're building platforms where speed, scale, and accuracy are non-negotiable, this session will offer practical insights and proven techniques to get it right.
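
As a flavor of the pattern, here is a minimal subscriber sketch using the nats-py client that measures publisher-to-client latency. It assumes a hypothetical publisher that stamps each tick with a `published_at` epoch-seconds field; subject names and payload fields are illustrative, not the speakers' system.

```python
# Sketch of a NATS subscriber that tracks end-to-end price-tick latency.
# Assumes the (hypothetical) publisher adds a `published_at` timestamp.
import asyncio
import json
import time

import nats  # nats-py client


async def main():
    nc = await nats.connect("nats://localhost:4222")

    async def on_tick(msg):
        tick = json.loads(msg.data)
        latency_ms = (time.time() - tick["published_at"]) * 1000.0
        print(f"{msg.subject}: {tick['price']} ({latency_ms:.1f} ms end-to-end)")

    # Wildcard subscription: one handler for every instrument.
    await nc.subscribe("prices.>", cb=on_tick)
    await asyncio.sleep(60)  # consume for a minute, then drain
    await nc.drain()


asyncio.run(main())
```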

  • In today's data-driven world, context is everything. Without context, raw data can often lead to inaccurate insights, misinterpretations, and missed opportunities. For data engineers, this means building platforms that enrich data with relevant business context and provide strategic insights. Join this session to explore how modern, context-centric platforms help businesses stay ahead in the data-driven landscape.

  • As data volumes surge, traditional data engineering struggles with complexity, cost, and agility. This keynote introduces a paradigm shift: agentic systems — AI-native agents that autonomously plan, adapt, and collaborate across the data lifecycle. These systems transform ingestion, quality, pipeline design, and observability, freeing engineers to focus on design and value, not manual tasks. We explore today’s DataOps limitations, define agentic capabilities, showcase real-world use cases, and examine shifts in skills, teams, and ethics. For CDOs, platform leaders, and architects, this session offers a bold vision: Data Engineering reimagined through intelligent agents that deliver faster, smarter, and more scalable business outcomes.

  • The data landscape is transforming with generative AI driving innovations in pipeline architecture, real-time stream processing, and modern ETL practices. Emphasis on data trust, quality, and scalable monetization, alongside advancements in schema design and access layers, is unlocking new opportunities and efficiencies for businesses.

  • Your CEO has demanded that you leverage Generative AI technologies to lower costs and find even more insights in your existing data sources. Every webinar you attend and every blog post you read worries you about problems such as hallucinations and gives you nightmares as you think about security, privacy, compliance, and governance. You imagine your entire workforce needs to be reshaped, and that all of the time, effort, and money you've spent up until now has netted you a pile of technical debt. Worse yet, because senior leadership sees Gen AI as a bit of "magic", you don't have clearly defined and measurable business objectives. You are surely going to fail! Time to take a deep breath... If there is hope for successfully adopting Gen AI technologies into your enterprise, it is in the data platforms you already have (or that are, at least, already available). This talk explains that the foundational problems of Gen AI applicability for enterprises center on data access, collaboration, and governance. Data platforms that tackle these concerns exist today and are likely already deployed, or upgradeable, in your IT arsenal. No need to throw out what you have already built; it is time to architect around your solid foundation as you charge bravely into a new adventure.

  • AI-driven automation is rapidly transforming data engineering workflows, from automated pipeline generation to self-healing data architectures. But does this mean traditional data engineering roles will become obsolete? This panel will debate whether AI is an enabler or a disruptor, discussing where human expertise remains irreplaceable and how engineers can adapt to stay relevant in the AI-powered future.

  • The current technological landscape underscores a significant convergence, where functionalities traditionally associated with separate domains of data management and AI are now unified within intelligent data platforms. This integration marks a fundamental shift in perspective, moving away from the concept of AI as merely an application layer to AI-integrated data infrastructure. This amalgamation facilitates a more streamlined and effective utilization of data for AI initiatives, quickens the pace of iteration in model development, and ensures a tighter alignment between an organization's overall data strategy and AI vision. Furthermore, the emphasis placed on the delivery of "actionable insights" and the facilitation of "powerful decision-making" highlights the primary business-oriented purpose of these platforms. The evolution of data platforms reflects an imperative to not only manage the exponentially growing data volume but also derive intelligence that can inform and drive strategic decisions. Intelligent data platforms are engineered explicitly with this objective at their core, leveraging AI to automate insight and offer recommendations for accelerated organizational innovation.

  • As organizations race to adopt Generative AI, a strong and scalable data foundation is essential. This session explores how building a FAIR-aligned Data Marketplace has transformed data access, quality, and governance, paving the way for AI-driven innovation. Learn how a persona-driven, self-service platform powered by Snowflake, Power BI, and advanced metadata cataloging accelerated decision-making, eliminated redundancy, and boosted adoption by 25%. Key innovations include a real-time Data Quality Index, seamless integration for both technical and non-technical users, and smart search capabilities using LLMs. Discover how strategic leadership, automation, and a culture of collaboration are driving enterprise-wide data democratization, preparing the organization to fully harness the power of Generative AI.

  • In modern systems, real-time insights aren't a luxury—they're a requirement. Whether you're debugging distributed systems, tracking financial transactions, or analyzing user behavior, sub-second query latency can be the difference between reacting and proactively optimizing. This talk dives into the technical foundations that make real-time analytics possible at scale, using ClickHouse as the case study. We'll explore the architectural underpinnings of its high-performance columnar engine, including vectorized execution, late materialization, and how it handles time series and semi-structured data like JSON with minimal overhead. Through real-world use cases—from high-throughput log ingestion in observability stacks, to petabyte-scale analytics for adtech, fintech, and user personalization—you’ll see how engineering teams are designing for low latency without sacrificing flexibility or scale. If you're a data engineer, backend developer, or platform architect working with high-velocity data, this session will give you a deeper understanding of how to build infrastructure that can keep up.
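
For a concrete feel of the log-ingestion pattern the session covers, here is a minimal sketch using the clickhouse-driver package. The table layout, sort key, and query are illustrative assumptions, not the speakers' schema.

```python
# Sketch of ClickHouse log analytics: MergeTree table sorted by
# (service, ts) plus a per-minute error aggregation.
from clickhouse_driver import Client

client = Client(host="localhost")

client.execute("""
    CREATE TABLE IF NOT EXISTS app_logs (
        ts      DateTime,
        service LowCardinality(String),
        level   LowCardinality(String),
        message String
    )
    ENGINE = MergeTree
    ORDER BY (service, ts)
""")

# Columnar storage plus the (service, ts) sort key let this aggregation
# scan only the columns and granules it needs.
rows = client.execute("""
    SELECT service,
           toStartOfMinute(ts) AS minute,
           countIf(level = 'ERROR') AS errors
    FROM app_logs
    WHERE ts > now() - INTERVAL 1 HOUR
    GROUP BY service, minute
    ORDER BY minute
""")
```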

  • As AI evolves towards autonomous agents, integrating real-time data becomes crucial. This session explores how StarTree’s support for the Model Context Protocol (MCP) and native vector auto-embedding empowers AI agents with live, structured data access. Attendees will gain insights into building scalable pipelines that combine streaming ingestion (Kafka), real-time analytics (Apache Pinot), and AI models. A live demonstration will showcase the integration of Kafka → Pinot → AI agent, highlighting the architecture that enables real-time decision-making with contextually relevant information. This talk emphasizes practical takeaways for architecting systems that seamlessly blend AI and analytics technologies.
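
As a sketch of the query side of such a Kafka → Pinot → agent pipeline, here is a tool function an AI agent might call to ground its answer in live data, using the pinotdb DB-API client. Table and column names are hypothetical, and the StarTree MCP and auto-embedding specifics from the talk are not reproduced here.

```python
# Hypothetical agent "tool" that fetches fresh aggregates from Pinot.
from pinotdb import connect


def recent_activity(user_id: str) -> list:
    """Return the last 15 minutes of event counts for one user."""
    if not user_id.isalnum():
        raise ValueError("unexpected user id")  # naive injection guard
    conn = connect(host="localhost", port=8099, path="/query/sql", scheme="http")
    cur = conn.cursor()
    cur.execute(f"""
        SELECT eventType, COUNT(*) AS n
        FROM user_events
        WHERE userId = '{user_id}'
          AND eventTime > ago('PT15M')
        GROUP BY eventType
        ORDER BY n DESC
    """)
    return cur.fetchall()
```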

  • Day 2 | Hall 2 - Practical Insights and Best Practices


  • In the era of Responsible AI, the ability to build scalable and efficient data engineering pipelines is critical—especially in data-sensitive domains like Insurtech. This session will explore how to architect robust pipelines that handle personalized data at scale while ensuring compliance, transparency, and fairness. We’ll delve into practical approaches for integrating Responsible AI principles into data workflows, tackling challenges like data privacy, bias mitigation, and auditability. Attendees will gain insights into real-world use cases and design patterns that enable secure, scalable, and ethical AI applications in the insurance space.

  • Discover how redBus built a smart data platform component that transforms raw user interactions into queryable Parquet files on S3. Learn how its auto-schema detection and evolution capabilities enable efficient debugging, analytics, and model training at scale.
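
In the spirit of that platform component (not redBus's actual code), here is a minimal sketch of schema-inferred Parquet landing on S3 with pyarrow. The bucket, partition path, and event fields are hypothetical.

```python
# Sketch: infer a schema from raw event dicts, reconcile it with the
# previously stored schema, and land the batch as Parquet on S3.
import pyarrow as pa
import pyarrow.parquet as pq
from pyarrow import fs

events = [
    {"user_id": "u1", "action": "search", "ts": 1715760000},
    {"user_id": "u2", "action": "select_seat", "ts": 1715760004},
]

# Schema is inferred from the raw dicts -- no hand-written DDL.
table = pa.Table.from_pylist(events)

# Schema evolution: widen the stored schema when new fields appear.
stored = pa.schema([("user_id", pa.string()), ("action", pa.string())])
evolved = pa.unify_schemas([stored, table.schema])

s3 = fs.S3FileSystem(region="ap-south-1")
pq.write_table(table, "example-bucket/events/dt=2025-05-15/part-0.parquet",
               filesystem=s3)
```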

  • In today’s data-driven financial ecosystem, speed, scale, and accuracy are non-negotiable. This session explores the key principles and real-world practices behind building scalable data pipelines that power meaningful financial insights. From ingesting high-volume transactional data to ensuring real-time processing and analytics, we'll dive into architectural patterns, technology choices, and performance considerations that enable organizations to unlock value from their data. Whether you're modernizing legacy systems or designing from the ground up, this talk will equip you with practical takeaways to build resilient, future-ready data pipelines tailored for the dynamic demands of the financial domain.

  • In today’s data-driven world, ensuring resilience in data architectures is crucial for maintaining seamless operations, minimizing downtime, and enabling real-time decision-making. This session explores the key principles of designing fault-tolerant, scalable, and adaptive data ecosystems that can withstand failures and disruptions. Key focus areas include strategies for high availability, redundancy, disaster recovery planning, and leveraging distributed computing and cloud-native technologies to enhance resilience. Real-world insights will showcase how resilient data architectures can optimize performance, improve reliability, and support business continuity in an evolving technological landscape.

  • In this session, Prakash Patidar will explore the architecture of next-gen data platforms that are unified, reliable, scalable, and AI-driven. He will discuss how real-time data processing and AI technologies are transforming data systems, enabling smarter decision-making at scale. With a focus on automation, data quality, and real-time insights, this talk will showcase how businesses can leverage AI to unlock the true potential of their data, especially in the fast-evolving world of crypto and DeFi.

  • Real-time data pipelines enable organizations to ingest, process, and act on streaming data with minimal latency, ensuring they can make informed decisions instantly. This session will explore the core components of real-time data pipelines, including event-driven architectures, stream processing frameworks, and scalable data ingestion technologies. We’ll discuss real-world applications such as fraud detection, predictive analytics, IoT monitoring, and personalized customer experiences. By the end of the session, attendees will gain a deeper understanding of how real-time data pipelines drive agility, enhance operational efficiency, and unlock new business opportunities. Whether you're a data engineer, architect, or business leader, this talk will equip you with the knowledge to harness the power of real-time data for smarter decision-making.
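
To make the event-driven ingestion pattern concrete, here is a minimal consumer sketch for the fraud-detection use case, using kafka-python. The topic name, payload fields, and flat threshold rule are illustrative assumptions; a real system would score with a model.

```python
# Minimal event-driven consumer sketch: flag suspicious transactions
# as they stream in. Topic, fields, and rule are hypothetical.
import json

from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "transactions",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    auto_offset_reset="latest",
)

for record in consumer:
    txn = record.value
    # Stand-in for a fraud model: a simple amount threshold.
    if txn["amount"] > 10_000:
        print(f"flag for review: {txn['id']} amount={txn['amount']}")
```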

  • This session outlines strategic data engineering practices that drive innovation and business growth, emphasizing scalability, data quality, and operational efficiency. It highlights how modern data platforms, automation, and governance empower data-driven decision-making across the organization.

  • As organizations increasingly rely on complex data ecosystems, maintaining data quality and trust has become paramount to business success. This discussion explores the critical components and best practices for implementing robust data quality frameworks within modern data architectures. We examine how organizations can establish automated testing, monitoring, and governance processes to ensure data reliability across the entire pipeline - from ingestion to consumption. Special attention is given to emerging tools and methodologies that enable continuous data quality assessment while scaling with growing data volumes and complexity.
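
As one concrete shape such automated testing can take, here is a minimal sketch of declarative data-quality rules run as a gate before data is promoted downstream. Column names and rules are illustrative assumptions.

```python
# Sketch of a declarative data-quality gate: named rules evaluated
# against a batch before it is promoted downstream.
import pandas as pd

CHECKS = {
    "no_null_ids":     lambda df: df["order_id"].notna().all(),
    "positive_amount": lambda df: (df["amount_usd"] > 0).all(),
    "unique_ids":      lambda df: df["order_id"].is_unique,
}


def run_checks(df: pd.DataFrame) -> dict:
    """Run every rule; any failing rule should block promotion."""
    return {name: bool(rule(df)) for name, rule in CHECKS.items()}


df = pd.DataFrame({"order_id": [1, 2, 3], "amount_usd": [10.0, 5.5, 99.0]})
assert all(run_checks(df).values()), "data quality gate failed"
```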

  • Day 2 | Main Hall - Thought Leadership and Strategic Insights


  • In an era where data is abundant but insight is scarce, the true challenge lies not just in collecting information, but in empowering it to act. "Making Data Think" explores how organizations can move beyond passive dashboards to intelligent systems that reason, adapt, and predict. This session delves into the intersection of data engineering, AI, and decision intelligence — showcasing how contextual awareness, machine learning, and real-time feedback loops are transforming raw data into active, thinking partners in innovation. Whether you're a data leader or a curious technologist, this talk will leave you reimagining the very role of data in the enterprise.

  • This session explores how AI agents are transforming data engineering by automating complex workflows such as data ingestion, transformation, and pipeline orchestration. With real-time analytics and intelligent decision-making becoming critical, AI-driven automation is enabling greater efficiency, scalability, and accuracy in data processes. From automated anomaly detection to schema evolution and performance optimization, discover practical use cases that showcase the power of AI in simplifying and future-proofing data engineering strategies. Ideal for data engineers, architects, and AI enthusiasts, this talk offers insights into leveraging AI agents to reduce operational overhead and stay ahead in the era of intelligent automation.

  • Aniruddha (Ani) Ray is the Senior Vice President and global technology lead for Agentic AI. He is also the global lead for Genpact’s Products and Platforms. He has been with Genpact for 8 months. His primary role is to create business impact and design the agentic architectures for Genpact’s productized solutions as they pivot to the “Services as Agentic Software” paradigm. He is currently leading the rollout of the Genpact AP Suite while creating the Genpact Agentic Factory for the future. Ani has more than 23 years of experience leading business and technology architecture to build and scale products at companies including Accenture, EMC, IBM, and GE. He started as a data engineer, evolved into a technology and digital (data, AI, cloud) architect, and is now a technology strategy and innovation leader harnessing value for customers at the crossroads of business strategy and technology evolution. Ani has an MBA in Strategy from IIM Ahmedabad and an M.Tech from IIIT Bangalore. He holds more than 10 global patents and, over the last 10 years, has earned more than 25 technical certifications in architecture and engineering (Cloud, AI, ML, Data, Analytics) across all major cloud hyperscalers and leading data & AI vendors.

  • Developing reliable AI models requires high-quality data. Because traditional methods fall short, teams need to adopt data reliability engineering: applying principles from manufacturing and treating systems as data factories, which entails significant adjustments to people, processes, and tools. The talk will highlight three pillars, showcasing their importance in preventing errors, ensuring compliance, and optimizing AI performance, with case studies and industry practices demonstrating how to engineer quality data effectively (a reconciliation sketch follows below):
      - Data Testing: validates data through migration testing, pipeline certification, and big data reconciliation.
      - Data Monitoring: tracks data pipeline health using business rules and handles exceptions in real time.
      - Data Observability: uses AI/ML for anomaly detection, compliance measurement, and defect rate prediction.
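
A minimal sketch of the "big data reconciliation" idea under the Data Testing pillar: compare a source and a target extract on row count, key coverage, and a numeric checksum. Column names are illustrative assumptions.

```python
# Sketch of source-vs-target reconciliation for a migrated table.
import pandas as pd


def reconcile(source: pd.DataFrame, target: pd.DataFrame,
              key: str, measure: str) -> dict:
    missing = set(source[key]) - set(target[key])
    return {
        "row_count_match": len(source) == len(target),
        "missing_keys": sorted(missing)[:10],  # sample for the report
        "checksum_delta": float(source[measure].sum() - target[measure].sum()),
    }

# Usage: reconcile(orders_src, orders_tgt, key="order_id", measure="amount_usd")
```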

  • The AI Hackathon on the Government e-Marketplace (GeM) brought together innovators to solve real-world public sector challenges using cutting-edge technologies—Generative AI, Deep Learning, and Machine Learning. This session showcases standout use cases that emerged from the hackathon, demonstrating how AI can enhance transparency, efficiency, and user experience on GeM. At the heart of these innovations lies a solid Data Engineering foundation—powering everything from data ingestion and transformation to model training and deployment. Without it, these AI solutions wouldn’t scale or succeed. This talk will highlight how robust data engineering practices made the leap from idea to impact possible.

  • The concept of data as a product is gaining traction, promising improved accessibility, ownership, and value creation. But is this approach practical for all organizations, or is it just another theoretical framework? This discussion will explore the real-world adoption of data product thinking, its impact on governance and analytics, and whether businesses are truly structured to make the most of it.

  • As organizations scale to serve the "nextillions" – the next billion+ users entering the digital ecosystem – data becomes both a powerful enabler and a complex challenge. This session explores how we can humanize big data by bridging the gap between massive datasets and meaningful customer experiences. From hyper-personalization and intuitive decision-making to building trust through data ethics and compliance, the talk will highlight how businesses can move beyond numbers to truly understand and serve the diverse needs of emerging digital users. Real-world insights and case studies will demonstrate how human-centric data platforms, scalable architectures, and inclusive design principles are shaping the future of customer engagement at scale.

  • Addressing privacy, fairness, and bias while designing law bots for legal assistance.

  • In an era where the pharmaceutical industry is increasingly reliant on complex digital infrastructures and sensitive data, the threat landscape has grown both in scale and sophistication. This session will explore how AI-driven threat detection systems can be engineered to meet the unique challenges of the pharma sector—ranging from protecting intellectual property to ensuring patient data integrity and compliance with stringent regulations. Drawing from real-world implementations, the talk will delve into scalable architectures, real-time anomaly detection, and the fusion of domain knowledge with machine learning to proactively mitigate cyber threats at an enterprise scale.

  • In 1999, Anil Kumble etched his name in cricket history by taking 10 wickets in a single innings. But the story isn't just about stats — it’s about focus, perseverance, and alignment. In this reflective conversation, Kumble shares what it means to stay consistent across decades, how to reinvent yourself through data, and why long-term thinking trumps short-term wins — be it in sports, leadership, or engineering modern data foundations.


Our pricing will change soon!


1000+ Attendees

50+ Speakers

4th Edition

Explore the frontiers of data engineering.

Focused on data engineering innovation, this 2-day conference will give attendees direct access to top leaders and innovators from leading tech companies, who will discuss the deployment architecture of AI systems and how to apply the latest data frameworks and solutions to business use cases.

The Finkelstein Awards for Data Engineering Excellence 2025

Secure Your Seat at the Frontier of Data + Engineering.

Early Bird Passes on sale now.

May 15-16, 2025
Bengaluru