Data Engineering Summit 2023

India's first & only conference dedicated to the emerging field of Data Engineering


27 – 28th April 2023
Hotel Radisson Blu, Bengaluru


We are finalizing the schedule. Please check this space again for updates. More than 50 speakers are expected at DES 2023. To explore speaking opportunities with DES, write to

  • Day 1 | HALL 2 - Practical Insights and Best Practices

  • Day 1 | Main Hall - Thought Leadership and Strategic Insights

  • Day 2 | HALL 2 - Practical Insights and Best Practices

  • The talk will cover the key components of digital transformation, business agility and innovation, business innovation and the three facets of digital transformation, the five principles of business agility, and the role of data engineering in digital transformation.

  • The ‘T’ in ETL or ELT Pipeline: a tool that equips the business to build its own data marts using SQL/dbt and to create dashboards that drive insights much faster, with almost no dependency on data engineers. The data transformation pipeline is built with dbt (data build tool) running on serverless containers (AWS Fargate), with Airflow as the workflow orchestrator.

  • Probabilistic data structures are a type of statistical algorithm designed to optimize the use of memory in storing and querying large datasets. These structures employ probabilistic algorithms to estimate the presence of elements in a dataset with a high degree of accuracy while minimizing the amount of memory required for storage. In this session, we will explore some fundamental analytical questions that, when answered accurately for very large datasets, require substantial resources and cost. This is particularly relevant in streaming data use cases such as real-time monitoring, fraud detection, social media analytics, and online advertising, where the timely availability of analytics takes precedence over their accuracy.

  • By implementing a semantic data layer, new businesses can achieve several benefits: improved data quality; faster time-to-insight, since a semantic layer can reduce the time it takes to extract insights from data by making the data easier to access and use; greater agility; and improved collaboration. Overall, a semantic data layer can be a valuable investment for new businesses that want to make the most of their data and gain a competitive advantage in their industry.

  • Day 2 | Main Hall - Thought Leadership and Strategic Insights

  • This talk will cover the challenges of dealing with mistakes, inconsistencies, and subjectivity in human-labeled datasets. We will discuss how to build, use, and secure representative datasets for AI problems, paying special attention to crowdsourced data and data obtained from in-house annotation teams. We will start with typical issues of crowdsourced and human-labeled datasets, such as annotator biases and differences in annotators' backgrounds. Then, we will focus on the annotator disagreement problem and the answer subjectivity problem. We will present business case studies of how these problems are addressed in practice, leading to the creation of useful training datasets. We will also discuss Web-scale dataset poisoning problems and ways to ensure the sustainability of a dataset once it has been created. Finally, we will tackle the problem of learning from such data, showing convenient open-source tools for improving machine learning model quality.
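
As a rough illustration of the probabilistic data structures session listed above, here is a minimal Python sketch of a Bloom filter, one common structure of this kind. The session abstract does not name specific structures, so the choice of a Bloom filter, the class name BloomFilter, and its parameters are assumptions made for this example only. It answers "have we seen this element?" in a small fixed amount of memory, at the cost of occasional false positives.

```python
import hashlib
import math


class BloomFilter:
    """Approximate set membership: no false negatives, tunable false positives."""

    def __init__(self, expected_items: int, false_positive_rate: float = 0.01):
        # Standard Bloom filter sizing: m = -n * ln(p) / (ln 2)^2 bits.
        self.num_bits = max(1, math.ceil(
            -expected_items * math.log(false_positive_rate) / (math.log(2) ** 2)
        ))
        # Number of hash functions: k = (m / n) * ln 2.
        self.num_hashes = max(1, round(self.num_bits / expected_items * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive k bit positions from two 64-bit slices of one SHA-256 digest
        # (double hashing), instead of running k separate hash functions.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # True means "probably present"; False means "definitely absent".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


if __name__ == "__main__":
    seen = BloomFilter(expected_items=1000, false_positive_rate=0.01)
    for i in range(1000):
        seen.add(f"user-{i}")
    print("user-42" in seen)          # an added item is always reported present
    print(len(seen.bits), "bytes used for 1000 items")
```

With these example settings the filter needs roughly 1.2 KB for 1000 items, which reflects the trade-off the abstract describes: a small, bounded error rate in exchange for much lower memory use.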

Grab your ticket for a unique experience of inspiration, meeting and networking for the data engineering industry.

Book your tickets at the earliest. We have a hard stop at 500 passes.
Note: Ticket pricing may change at any time.

  • Early Bird Pass
    Available till 17th Feb 2023
      • All access, 2-day pass
      • Group discount available

  • Late Pass
    Available from 8th Apr 2023 onwards
      • All access, 2-day pass
      • No group discount available
      • 4999

Regular Passes to Expire in 2 Weeks

Ticket Prices to Increase on 7th Apr