Lakehouse Days: March 2025

Mar 8, 2025, from 10:00 AM to 2:00 PM IST

Hyderabad


About the Event

Join us for an exclusive in-person event on “Apache Iceberg: Basics, optimizations, features, streaming data, query execution” hosted by e6data in Hyderabad!

Lakehouse Days - Powered by AWS is designed for data engineers, data architects, and senior software engineers who want to make their data architecture more price-performant while delivering the best user experience. In this edition, we will dive deep into the internal architecture of open table formats like Apache Iceberg; how Apache Kafka works; how to build a modern data platform that simultaneously queries streaming and analytical data on Iceberg; how Amazon S3 Tables delivers a fully managed Apache Iceberg experience to simplify large-scale analytics on Amazon S3; and how Arrow IPC enhances Apache Iceberg-based data lakes by accelerating streaming ingestion and query execution. Our aim is to raise awareness of these open table formats and help attendees gain a deeper understanding of them.

Lakehouse Days - Powered by AWS is designed to enable fellow data geeks to meet, network, and have insightful discussions on the entropic world of data.

Meet the Speakers

Diptiman Raichaudhuri, Staff Developer Advocate at Confluent

Topic: Streaming Data into a Lakehouse - Kafka Greets Iceberg

Summary: Join this session to learn how operational and analytical data estates are merging! Apache Kafka, the de facto standard for real-time streaming data, can now materialize events in a Lakehouse (Iceberg/Delta Lake), and analytical queries can run on materialized Kafka topics. This session will start from the ground up: what Iceberg is, how Kafka works, and the community efforts bringing two of the most important frameworks, Apache Kafka and Apache Iceberg, closer together. The audience will learn how to build a modern data platform in which streaming and analytical data are queried simultaneously on Iceberg.

Time: 10:00 - 10:45 AM IST

David John Chakram, Principal Architect at AWS

Topic: Amazon S3 Tables: Scaling Apache Iceberg for High-Performance Analytics

Summary: Traditional data lakes provide immense scalability but often face performance, consistency, and interoperability challenges. In this session, David explains how Open Table Formats (OTFs) like Apache Iceberg change the way organizations store and process tabular data at scale. He’ll dive into Iceberg’s key features, its advantages over traditional approaches, and how Amazon S3 Tables, AWS’s latest innovation, delivers a fully managed Apache Iceberg experience to simplify large-scale analytics on Amazon S3. The audience will learn how S3 Tables enhances query performance, reduces operational overhead, and empowers businesses with seamless, high-performance analytics at scale.

Time: 11:00 - 11:45 AM IST

Karthic Rao, Principal Engineer at e6data

Topic: Fast Distributed Iceberg Writes and Queries with Apache Arrow IPC

Summary: In modern distributed analytical systems, efficient data movement and processing are critical for performance. Apache Arrow’s Inter-Process Communication (IPC) framework provides a high-performance, language-agnostic columnar format that eliminates serialization overhead and optimizes in-memory analytics. This talk explores how Arrow IPC enhances Apache Iceberg-based data lakes by accelerating streaming ingestion and query execution. Karthic will highlight Arrow IPC’s zero-copy data sharing and high-speed transport via Arrow Flight, which streamlines data movement, and its vectorized computation capabilities, which align seamlessly with Iceberg’s columnar storage. Key applications include batching streaming data to mitigate the small files problem during ingestion and optimizing data shuffling and result delivery during queries. Through practical examples, he will demonstrate how Arrow IPC unifies fast writes and queries, delivering efficiency and scalability to Iceberg data platforms.

Time: 12:00 - 12:45 PM IST
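
For readers who want a feel for the "batching streaming data to mitigate the small files problem" idea ahead of the talk, here is a minimal sketch in plain Python. It is not the speaker's implementation and makes no Arrow or Iceberg calls: the `MicroBatcher` class, the `ingest` helper, and the `max_records` threshold are hypothetical names chosen for illustration. Where a real pipeline would write an Arrow record batch into a data file, this sketch simply collects each flushed batch in a list.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]


class MicroBatcher:
    """Buffers incoming records and hands them to a sink in larger groups.

    Hypothetical helper for illustration only; not part of Arrow or Iceberg.
    """

    def __init__(self, max_records: int, sink: Callable[[List[Record]], None]) -> None:
        if max_records < 1:
            raise ValueError("max_records must be at least 1")
        self.max_records = max_records
        self.sink = sink  # called once per flushed batch (stands in for one file write)
        self.buffer: List[Record] = []

    def add(self, record: Record) -> None:
        # Accumulate records; flush only when the threshold is reached.
        self.buffer.append(record)
        if len(self.buffer) >= self.max_records:
            self.flush()

    def flush(self) -> None:
        # Emit whatever is buffered as a single batch, then start a fresh buffer.
        if self.buffer:
            self.sink(self.buffer)
            self.buffer = []


def ingest(events: Iterable[Record], max_records: int) -> List[List[Record]]:
    """Runs events through the batcher and returns the batches ("files") produced."""
    files: List[List[Record]] = []
    batcher = MicroBatcher(max_records, files.append)
    for event in events:
        batcher.add(event)
    batcher.flush()  # do not lose the final partial batch
    return files


if __name__ == "__main__":
    events = [{"id": i} for i in range(10)]
    print(len(ingest(events, 1)))  # one "file" per event
    print(len(ingest(events, 4)))  # fewer, larger "files"
```

With a threshold of 1, every event becomes its own file, which is the small files problem in miniature; raising the threshold trades a little latency for fewer, larger writes.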

Register Now!

This is an exclusive and invite-only event. Please RSVP to reserve your spot through this link: https://lu.ma/ahuq2jqz?utm_source=website

Venue - Amazon Development Centre (HYD11), Nanakramguda

Date and time - Mar 8, 2025, from 10:00 AM to 2:00 PM IST
