Lakehouse Days: March 2025


About the Event

Join us for an exclusive in-person event on “Apache Iceberg: Basics, optimizations, features, streaming data, query execution” hosted by e6data in Hyderabad!

Lakehouse Days - Powered by AWS is designed specifically for data engineers, data architects, and senior software engineers who constantly seek to optimize their data architecture to make it more price-performant while delivering the best user experience. In this edition, we will dive deep into the internal architecture of open table formats like Apache Iceberg, how Apache Kafka works, building a modern data platform that simultaneously queries streaming and analytical data on Iceberg, how Amazon S3 Tables delivers a fully managed Apache Iceberg experience to simplify large-scale analytics on Amazon S3, and how Arrow IPC enhances Apache Iceberg-based data lakes by accelerating streaming ingestion and query execution. We aim to raise awareness of these open table formats and help attendees understand them in greater depth.

Lakehouse Days - Powered by AWS is designed to enable fellow data geeks to meet, network, and have insightful discussions on the entropic world of data.


Meet the Speakers

Diptiman Raichaudhuri, Staff Developer Advocate at Confluent

Topic: Streaming Data into a Lakehouse - Kafka Greets Iceberg

Summary: Join this session to learn how operational and analytical data estates are merging! Apache Kafka, the de facto standard for real-time streaming data, can now materialize events in a lakehouse (Iceberg/Delta Lake), and analytical queries can run on materialized Kafka topics. This session will start from the ground up with what Iceberg is, how Kafka works, and the community efforts to bring two of the most important frameworks, Apache Kafka and Apache Iceberg, closer together. The audience will learn how to build a modern data platform in which streaming and analytical data are queried simultaneously on Iceberg.

Time: 10:00 - 10:45 AM IST


David John Chakram, Principal Architect at AWS

Topic: Amazon S3 Tables: Scaling Apache Iceberg for High-Performance Analytics

Summary: Traditional data lakes provide immense scalability but often face performance, consistency, and interoperability challenges. In this session, David guides you through how Open Table Formats (OTFs) like Apache Iceberg revolutionize how organizations store and process tabular data at scale. He’ll dive into Iceberg’s key features, advantages over traditional approaches, and how Amazon S3 Tables, AWS’s latest innovation, delivers a fully managed Apache Iceberg experience to simplify large-scale analytics on Amazon S3. The audience will learn how S3 Tables enhance query performance, reduce operational overhead, and empower businesses with seamless and high-performance analytics at scale. 

Time: 11:00 - 11:45 AM IST

Karthic Rao, Principal Engineer at e6data

Topic: Fast Distributed Iceberg Writes and Queries with Apache Arrow IPC

Summary: In modern distributed analytical systems, efficient data movement and processing are critical for performance. Apache Arrow’s Inter-Process Communication (IPC) framework provides a high-performance, language-agnostic columnar format that eliminates serialization overhead and optimizes in-memory analytics. This talk explores how Arrow IPC enhances Apache Iceberg-based data lakes by accelerating streaming ingestion and query execution. Karthic will highlight Arrow IPC’s zero-copy data sharing and high-speed transport via Arrow Flight, which streamlines data movement, and its vectorized computation capabilities, which align seamlessly with Iceberg’s columnar storage. Key applications include batching streaming data to mitigate the small files problem during ingestion and optimizing data shuffling and result delivery during queries. Through practical examples, he will demonstrate how Arrow IPC unifies fast writes and queries, delivering efficiency and scalability to Iceberg data platforms.

Time: 12:00 - 12:45 PM IST

Register Now!

This is an exclusive and invite-only event. Please RSVP to reserve your spot through this link: https://lu.ma/ahuq2jqz?utm_source=website

Venue - Amazon Development Centre (HYD11), Nanakramguda

Date and time - Mar 8, 2025, from 10:00 AM to 2:00 PM IST

Read more about Apache Iceberg

