Lakehouse Days: Dec 2024

About the event

Join us for an exclusive in-person event on "Apache Iceberg: understanding the internals, performance, and future", hosted by e6data. This meetup is designed specifically for data engineers, data architects, and senior software engineers who are constantly looking to optimize their data architecture to make it more price-performant while delivering the best user experience. In this edition, we will deep-dive into the internal architecture of open table formats like Apache Iceberg, the recent announcement of Amazon S3 Tables for Apache Iceberg, streaming ingestion to Iceberg using a Rust-based solution, and how Apache Iceberg is being used at Netflix at scale. We aim to raise awareness of these open table formats and help attendees gain a deeper understanding of them.

Lakehouse Days is designed to enable fellow data nerds to meet, network, and have insightful discussions on the entropic world of data.

Meet the speakers

Sachin Tripathi, Senior Data Engineer at Bureau

Topic: Apache Iceberg 101: Understanding the Need for Lakehouses Over Data Lakes or Warehouses

This discussion covers key Apache Iceberg features such as time travel, schema evolution, hidden partitioning, and catalogs. It also offers insights into optimizing analytics, managing metadata, and ensuring interoperability across multi-engine ecosystems, highlighting the advantages of lakehouses over data lakes and warehouses.

Time: 9:00 - 9:45 AM IST

Soumil Shah, Sr. Software Engineer at Zeta Global

Topic: A take on AWS's recent announcement of Amazon S3 Tables

In this session, Soumil will dissect and discuss AWS’s recent announcement of Amazon S3 Tables – a fully managed Apache Iceberg tables offering by AWS, optimized for analytics workloads.

Time: 10:00 - 10:45 AM IST

Vipul Bharat Marlecha, Senior Software Engineer at Netflix

Ankur Ranjan, Senior Software Engineer at e6data

Topic: Open discussion on streaming ingestion & Apache Iceberg

In this session, Vipul and Ankur will engage in an open discussion to showcase how Apache Iceberg technology facilitates streaming ingestion, along with its advantages and disadvantages. They will also explore how Netflix leverages Apache Iceberg at scale, covering aspects like table maintenance, cataloging, streaming sources, and much more.

Time: 11:00 - 11:45 AM IST

Fenil Jain, Software Development Engineer at e6data

Shreyas Mishra, Software Development Engineer at e6data

Topic: Streaming ingestion to Apache Iceberg using a Rust-based solution

Apache Iceberg is an open-source, high-performance format for huge analytic tables. It brings the reliability and simplicity of SQL tables to big data while making it possible for engines like Spark, Trino, Flink, Presto, and e6data to safely work with the same tables at the same time. In this talk, we will re-imagine streaming ingestion to Apache Iceberg using a Rust-based solution instead of Apache Flink, Spark Structured Streaming, or Kafka Streams. Rust's memory safety and concurrency features make it well suited to building efficient ingestion pipelines that transform data and write it directly into Iceberg's table format. The aim is seamless integration, low-latency ingestion, and effective handling of schema evolution, enabling real-time analytics on fresh data.

Time: 12:00 - 12:45 PM IST



21st Dec 2024 from 8:45 AM to 1:30 PM IST, Bengaluru

