How a US-based Global Bank Used a Hybrid Lakehouse to Save Egress Costs by 95%

“We have been impressed with e6data’s locality-aware architecture and its ability to handle our multi-region workloads with low cost, while maintaining the SLAs.”

– VP of Data Engineering, Global Bank

1.0s p95 query latency

95% lower egress costs

Zero policy compliance issues

A leading global bank, headquartered in the US, serving 90 million+ customers across 150+ countries.

Highlighted e6data features

  • Hybrid lakehouse
  • Location-aware execution
  • Zero-ACL drift

Use Case: Hybrid lakehouse

Environment: AWS, Delta, Unity Catalog

Industry: BFSI

Background

Managing petabytes of transactional and customer data spread across multiple cloud providers and on-premises data centers is hard, especially for global companies that operate across dozens of countries and regulatory jurisdictions. A leading global bank, headquartered in the US and serving 90 million+ customers across 150+ countries, faced exactly this challenge. Its data teams needed to analyze global datasets (e.g., for risk and regulatory reports) without violating data residency laws or sacrificing speed.

Traditionally, each region ran its own analytics stack, making it hard to get a unified view. To better serve its global operations, the bank piloted e6data’s hybrid lakehouse platform, which delivers analytics across multiple clouds and regions without driving up egress costs, network costs, or operational complexity.


Challenge – Multi-Region Data Governance & Latency Issues

The bank faced an acute challenge: data was siloed in multiple regions (AWS in the US, Azure in Europe, and on-premises storage in Asia) due to data residency and privacy regulations. 

  • Latency & SLA failures – Cross-region queries were painfully slow (often timing out) and incurred high cloud egress fees. p95 latency for multi-region joins was ~5 s, failing to meet the <1 s SLA demanded by downstream trading desks and dashboards.

  • Exploding egress bills – Every month, the team moved over 50 TB of data across regions at a cost of $0.02/GB.

  • Operational overhead – The team was under pressure, maintaining 6+ Spark ETL pipelines and 4+ IAM/KMS stacks and spending 480 engineering hours/month just to keep all the data in sync.

  • Governance drift – Quarterly audits flagged ~30% of tables as out-of-policy because copies had drifted out of sync with Unity Catalog.

“Every morning, we shipped TBs of data across regions just so a set of queries could run. We were paying for the copies and the time.”
– Head of Risk Analytics, U.S. Global Bank

The bank needed a solution that could query all their data with low latency, minimize data movement, and enforce one set of governance policies everywhere.
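
To put the egress bullet in concrete terms, the short Python sketch below multiplies the quoted volume by the quoted rate. It assumes decimal terabytes (1 TB = 1,000 GB) and a flat per-GB price; the function name and constant are illustrative and not part of any e6data or cloud provider API. The result covers transfer fees only, not the storage of the copies or the engineering time mentioned above.

```python
GB_PER_TB = 1000  # assumption: decimal terabytes, as cloud billing commonly uses


def monthly_egress_cost(tb_per_month: float, usd_per_gb: float) -> float:
    """Transfer cost for one month at a flat per-GB rate (hypothetical helper)."""
    return tb_per_month * GB_PER_TB * usd_per_gb


# Figures quoted in the case study: 50 TB per month at $0.02/GB.
monthly_usd = monthly_egress_cost(50, 0.02)
annual_usd = monthly_usd * 12

print(f"monthly: ${monthly_usd:,.0f}, annual: ${annual_usd:,.0f}")
```

Because the text says "over 50 TB", this figure is best read as a lower bound on the transfer portion of the bill.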

Solution – e6data’s Hybrid Lakehouse Architecture

To address these challenges, the bank deployed e6data’s hybrid lakehouse platform on top of their existing data infrastructure to expand their capabilities across regions and on-prem. This solution introduced three core capabilities:

  • Locality-Aware Execution: With e6data, a query that joins U.S. and European datasets executes partial plans within the U.S. cluster and within the EU cluster, then merges results, ensuring that only minimal aggregated data crosses regions. This locality-aware design guaranteed strict control over network egress, performance, and scalability.

  • “Zero-Copy” Query Pushdown: e6data connected directly to the bank’s data lakes (Amazon S3 buckets, Azure Blob storage, and on-prem S3-compatible stores) and queried the data in place, without any ETL. This not only reduced data processing overhead and cost, but also meant the bank’s data engineers did not have to rewrite any SQL or re-architect pipelines.

  • Unity Catalog–Based Governance: A critical factor for the bank was governance through Unity Catalog. e6data extended it across hybrid environments for consistent policy enforcement and query execution, regardless of data location. This unified governance model eliminated “drift” in policies between platforms. All actions were logged centrally, satisfying the bank’s audit requirements.

Together, these features allowed the bank to query data as if it were in one lakehouse, without the usual costs of moving or duplicating data.
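
The locality-aware idea described above can be illustrated with a minimal Python sketch of partial aggregation: each region reduces its own rows to a small summary, and only those summaries cross the region boundary to be merged. This is an illustration of the general technique under simplified assumptions, not e6data's actual planner; the sample rows, function names, and the `rows_shipped` counter are all hypothetical.

```python
from collections import defaultdict

# Hypothetical per-region rows: (customer_segment, exposure_usd).
US_ROWS = [("retail", 120.0), ("corporate", 900.0), ("retail", 80.0)]
EU_ROWS = [("retail", 50.0), ("corporate", 400.0), ("corporate", 100.0)]


def partial_aggregate(rows):
    """Runs inside one region: collapse raw rows to (sum, count) per key."""
    acc = defaultdict(lambda: [0.0, 0])
    for key, value in rows:
        acc[key][0] += value
        acc[key][1] += 1
    return {key: (total, count) for key, (total, count) in acc.items()}


def merge_partials(partials):
    """Runs at the coordinator: combine the small per-region summaries."""
    merged = defaultdict(lambda: [0.0, 0])
    for partial in partials:
        for key, (total, count) in partial.items():
            merged[key][0] += total
            merged[key][1] += count
    return {
        key: {"total": total, "count": count, "avg": total / count}
        for key, (total, count) in merged.items()
    }


def run_query(region_rows):
    """Aggregate locally per region, then merge; also report rows shipped."""
    partials = [partial_aggregate(rows) for rows in region_rows.values()]
    rows_shipped = sum(len(partial) for partial in partials)
    return merge_partials(partials), rows_shipped


result, rows_shipped = run_query({"us": US_ROWS, "eu": EU_ROWS})
raw_rows = len(US_ROWS) + len(EU_ROWS)
print(result)
print(f"{rows_shipped} summary rows crossed regions instead of {raw_rows} raw rows")
```

In this toy case the saving is small, but the number of summary rows depends on the number of distinct keys rather than on the size of the underlying tables, which is why this pattern keeps cross-region transfer low.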

Results – 6-week pilot and beyond

After a 6-week pilot, the bank saw transformative improvements in both technical performance and operational efficiency. The table below summarizes the key before-and-after metrics:


Metric | Before e6data | After e6data | Improvement
Cross-region data egress (per month) | ~50 TB transferred between regions | ~8 TB transferred between regions | –84% (reduction)
Network egress cost per query | ~$0.10 in cross-cloud fees | <$0.005 (almost zero) | –95–99% (lower cost)
95th percentile query latency (cross-region) | ~5.0 seconds | ~1.0 second | –80% (faster)
Data engineering hours on pipeline & governance (monthly) | ~120 hours (manual work) | ~20 hours (mostly automated) | –83% (fewer hours)
Policy compliance exceptions (per audit) | ~6 issues flagged each audit | 0 issues (no violations) | Eliminated
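
The improvement percentages follow directly from the before and after figures. The Python sketch below recomputes them as a sanity check; the dictionary keys and the `pct_change` helper are illustrative names, and the per-query cost uses $0.005 as an upper bound because the source reports it only as "<$0.005".

```python
def pct_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent (negative = reduction)."""
    return (after - before) / before * 100.0


# (before, after) pairs taken from the results reported above.
METRICS = {
    "egress_tb_per_month": (50.0, 8.0),
    "egress_cost_per_query_usd": (0.10, 0.005),  # "after" is an upper bound
    "p95_latency_s": (5.0, 1.0),
    "engineering_hours_per_month": (120.0, 20.0),
}

changes = {
    name: round(pct_change(before, after))
    for name, (before, after) in METRICS.items()
}

for name, change in changes.items():
    print(f"{name}: {change}%")
```

Since the after-cost is an upper bound, the per-query saving is at least 95%, which matches the range the source reports.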

“So long as e6data kept our data where it belonged and still met our speed SLAs, it became a successful pilot. We now have one unified lakehouse serving all regions – without maintaining countless data copies or custom pipelines.”
– Engineering Manager, Global Bank