Lakehouse query engine

Built for high-concurrency, complex SQL and AI workloads in the lakehouse

e6data’s decentralized, Kubernetes-native architecture granularly scales in 1-vCPU increments. Built for enterprises facing throttling, rationing, and vendor lock-in. Compatible with all table formats and catalogs.

10x faster
60% lower costs
Zero migration
Architecture diagram showing a standard lakehouse query engine workflow with multiple applications, table formats, governance solutions, data catalogs, deployed across clouds, regions, and on-premises.
W/o e6data

Architecture diagram showing e6data’s lakehouse query engine workflow with multiple applications, table formats, governance solutions, data catalogs, deployed across clouds, regions, and on-premises, with zero migration from existing setup.
With e6data

Benchmarks
Vs. legacy lakehouse engine: 3.09x faster (TPC-DS, Delta, 8 QPS)
Vs. legacy query engine: 11.02x faster (TPC-DS, Fabric, 30 cores)
Query type (comparison): 1.58x faster (TPC-DS, Delta, AWS, XS)
Vs. legacy lakehouse engine: 67.64% lower cost (TPC-DS, Delta, 8 QPS)
Vs. legacy query engine: 7.04x faster (TPC-DS, Iceberg, XS)
Query type (logical): 1.80x faster (TPC-DS, Delta, AWS, XS)
Vs. legacy lakehouse engine: 3.08x lower p99 latency (TPC-DS, Delta, 8 QPS)
e6data + Fabric: 3081.2s execution time (TPCDS_1000, Delta, 30 cores)
e6data + Fabric: 60.05% lower cost (TPC-DS, Fabric, 30 cores)
High concurrency: 1.20x faster (TPC-DS, Delta, AWS, XS)
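
To help read the benchmark figures above, here is a minimal sketch of how a "times faster" multiplier and a "percent lower" figure relate to each other. It assumes the quantity in question (runtime or cost) scales inversely with the multiplier. That is an illustrative assumption, not a statement about how the published numbers were measured. The function names are hypothetical.

```python
def speedup_to_percent_reduction(speedup):
    """Convert an 'N times faster' multiplier into a percent reduction.

    Assumes the measured quantity shrinks to 1/speedup of its baseline.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return (1 - 1 / speedup) * 100


def percent_reduction_to_ratio(percent):
    """Convert an 'X% lower' figure into a baseline-to-new ratio."""
    if not 0 <= percent < 100:
        raise ValueError("percent must be in [0, 100)")
    return 1 / (1 - percent / 100)


# Under the stated assumption, a 3.09x multiplier is about a 67.64% reduction.
reduction = round(speedup_to_percent_reduction(3.09), 2)

# A 60.05% reduction corresponds to a ratio of roughly 2.5x.
ratio = round(percent_reduction_to_ratio(60.05), 2)
```

The two functions are inverses of each other, which makes it easier to compare cards that report a multiplier with cards that report a percentage.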
Developer Experience

Query everything, scale and secure fast on your own stack

Run SQL + AI workloads that auto scale, block bad jobs, run vector search, and stay secure with row/column masking—no tuning, no trust issues.

Runs with your data stack

Lakehouse, table formats, catalogs, BI tools, and RAG apps—no custom glue code needed.
Lakehouse
Queries directly with zero data movement.
Table Format
Highly performant on all table formats.
Catalog
Plugs into any catalog; no rule rewrites.
Application
Connects to any BI tool, RAG app, or chatbot.
Governance
Governance ready: plug into your tools.

SQL meets AI, right in your lakehouse

Query structured and unstructured data with cosine similarity. No vector DBs. Just pure vector search.
SQL query block depicting the ability to do vector search on top of unstructured data from any object-storage (S3, GCS, ADLS) through SQL language.
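
For readers new to the metric, the following is a minimal sketch of what ranking by cosine similarity computes. It is plain Python rather than e6data's SQL syntax, which this page does not show. The embeddings, document IDs, and function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def top_k(query, rows, k=2):
    """Rank (doc_id, embedding) rows by similarity to the query, best first."""
    scored = [(doc_id, cosine_similarity(query, emb)) for doc_id, emb in rows]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Hypothetical toy embeddings; real ones come from an embedding model.
documents = [
    ("doc_a", [1.0, 0.0, 0.0]),
    ("doc_b", [0.9, 0.1, 0.0]),
    ("doc_c", [0.0, 1.0, 0.0]),
]
query_embedding = [1.0, 0.0, 0.0]
results = top_k(query_embedding, documents, k=2)
```

A vector search query is conceptually the same operation: score every candidate row against the query embedding, order by score, and keep the top results.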

Auto-scaling that adapts to query load

Set min and max, we handle the rest. Executors scale with load with no latency spikes, no job failures, no manual tuning.
Product screenshot depicting e6data’s autoscaling feature which scales up and down automatically to accommodate peak and fluctuating workloads.
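
The "set min and max" idea can be sketched as a clamp on the executor count. This is an illustration only: the sizing rule (queued queries divided by a per-executor capacity) and every name below are assumptions, not e6data's actual scaling policy.

```python
import math


def target_executors(queued_queries, queries_per_executor,
                     min_executors, max_executors):
    """Pick an executor count for the current load, clamped to [min, max]."""
    if min_executors > max_executors:
        raise ValueError("min_executors must not exceed max_executors")
    if queries_per_executor <= 0:
        raise ValueError("queries_per_executor must be positive")
    # Hypothetical sizing rule: enough executors to cover the queue.
    needed = math.ceil(queued_queries / queries_per_executor)
    return max(min_executors, min(needed, max_executors))


idle = target_executors(0, 4, min_executors=2, max_executors=10)
busy = target_executors(17, 4, min_executors=2, max_executors=10)
peak = target_executors(100, 4, min_executors=2, max_executors=10)
```

With these inputs the count stays at the floor when idle, grows with load, and stops at the ceiling during a peak, which is the behavior the min and max settings are meant to guarantee.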

Guardrails to stop “bad” queries early

Set thresholds per cluster. Log, alert, or cancel in real time before bad queries waste compute.
Product image depicting e6data’s query guardrails feature, which alerts users about any bad query (example with high number of table scans) to enable them to abort or redirect it and maintain SLAs.
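
As a rough sketch of how per-cluster thresholds with log, alert, or cancel actions could be evaluated, consider the example below. The metric names, threshold values, and the rule that the most severe breached action wins are all assumptions made for illustration. They do not describe e6data's configuration format.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    # Ordered by severity so the strictest breached rule wins.
    ALLOW = 0
    LOG = 1
    ALERT = 2
    CANCEL = 3


@dataclass(frozen=True)
class Guardrail:
    metric: str       # hypothetical metric name, e.g. "table_scans"
    threshold: float  # breached when the observed value exceeds this
    action: Action


def evaluate(query_stats, guardrails):
    """Return the most severe action among the breached guardrails."""
    decision = Action.ALLOW
    for rule in guardrails:
        observed = query_stats.get(rule.metric, 0)
        if observed > rule.threshold and rule.action.value > decision.value:
            decision = rule.action
    return decision


# Hypothetical thresholds for one cluster.
cluster_guardrails = [
    Guardrail("runtime_seconds", 60, Action.LOG),
    Guardrail("table_scans", 5, Action.ALERT),
    Guardrail("scanned_gb", 500, Action.CANCEL),
]

many_scans = evaluate({"table_scans": 8, "scanned_gb": 120}, cluster_guardrails)
huge_scan = evaluate({"table_scans": 8, "scanned_gb": 900}, cluster_guardrails)
```

Evaluating the rules while a query runs, rather than after it finishes, is what lets a breach be acted on before more compute is spent.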

Sub-second streaming of data in your lake

Stream directly to your lakehouse and query with sub-second latency using SQL or Python. No Flink, no ETL, no learning curve.
Streaming data sources such as “Database CDC,” “Event Streams,” and “API and Logs” are ingested by e6data’s real-time streaming ingest with sub-second freshness. The data then lands in the user’s lakehouse.

Enterprise-grade security and governance

Row/column-level control, IAM integration, and audit-ready logs. SOC 2, ISO, HIPAA, and GDPR—secure by design, with no slowdown.
Sample table listing first name, last name, and masked SSNs, overlaid with compliance badges for ISO, GDPR, HIPAA, and SOC 2, displaying e6data’s compliance and data governance.
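
Row- and column-level control can be illustrated with a small sketch that filters rows and masks a sensitive column before results are returned. The mask format (last four digits visible), the sample records, and the function names are hypothetical, and the engine's own policy mechanism may work differently.

```python
def mask_ssn(ssn):
    """Hide all but the last four digits of a nine-digit SSN."""
    digits = [c for c in ssn if c.isdigit()]
    if len(digits) != 9:
        raise ValueError("expected a nine-digit SSN")
    return "***-**-" + "".join(digits[-4:])


def apply_policy(rows, column_masks, row_filter):
    """Drop rows the caller may not see, then mask restricted columns."""
    visible = []
    for row in rows:
        if not row_filter(row):
            continue  # row-level control
        visible.append({
            col: column_masks[col](val) if col in column_masks else val
            for col, val in row.items()  # column-level control
        })
    return visible


# Hypothetical records and policy: an analyst limited to the "EU" region.
records = [
    {"first_name": "Ada", "last_name": "Lee", "ssn": "123-45-6789", "region": "EU"},
    {"first_name": "Sam", "last_name": "Roy", "ssn": "987-65-4321", "region": "US"},
]
result = apply_policy(
    records,
    column_masks={"ssn": mask_ssn},
    row_filter=lambda row: row["region"] == "EU",
)
```

Applying the row filter before the column masks means restricted rows are never transformed or returned at all.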
Head of Platform Engineering
B2B observability SaaS
“We’ve been looking to move our logs to S3 since the costs became super high. With e6data, it became possible faster as our p95 & p99 latencies were maintained. All our logs now ingest & query in S3.”
Use Cases

Run your most resource-intensive SQL and AI workloads

Get predictable SLAs, instant query responses, and radically lower compute costs—all with no query rewrites or app changes.

Packaged Analytics

Deliver embedded, multi-tenant analytics seamlessly within your SaaS applications. Gain 10x faster performance at scale while cutting infrastructure costs by up to 60% and reducing operational complexity.

Interactive Analytics

Enable real-time dashboards and dynamic data exploration at massive scale. Deliver sub-2-second response times at 1000+ QPS with consistent SLAs and UX, and without latency spikes.

Ad-hoc Analytics

Run complex ad-hoc queries 10x faster across diverse data sources (object storage, OLAP, data streams, and more) from a unified engine. Achieve zero SLA failures from poorly optimized queries or resource constraints.

Scheduled Analytics

Run frequent, high-volume scheduled analytics with 99.99% reliability, without downtime, data delays, or compute cost overruns, even with rapid refresh cycles.


Real-Time Ingest

Stream data into your lakehouse with sub-second latency. Skip Flink, ETL, and pipeline overhead. Query fresh events instantly using SQL or Python—no shuffle, no joins, no delay between ingestion and analysis.

Vector Search

Run semantic search on unstructured data using built-in cosine similarity. No vector DBs, no retrieval pipelines. Query text like structured rows with SQL—fast, scalable, and lakehouse-native for instant, AI-powered insights.

FAQs

Which workloads does the e6data Query Engine accelerate?
The engine is purpose-built for high-concurrency SQL and AI workloads in a lakehouse, covering interactive dashboards, ad-hoc queries, packaged and scheduled analytics—without data movement or query rewrites.
Can I run e6data alongside Snowflake or Databricks?
Yes. e6data integrates with your existing data architecture, whether you’re using Amazon SageMaker, Databricks, Snowflake, Trino, Athena, or any other engine, alongside your chosen catalog, governance framework, table format, and BI tools. You can deploy it anywhere: single or multi-cloud, multi-region, on-premises, or in a hybrid environment.
How does e6data scale to match demand?
Its decentralized, Kubernetes-native architecture scales executors in 1-vCPU steps (or per core). Auto-scaling adjusts between user-set min and max limits with no latency spikes, job failures, or manual tuning.
Does the platform guard against inefficient queries?
Yes. Per-cluster thresholds can log, alert, or cancel bad queries in real time, stopping wasteful jobs before they consume excess compute.
What security and compliance features are included?
Enterprise-grade controls include row- and column-level masking, IAM integration, audit-ready logs, and readiness for SOC 2, ISO, HIPAA, and GDPR, all delivered with no slowdown.