Starburst
www.starburst.io
Last updated: April 2026
Starburst is an enterprise data analytics platform built on Trino for fast distributed SQL queries across any data source without moving data.
About
Starburst is an enterprise analytics platform built on Trino (formerly known as PrestoSQL), a fast, distributed SQL query engine descended from the Presto project originally developed at Facebook. Starburst provides a commercially supported, enterprise-hardened version of Trino with additional governance, security, performance, and management features, so organizations can query data across heterogeneous data sources without moving or copying it into a central repository.
The data mesh and data lakehouse architectural patterns that Starburst supports are based on a fundamental principle: data should be queried where it lives rather than copied to a central data warehouse. Starburst implements this through its catalog system, which provides connectors to dozens of different data sources including Amazon S3, Google Cloud Storage, Azure Data Lake, Apache Hive, Apache Iceberg, Delta Lake, Apache Hudi, as well as traditional databases like PostgreSQL, MySQL, Oracle, SQL Server, MongoDB, Elasticsearch, Kafka, and many others. Each connected data source appears as a separate catalog in the SQL interface.
Cross-source queries are one of Starburst's most powerful capabilities. Analysts can write a single SQL query that joins tables from different catalogs, for example a PostgreSQL operational table, a Parquet file in S3, and a Redshift table. The Trino query engine distributes the query execution, pushes predicates down to each source system where possible, and assembles the results, all transparently from the user's perspective.
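As a minimal sketch of what such a cross-catalog query might look like, the Python snippet below assembles a federated join using Trino's three-part `catalog.schema.table` naming. The catalog, schema, table, and column names (`postgres_ops`, `lake`, `redshift_dw`, and so on) are hypothetical; real names depend on how catalogs are configured in a given deployment.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class TableRef:
    """A fully qualified table reference. All names used below are hypothetical."""
    catalog: str
    schema: str
    table: str

    def fqn(self) -> str:
        # Trino addresses tables as catalog.schema.table
        return f"{self.catalog}.{self.schema}.{self.table}"


def build_federated_query(customers: TableRef, events: TableRef,
                          totals: TableRef, since: date) -> str:
    """Compose one SQL statement that joins tables from three catalogs."""
    return (
        "SELECT c.customer_id, c.email, t.lifetime_value, e.event_type\n"
        f"FROM {customers.fqn()} AS c\n"
        f"JOIN {events.fqn()} AS e ON e.customer_id = c.customer_id\n"
        f"JOIN {totals.fqn()} AS t ON t.customer_id = c.customer_id\n"
        # A simple filter like this is the kind of predicate an engine
        # may push down to the source system where the connector allows it.
        f"WHERE e.event_date >= DATE '{since.isoformat()}'"
    )


query = build_federated_query(
    TableRef("postgres_ops", "public", "customers"),    # assumed PostgreSQL catalog
    TableRef("lake", "web", "events"),                  # assumed S3/Parquet catalog
    TableRef("redshift_dw", "sales", "customer_totals"),  # assumed Redshift catalog
    since=date(2026, 1, 1),
)
print(query)
```

The resulting string is ordinary SQL, so it could be submitted through any client that speaks to the cluster. The only federation-specific part is that each table name starts with a different catalog.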
The Starburst Data Products feature introduces a governed data access layer built on top of the raw Trino query engine. Data products are defined views with descriptions, owners, access policies, and quality metrics that present curated datasets to data consumers. This catalog-based discovery and governance capability helps organizations build well-organized, trustworthy data ecosystems at scale.
Column masking and row-level security policies in Starburst provide dynamic data access control. Column masking applies obfuscation rules to sensitive columns such as credit card numbers and email addresses based on the requesting user's role. Row-level security restricts which rows each user can see based on policy expressions. These controls are enforced at query time, ensuring that access policies are applied consistently regardless of which BI tool or query client is used.
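To make the idea concrete, here is a small in-memory illustration of masking and row filtering applied at read time. It is a conceptual sketch only, not Starburst's policy engine or its configuration syntax; the role names, policy tables, and helper functions are all assumptions made for the example.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def mask_card(value: str) -> str:
    """Keep only the last four characters of a card number."""
    if len(value) <= 4:
        return "*" * len(value)
    return "*" * (len(value) - 4) + value[-4:]


def mask_email(value: str) -> str:
    """Hide the local part of an email address and keep the domain."""
    _, _, domain = value.partition("@")
    return "***@" + domain


# Hypothetical policies: which columns are masked, which roles see raw
# values, and which rows each role is allowed to read.
COLUMN_MASKS: Dict[str, Callable[[str], str]] = {
    "card_number": mask_card,
    "email": mask_email,
}
UNMASKED_ROLES = {"compliance"}
ROW_FILTERS: Dict[str, Callable[[Row], bool]] = {
    "analyst_emea": lambda row: row["region"] == "EMEA",
    "compliance": lambda row: True,
}


def apply_policies(rows: List[Row], role: str) -> List[Row]:
    """Return the rows visible to `role`, with sensitive columns masked."""
    row_filter = ROW_FILTERS.get(role, lambda row: False)  # unknown roles see nothing
    visible: List[Row] = []
    for row in rows:
        if not row_filter(row):
            continue
        if role in UNMASKED_ROLES:
            visible.append(dict(row))
        else:
            visible.append({
                col: COLUMN_MASKS[col](str(val)) if col in COLUMN_MASKS else val
                for col, val in row.items()
            })
    return visible


ROWS: List[Row] = [
    {"customer_id": 1, "region": "EMEA", "email": "ana@example.com",
     "card_number": "4111111111111111"},
    {"customer_id": 2, "region": "APAC", "email": "raj@example.com",
     "card_number": "5500000000000004"},
]

for role in ("analyst_emea", "compliance", "intern"):
    print(role, apply_policies(ROWS, role))
```

Because the policies are evaluated inside a single function that every read passes through, the same result is returned no matter which caller asks. That mirrors, in miniature, why enforcement at query time stays consistent across BI tools and query clients.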
Starburst Galaxy is the fully managed cloud service that provides Starburst on AWS, Azure, and Google Cloud with automatic scaling, serverless query execution, and simplified cluster management. Starburst Enterprise is the self-managed version for deployment on Kubernetes in customer-controlled infrastructure.
Starburst integrates natively with popular BI tools including Tableau, Looker, Power BI, Superset, and others through JDBC and ODBC drivers, enabling existing BI workflows to leverage Starburst's cross-source query capabilities without changing the BI tool.
Positioning
Starburst is the commercial company behind Trino (formerly PrestoSQL), providing an enterprise-grade analytics engine that queries data wherever it lives without requiring movement or duplication. The platform connects to dozens of data sources — from data lakes and warehouses to relational databases and object storage — and lets analysts run standard SQL across all of them as if they were a single database.
Built for organizations that reject the idea of consolidating everything into one warehouse, Starburst embraces a data mesh and data lakehouse philosophy where data stays in place and compute comes to the data. This approach reduces infrastructure costs, eliminates stale copies, and respects data sovereignty requirements that prevent centralization.
What You Get
- Starburst Galaxy
Fully managed SaaS analytics platform with auto-scaling compute clusters, built-in data cataloging, and usage-based pricing
- Starburst Enterprise
Self-managed Trino distribution with enterprise security, fault tolerance, and performance optimizations for on-premises deployment
- Data Products
Curated, governed datasets published as reusable products with access controls, quality metrics, and discoverability features
- Query Federation
Run single SQL queries that join data across PostgreSQL, S3, Snowflake, MongoDB, Elasticsearch, and 40+ other connectors
- Access Control
Fine-grained role-based and attribute-based access control with column masking, row filtering, and audit logging
Core Areas
Data Lakehouse Analytics
High-performance SQL queries on open table formats like Apache Iceberg, Delta Lake, and Hudi stored in cloud object storage
Federated Queries
Cross-system analytics that join data from multiple sources in a single query without ETL pipelines or data movement
Data Mesh Implementation
Tools for publishing, discovering, and governing domain-owned data products across decentralized organizational structures
Cost Optimization
Reducing warehouse spend by running analytical queries directly on cheaper storage tiers without loading data into expensive compute platforms
Why It Matters
The data warehouse approach — moving all data into one system before analyzing it — creates expensive, fragile pipelines and stale copies that erode trust in analytics. Starburst inverts this model by bringing SQL compute to distributed data, which means analysts get fresh results from authoritative sources while organizations avoid the cost of maintaining redundant copies.
As data gravity increases and regulatory requirements constrain data movement, the ability to query in place becomes essential rather than optional. Starburst makes federated analytics performant enough for production workloads, not just occasional exploration.