Starburst

API

www.starburst.io

Last updated: April 2026

Starburst is an enterprise data analytics platform built on Trino for fast distributed SQL queries across any data source without moving data.

About

Starburst is an enterprise analytics platform built on Trino (formerly known as PrestoSQL), the fast, distributed SQL query engine that grew out of Presto, which was originally developed at Facebook. Starburst offers a commercially supported, enterprise-hardened distribution of Trino with additional governance, security, performance, and management features, so organizations can query heterogeneous data sources without moving or copying data into a central repository.

The data mesh and data lakehouse architectural patterns that Starburst supports rest on one principle: data should be queried where it lives rather than copied to a central data warehouse. Starburst implements this through its catalog system, which provides connectors to dozens of data sources. These include object storage such as Amazon S3, Google Cloud Storage, and Azure Data Lake; Apache Hive and open table formats such as Apache Iceberg, Delta Lake, and Apache Hudi; relational databases such as PostgreSQL, MySQL, Oracle, and SQL Server; and other systems such as MongoDB, Elasticsearch, and Kafka. Each connected data source appears as a separate catalog in the SQL interface.
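Because each source is its own catalog, a table is addressed in SQL by a three-part name of the form catalog.schema.table. The following is a minimal Python sketch of that naming scheme. The catalog names (postgres_ops, lake) and the TableRef and quote_ident helpers are hypothetical, chosen for illustration only; they are not part of any Starburst or Trino API.

```python
from dataclasses import dataclass


def quote_ident(name: str) -> str:
    """Wrap an identifier in double quotes, doubling any embedded quote."""
    return '"' + name.replace('"', '""') + '"'


@dataclass(frozen=True)
class TableRef:
    catalog: str  # one configured data source, e.g. a PostgreSQL instance
    schema: str   # namespace inside that source
    table: str

    def qualified(self) -> str:
        """Return the three-part name used in SQL: catalog.schema.table."""
        parts = (self.catalog, self.schema, self.table)
        return ".".join(quote_ident(part) for part in parts)


# Hypothetical catalog names; real ones are chosen when a source is configured.
orders = TableRef("postgres_ops", "public", "orders")
events = TableRef("lake", "web", "click-events")

print(orders.qualified())  # "postgres_ops"."public"."orders"
print(events.qualified())  # "lake"."web"."click-events"
```

Quoting every part keeps names with hyphens or mixed case valid, at the cost of slightly noisier SQL.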

Cross-source queries are one of Starburst's most powerful capabilities. Analysts can write a single SQL query that joins tables from different catalogs, for example a PostgreSQL operational table, Parquet files in S3, and a Redshift table. The Trino query engine distributes the query execution, pushes predicates down to each source system where possible, and assembles the results, all transparently from the user's perspective.
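To make the idea concrete, here is a toy Python sketch of a two-source federated join with predicate pushdown. It is not how Trino is implemented: the scan and hash_join functions, the catalog names in the SQL string, and the sample rows are all assumptions made for illustration. The point is only the shape of the work: each source filters its own rows first, and the engine joins what comes back.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]

# The kind of query an analyst might write (catalog names are hypothetical).
FEDERATED_SQL = """
SELECT c.name, o.amount
FROM postgres_ops.public.customers AS c
JOIN lake.sales.orders AS o ON o.customer_id = c.customer_id
WHERE c.region = 'EU' AND o.amount > 100
"""


def scan(source: Iterable[Row], predicate: Callable[[Row], bool]) -> List[Row]:
    """Stand-in for a connector scan: the filter is applied at the source."""
    return [row for row in source if predicate(row)]


def hash_join(left: List[Row], right: List[Row], key: str) -> List[Row]:
    """Join two row sets on a shared key by indexing the right side."""
    index: Dict[object, List[Row]] = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


# Two pretend sources: an operational database and files in object storage.
customers = [
    {"customer_id": 1, "name": "Ada", "region": "EU"},
    {"customer_id": 2, "name": "Bo", "region": "US"},
]
orders = [
    {"customer_id": 1, "amount": 250},
    {"customer_id": 1, "amount": 40},
    {"customer_id": 2, "amount": 900},
]

# Each predicate is pushed to its own source, so fewer rows reach the join.
eu_customers = scan(customers, lambda r: r["region"] == "EU")
large_orders = scan(orders, lambda r: r["amount"] > 100)

joined = hash_join(eu_customers, large_orders, "customer_id")
result = [(row["name"], row["amount"]) for row in joined]
print(result)  # [('Ada', 250)]
```

Filtering before the join is what keeps the amount of data moved between systems small; without pushdown, every row from both sources would have to travel to the engine first.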

The Starburst Data Products feature introduces a governed data access layer built on top of the raw Trino query engine. Data products are defined views with descriptions, owners, access policies, and quality metrics that present curated datasets to data consumers. This catalog-based discovery and governance capability helps organizations build well-organized, trustworthy data ecosystems at scale.
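As a rough mental model, a data product can be pictured as a view definition bundled with its governance metadata. The Python sketch below is purely illustrative: the DataProduct class, its field names, and the missing_metadata check are hypothetical and do not reflect Starburst's actual data model or API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataProduct:
    """Hypothetical record: a curated view plus the metadata that governs it."""
    name: str
    view_sql: str
    description: str
    owner: str
    allowed_roles: List[str] = field(default_factory=list)
    quality_metrics: Dict[str, float] = field(default_factory=dict)

    def missing_metadata(self) -> List[str]:
        """Return the governance fields that are still empty."""
        checks = {
            "description": self.description,
            "owner": self.owner,
            "allowed_roles": self.allowed_roles,
            "quality_metrics": self.quality_metrics,
        }
        return [name for name, value in checks.items() if not value]


complete = DataProduct(
    name="eu_revenue",
    view_sql="SELECT region, SUM(amount) AS revenue FROM lake.sales.orders GROUP BY region",
    description="Revenue by region, refreshed from the orders table",
    owner="finance-data@example.com",
    allowed_roles=["analyst"],
    quality_metrics={"completeness": 0.99},
)
draft = DataProduct(name="draft", view_sql="SELECT 1", description="", owner="")

print(complete.missing_metadata())  # []
print(draft.missing_metadata())
# ['description', 'owner', 'allowed_roles', 'quality_metrics']
```

A check like this could gate publication, so that consumers only ever discover datasets with an owner and a description attached.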

Column masking and row-level security policies in Starburst provide dynamic data access control. Column masking applies obfuscation rules to sensitive columns such as credit card numbers and email addresses based on the requesting user's role. Row-level security restricts which rows each user can see based on policy expressions. These controls are enforced at query time, ensuring that access policies are applied consistently regardless of which BI tool or query client is used.
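The effect of these two controls can be sketched in a few lines of Python. This is a simulation of the concept, not Starburst's policy engine or syntax: the role names, the policy tables, and the masking functions are all assumptions for illustration.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]


def mask_card(value: str) -> str:
    """Hide all but the last four digits."""
    return "*" * (len(value) - 4) + value[-4:]


def mask_email(value: str) -> str:
    """Keep the first character of the local part and the full domain."""
    local, _, domain = value.partition("@")
    return local[:1] + "***@" + domain


# Hypothetical policies keyed by role.
COLUMN_MASKS: Dict[str, Dict[str, Callable[[str], str]]] = {
    "analyst": {"card_number": mask_card, "email": mask_email},
    "auditor": {},  # auditors see raw values
}
ROW_FILTERS: Dict[str, Callable[[Dict[str, str], Row], bool]] = {
    "analyst": lambda user, row: row["region"] == user["region"],
    "auditor": lambda user, row: True,
}


def apply_policies(user: Dict[str, str], rows: List[Row]) -> List[Row]:
    """Filter rows first, then mask columns, based on the user's role."""
    keep = ROW_FILTERS[user["role"]]
    masks = COLUMN_MASKS[user["role"]]
    visible = []
    for row in rows:
        if not keep(user, row):
            continue
        visible.append(
            {col: masks[col](val) if col in masks else val for col, val in row.items()}
        )
    return visible


rows = [
    {"name": "Ada", "email": "ada@example.com",
     "card_number": "4111111111111111", "region": "EU"},
    {"name": "Bo", "email": "bo@example.com",
     "card_number": "5500000000000004", "region": "US"},
]

analyst_view = apply_policies({"role": "analyst", "region": "EU"}, rows)
auditor_view = apply_policies({"role": "auditor", "region": "EU"}, rows)

print(analyst_view)
# [{'name': 'Ada', 'email': 'a***@example.com',
#   'card_number': '************1111', 'region': 'EU'}]
print(len(auditor_view))  # 2
```

Because the policies run inside the query path rather than in the client, the same result would reach a BI dashboard, a notebook, or a command-line client.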

Starburst Galaxy is the fully managed cloud service that provides Starburst on AWS, Azure, and Google Cloud with automatic scaling, serverless query execution, and simplified cluster management. Starburst Enterprise is the self-managed version for deployment on Kubernetes in customer-controlled infrastructure.

Starburst integrates natively with popular BI tools including Tableau, Looker, Power BI, Superset, and others through JDBC and ODBC drivers, enabling existing BI workflows to leverage Starburst's cross-source query capabilities without changing the BI tool.

Positioning

Starburst is the commercial company behind Trino (formerly PrestoSQL), providing an enterprise-grade analytics engine that queries data wherever it lives without requiring movement or duplication. The platform connects to dozens of data sources — from data lakes and warehouses to relational databases and object storage — and lets analysts run standard SQL across all of them as if they were a single database.

Built for organizations that reject the idea of consolidating everything into one warehouse, Starburst embraces a data mesh and data lakehouse philosophy where data stays in place and compute comes to the data. This approach reduces infrastructure costs, eliminates stale copies, and respects data sovereignty requirements that prevent centralization.

What You Get

  • Starburst Galaxy
    Fully managed SaaS analytics platform with auto-scaling compute clusters, built-in data cataloging, and usage-based pricing
  • Starburst Enterprise
    Self-managed Trino distribution with enterprise security, fault tolerance, and performance optimizations for on-premises deployment
  • Data Products
    Curated, governed datasets published as reusable products with access controls, quality metrics, and discoverability features
  • Query Federation
    Run single SQL queries that join data across PostgreSQL, S3, Snowflake, MongoDB, Elasticsearch, and 40+ other connectors
  • Access Control
    Fine-grained role-based and attribute-based access control with column masking, row filtering, and audit logging

Core Areas

Data Lakehouse Analytics

High-performance SQL queries on open table formats like Apache Iceberg, Delta Lake, and Hudi stored in cloud object storage

Federated Queries

Cross-system analytics that join data from multiple sources in a single query without ETL pipelines or data movement

Data Mesh Implementation

Tools for publishing, discovering, and governing domain-owned data products across decentralized organizational structures

Cost Optimization

Reducing warehouse spend by running analytical queries directly on cheaper storage tiers without loading data into expensive compute platforms

Why It Matters

The data warehouse approach — moving all data into one system before analyzing it — creates expensive, fragile pipelines and stale copies that erode trust in analytics. Starburst inverts this model by bringing SQL compute to distributed data, which means analysts get fresh results from authoritative sources while organizations avoid the cost of maintaining redundant copies.

As data gravity increases and regulatory requirements constrain data movement, the ability to query in place becomes essential rather than optional. Starburst makes federated analytics performant enough for production workloads, not just occasional exploration.
