Parts of this document may be machine translated. If you would like to help with translation and proofreading, please contact us at dev@doris.apache.org.

Welcome to

Apache Doris

A modern, high-performance MPP analytical database

A modern, high-performance, real-time analytical database based on MPP.

Apache Doris is known for its high performance and ease of use. It returns query results over massive data sets with sub-second response times, and it supports both highly concurrent point queries and high-throughput complex ad-hoc queries.

Data Ingestion

Apache Doris provides a rich set of data ingestion methods and supports importing data from local files, sockets, AWS S3, Apache Hadoop, Apache Flink, Apache Spark, Apache Kafka, Apache SeaTunnel, and other data sources or data processing components.

Data Access

Apache Doris supports dozens of external data sources, such as MySQL, Oracle, PostgreSQL, Apache Hive, Apache Iceberg, and Elasticsearch, so you can query data directly through Apache Doris even when it is not stored in Doris itself.

Data Application

Apache Doris supports exporting data to downstream applications via the standard JDBC protocol, and BI and client tools can connect to Doris via the MySQL protocol. As a result, Apache Doris is well suited to business scenarios such as multidimensional reporting, user profiling, ad-hoc querying, and real-time dashboards.

Why Choose Apache Doris

  • Ultimate Query Performance

    A highly efficient columnar storage engine and a modern MPP architecture, combined with intelligent materialized views, a vectorized execution engine, and a variety of indexes, deliver outstanding query performance.

  • Easy to Use

    Fully compatible with the MySQL protocol and standard SQL, which makes it user-friendly. Supports online schema change and pre-aggregated rollups, and integrates easily with existing systems.

  • Hybrid Batch-Stream

    Supports efficient import of both offline batch data and real-time streaming data, with data freshness at the level of seconds. Multi-Version Concurrency Control (MVCC), combined with transactional imports, resolves read/write conflicts and provides exactly-once semantics.

  • Easy to Maintain

    Highly integrated, with no dependencies on external components. The cluster can be scaled elastically while online. The system is highly available, with automatic data recovery after a node failure and automatic load balancing of data and requests.

  • Diverse Data Ecosystem

    Supports loading from and accessing a variety of heterogeneous data sources, is compatible with a wide range of big data ecosystem components, and is adapted to mainstream BI tools, closing the loop from data processing to data analysis.

  • Ultra-high Concurrency

    Supports tens of thousands of concurrent users in real production environments. With flexible resource management, it handles both highly concurrent point queries and high-throughput ad-hoc queries.
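The MVCC idea mentioned under Hybrid Batch-Stream can be sketched in a few lines. This is a toy illustration only, not how Apache Doris implements it: the `VersionedTable` class and its methods are hypothetical names invented for this example. It shows how a query that pins a version keeps a consistent view while a later import commits.

```python
# Illustrative sketch only: a toy versioned store, not Doris internals.
# VersionedTable, load, snapshot and read are hypothetical names.
class VersionedTable:
    def __init__(self):
        self._versions = {}        # committed version number -> rows in that batch
        self._visible_version = 0  # highest version that readers may see

    def load(self, rows):
        """Commit a batch as the next version; it becomes visible in one step."""
        new_version = self._visible_version + 1
        self._versions[new_version] = list(rows)  # stage the whole batch first
        self._visible_version = new_version       # then publish it
        return new_version

    def snapshot(self):
        """Pin the version that a query will read."""
        return self._visible_version

    def read(self, version):
        """Return all rows committed at or before the pinned version."""
        return [
            row
            for v in sorted(self._versions)
            if v <= version
            for row in self._versions[v]
        ]


table = VersionedTable()
table.load(["a", "b"])
pinned = table.snapshot()   # a query starts and pins the current version
table.load(["c"])           # a later import commits while the query runs

rows_seen_by_query = table.read(pinned)
rows_seen_by_new_query = table.read(table.snapshot())
```

The running query never sees the partially or newly loaded batch, so reads and writes do not block each other in this sketch.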

Core Features

As a mature analytical database project, Apache Doris has several widely recognized core features that each improve query performance in a different way.

  • Vectorized Query Execution

    Using state-of-the-art vectorized execution technology, it takes full advantage of the
    parallel computing capability of modern CPUs and significantly improves performance in
    many query scenarios.

  • Intelligent Materialized View

    Querying a materialized view, which is a pre-computed data set, is faster than running the same query against the base table, and Apache Doris automatically selects the best match among all materialized views.

  • Column-Oriented Database

    Apache Doris is a column-oriented database, which is better suited to analytical scenarios:
    it reduces the amount of data scanned and achieves a very high data compression ratio.

  • Rich Index Structure

    Apache Doris has a rich set of index structures that speed up data reading and filtering and support highly concurrent online services. A single node can support up to thousands of QPS.
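To make the column-oriented point above concrete, here is a minimal sketch of why a column layout scans less data and compresses well. It is a toy model under stated assumptions, not the Doris storage format: the sample table, `to_columns`, and `run_length_encode` are all hypothetical and exist only for this illustration.

```python
# Illustrative sketch only: a toy row layout vs. column layout, not Doris internals.
from itertools import groupby

# Hypothetical sample table with columns (city, device, clicks).
rows = [
    ("Beijing", "ios", 3),
    ("Beijing", "ios", 5),
    ("Beijing", "android", 2),
    ("Shanghai", "android", 7),
    ("Shanghai", "android", 1),
]


def to_columns(rows, names):
    """Pivot row tuples into one list of values per column."""
    return {name: [r[i] for r in rows] for i, name in enumerate(names)}


def run_length_encode(values):
    """Collapse runs of equal adjacent values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


columns = to_columns(rows, ["city", "device", "clicks"])

# An aggregate over one column reads only that column's values,
# instead of every field of every row.
total_clicks = sum(columns["clicks"])

# Values of the same column sit together, so repeated values form long runs
# that a simple encoding can shrink.
encoded_city = run_length_encode(columns["city"])
```

In this sketch the five `city` values shrink to two (value, count) pairs, and the sum touches 5 values rather than all 15 fields of the table.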