Last updated on May 22, 2026

Apache Doris Table Design Guide

In Apache Doris, table design directly determines write performance, query efficiency, and storage cost. This chapter is organized around typical design scenarios to help you design a table from scratch or optimize the structure of an existing table.

Design Workflow

Designing a Doris table typically involves the following steps:

Choose a table model: decide how data is stored, deduplicated, and aggregated.
Define columns and data types: choose appropriate types for your business fields.
Plan partitioning and bucketing: distribute data evenly and enable pruning.
Add indexes: accelerate point queries, range scans, or full-text search.
Optimize storage: reduce cost through compression, hybrid row-column storage, or hot/cold tiering.
Evolve the schema: adjust the schema or use auto-increment columns as the business changes.

Creating a Table

Use the CREATE TABLE statement to create a table in Doris. To derive a new table from an existing one, you can use:

CREATE TABLE ... LIKE: reuse the structure of an existing table.
CREATE TABLE ... AS SELECT (CTAS): create a table based on the result of a query.

Table Name Rules

Table names in Doris are case-sensitive by default and have a maximum length of 64 bytes. You can adjust this with the following configurations:

Configuration	Default	Description
lower_case_table_names	Case-sensitive	Configured only when the cluster is first initialized; cannot be changed afterward
table_name_length_limit	64 bytes	Setting it too large is not recommended

Key Table Properties

When creating a table, you can specify common properties in PROPERTIES. See CREATE TABLE for details:

Property	Purpose
`buckets`	Determines the granularity of data distribution within the table
`storage_medium`	Controls the storage medium, such as HDD, SSD, or remote shared storage
`replication_num`	Controls the number of data replicas for redundancy and reliability
`storage_policy`	Configures the hot/cold data tiering migration policy

tip

Table properties apply at the partition level. Modifying table properties only takes effect for partitions created in the future; existing partitions are not affected. See ALTER TABLE PROPERTY and ALTER TABLE DISTRIBUTION for details. Dynamic partitioning can set these properties separately.

Choosing a Table Model

Table Model Overview

Compare the Duplicate, Aggregate, and Unique models, and choose the right data storage and deduplication approach for your scenario

Duplicate Model

Preserve raw detail data, suitable for logs, events, and other scenarios that do not require deduplication or aggregation

Unique Model

Deduplicate by primary key and support high-concurrency real-time updates, suitable for CDC, orders, and other scenarios where data is updated by primary key

Aggregate Model

Pre-aggregate data by dimensions to improve performance for reporting and multi-dimensional analysis queries

Model Selection Tips

Recommendations for table models and Key column design based on typical business scenarios

Defining Columns and Data Types

Data Types

Reference all data types supported by Doris, including numeric, string, date, and semi-structured types

Planning Partitioning and Bucketing

Data Distribution Concepts

Understand how partitioning and bucketing map data to tablets to fully leverage the storage and compute capabilities of multiple nodes

Basic Concepts

Explanations of basic terms related to data distribution, including partition, bucket, tablet, and replica

Manual Partitioning

Explicitly define partition ranges at table creation, suitable for scenarios with fixed partitions and a need for fine-grained control

Dynamic Partitioning

Automatically create and reclaim partitions along the time dimension to simplify the operations of time-series tables

Auto Partitioning

Automatically create partitions based on incoming data, eliminating the need to define partition ranges in advance

Data Bucketing

Choose appropriate bucketing columns and the number of buckets so that data is evenly distributed within partitions

Common Issues

Common mistakes and troubleshooting methods in partitioning and bucketing design

Accelerating Queries: Indexes

Index Overview

Compare the various index types supported by Doris and their applicable query scenarios

Prefix Index

A built-in index based on the sort key that accelerates equality and range queries

BloomFilter Index

Accelerate equality queries on high-cardinality columns and reduce unnecessary data scans

NGram BloomFilter Index

Accelerate LIKE fuzzy match queries without introducing a full inverted index

Inverted Index (Full-Text Search)

Support tokenized full-text search and keyword and phrase matching, suitable for log and text analysis

Vector Index

Accelerate similarity search on high-dimensional vectors based on algorithms such as HNSW and IVF

Optimizing Storage

Storage Format

Learn about the design and advantages of the Doris columnar storage format V3

Data Compression

Choose an appropriate compression algorithm to balance storage cost and query performance

Hybrid Row-Column Storage

Add row storage on top of column storage to accelerate point queries on wide tables

Hot/Cold Data Tiering

Move cold data down to SSD/HDD or remote storage such as S3 and HDFS to reduce storage cost

Evolving the Schema

Schema Change

Add, drop, or modify columns and types with ALTER TABLE, and learn the difference between lightweight and heavyweight changes

Auto-Increment Column

Use an Auto Increment Column to generate unique numeric IDs automatically and simplify data writes

Special Tables and Best Practices

Temporary Tables

Break complex SQL into multiple steps and stage intermediate results for easier debugging and reuse

Table Design Best Practices

A summary of key design recommendations for table models, Key columns, partitioning and bucketing, indexes, and more

Design Considerations

Design Item	Description	Impact
Data model	Cannot be modified after the table is created; choose the right data model in advance	Determines subsequent write and query patterns
Number of buckets	The number of buckets in already-created partitions cannot be modified, but can be adjusted via partition replacement; the number of buckets for future partitions in dynamic partitioning can be modified	Affects data balance and query parallelism
Column changes	Adding or dropping VALUE columns is a lightweight operation (completes in seconds); adding or dropping KEY columns or modifying data types is a heavyweight operation	Avoid KEY column changes on large datasets when possible
Storage policy	Use hot/cold tiering to migrate cold data to HDD, S3, or HDFS	Significantly reduces storage cost

Design Workflow​

Creating a Table​

Table Name Rules​

Key Table Properties​

Choosing a Table Model​