Skip to main content

Apache Doris Table Design Guide

In Apache Doris, table design directly determines write performance, query efficiency, and storage cost. This chapter is organized around typical design scenarios to help you design a table from scratch or optimize the structure of an existing table.

Design Workflow

Designing a Doris table typically involves the following steps:

  1. Choose a table model: decide how data is stored, deduplicated, and aggregated.
  2. Define columns and data types: choose appropriate types for your business fields.
  3. Plan partitioning and bucketing: distribute data evenly and enable pruning.
  4. Add indexes: accelerate point queries, range scans, or full-text search.
  5. Optimize storage: reduce cost through compression, hybrid row-column storage, or hot/cold tiering.
  6. Evolve the schema: adjust the schema or use auto-increment columns as the business changes.

Creating a Table

Use the CREATE TABLE statement to create a table in Doris. To derive a new table from an existing one, you can use:

Table Name Rules

Table names in Doris are case-sensitive by default and have a maximum length of 64 bytes. You can adjust this with the following configurations:

ConfigurationDefaultDescription
lower_case_table_namesCase-sensitiveConfigured only when the cluster is first initialized; cannot be changed afterward
table_name_length_limit64 bytesSetting it too large is not recommended

Key Table Properties

When creating a table, you can specify common properties in PROPERTIES. See CREATE TABLE for details:

PropertyPurpose
bucketsDetermines the granularity of data distribution within the table
storage_mediumControls the storage medium, such as HDD, SSD, or remote shared storage
replication_numControls the number of data replicas for redundancy and reliability
storage_policyConfigures the hot/cold data tiering migration policy
tip

Table properties apply at the partition level. Modifying table properties only takes effect for partitions created in the future; existing partitions are not affected. See ALTER TABLE PROPERTY and ALTER TABLE DISTRIBUTION for details. Dynamic partitioning can set these properties separately.

Choosing a Table Model

Defining Columns and Data Types

Planning Partitioning and Bucketing

Accelerating Queries: Indexes

Optimizing Storage

Evolving the Schema

Special Tables and Best Practices

Design Considerations

Design ItemDescriptionImpact
Data modelCannot be modified after the table is created; choose the right data model in advanceDetermines subsequent write and query patterns
Number of bucketsThe number of buckets in already-created partitions cannot be modified, but can be adjusted via partition replacement; the number of buckets for future partitions in dynamic partitioning can be modifiedAffects data balance and query parallelism
Column changesAdding or dropping VALUE columns is a lightweight operation (completes in seconds); adding or dropping KEY columns or modifying data types is a heavyweight operationAvoid KEY column changes on large datasets when possible
Storage policyUse hot/cold tiering to migrate cold data to HDD, S3, or HDFSSignificantly reduces storage cost