Local Disk Tiered Storage

Doris supports tiered storage across local SSD and HDD. Combined with dynamic partitions, Doris keeps hot data on SSD and automatically migrates it to HDD once it cools down, which preserves fast reads and writes on hot data while reducing overall storage cost.

Applicable Scenarios

This document applies to the following scenarios:

  • Table data is partitioned by time and exhibits clear hot-cold access characteristics.
  • The cluster has both SSD and HDD storage media.
  • You want to use SSD to accelerate queries on recent hot data and HDD to save cost on historical cold data.
  • You want dynamic partitions to automatically manage the data lifecycle, avoiding manual migration.

Quick Navigation

  • Core Concepts: the relationship between dynamic partitions and tiered storage.
  • Parameter Reference: how to use hot_partition_num and storage_medium.
  • Usage Example: table creation SQL and partition distribution verification.
  • FAQ: common questions during use.
  • Troubleshooting: handling exceptions such as partition creation failures.

Core Concepts

Tiered storage is built on dynamic partitions. Doris chooses the storage medium for each partition according to how recent the partition is, and migrates the partition's data to the target medium once its cooldown time is reached.

Hot Partitions and Cold Partitions

| Type | Description | Storage Medium | Performance Characteristics |
| --- | --- | --- | --- |
| Hot partition | Recently active, frequently accessed partition | SSD | High IOPS, low latency |
| Cold partition | Historical data, accessed less frequently | HDD | Large capacity, low cost |

How It Works

The execution flow of tiered storage is as follows:

  1. Enable dynamic partitions when creating the table, and set dynamic_partition.storage_medium = HDD.
  2. Use dynamic_partition.hot_partition_num to designate the most recent N partitions as hot partitions, stored on SSD.
  3. The system sets a storage_cooldown_time for each hot partition.
  4. Once the cooldown time is reached, partition data is automatically migrated from SSD to HDD.
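
The cooldown rule in steps 3 and 4 can be sketched in a few lines of Python. This is an illustrative model only, not Doris code: the Partition class and the expected_medium function are hypothetical names, and the actual migration is carried out by the system in the background, so it may complete some time after the cooldown time passes.

```python
from dataclasses import dataclass
from datetime import datetime

# Cooldown value the document shows for partitions that never migrate.
NEVER = datetime(9999, 12, 31, 23, 59, 59)


@dataclass
class Partition:
    name: str
    storage_medium: str              # medium assigned at creation: "SSD" or "HDD"
    storage_cooldown_time: datetime  # when an SSD partition is due to move to HDD


def expected_medium(p: Partition, now: datetime) -> str:
    """Medium the partition is expected to end up on at time `now` (model only)."""
    if p.storage_medium == "SSD" and now >= p.storage_cooldown_time:
        return "HDD"  # cooldown reached: data is due to migrate to HDD
    return p.storage_medium


hot = Partition("p20210519", "SSD", datetime(2021, 5, 21))
cold = Partition("p20210517", "HDD", NEVER)

print(expected_medium(hot, datetime(2021, 5, 20, 12)))  # SSD
print(expected_medium(hot, datetime(2021, 5, 21)))      # HDD
print(expected_medium(cold, datetime(2021, 5, 21)))     # HDD
```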

For more about dynamic partitions, see Data Partitioning - Dynamic Partition.

Parameter Reference

Tiered storage relies on the following two dynamic partition parameters:

| Parameter | Purpose | Default | Notes |
| --- | --- | --- | --- |
| dynamic_partition.hot_partition_num | Specifies how many of the most recent partitions are hot, stored on SSD | None | Must be used together with storage_medium = HDD |
| dynamic_partition.storage_medium | Specifies the final storage medium for dynamic partitions | HDD | When set to SSD, hot_partition_num no longer takes effect |

dynamic_partition.hot_partition_num

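  • Function: Designates the most recent N partitions (the current partition and the N-1 partitions before it) as hot partitions. Hot partitions, together with any partitions pre-created for future dates, are stored on SSD, while older partitions are stored on HDD.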
  • Conditions for use:
    • You must also set dynamic_partition.storage_medium = HDD. Otherwise this parameter does not take effect.
    • At least one SSD storage path must be configured on the BE nodes. Otherwise partition creation fails.

Example:

Assume the current date is 2021-05-20, partitions are by day, and the dynamic partition configuration is as follows:

dynamic_partition.hot_partition_num = 2
dynamic_partition.start = -3
dynamic_partition.end = 3

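The system automatically creates the following partitions. The two most recent partitions (p20210519 and p20210520) and the three future partitions are placed on SSD, each with a cooldown time equal to its lower bound plus 2 days. The two older partitions are placed directly on HDD and never migrate: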

p20210517: ["2021-05-17", "2021-05-18") storage_medium=HDD storage_cooldown_time=9999-12-31 23:59:59
p20210518: ["2021-05-18", "2021-05-19") storage_medium=HDD storage_cooldown_time=9999-12-31 23:59:59
p20210519: ["2021-05-19", "2021-05-20") storage_medium=SSD storage_cooldown_time=2021-05-21 00:00:00
p20210520: ["2021-05-20", "2021-05-21") storage_medium=SSD storage_cooldown_time=2021-05-22 00:00:00
p20210521: ["2021-05-21", "2021-05-22") storage_medium=SSD storage_cooldown_time=2021-05-23 00:00:00
p20210522: ["2021-05-22", "2021-05-23") storage_medium=SSD storage_cooldown_time=2021-05-24 00:00:00
p20210523: ["2021-05-23", "2021-05-24") storage_medium=SSD storage_cooldown_time=2021-05-25 00:00:00
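
The listing above can be reproduced with a short Python sketch. The rule it encodes (a partition is hot if it is no older than today minus hot_partition_num - 1 days, and its cooldown time is its lower bound plus hot_partition_num days) is inferred from this example rather than taken from the Doris source code, and daily_layout is a hypothetical helper name. It only models daily partitions with hot_partition_num of at least 1.

```python
from datetime import date, datetime, timedelta

# Cooldown value the document shows for partitions that never migrate.
NEVER = datetime(9999, 12, 31, 23, 59, 59)


def daily_layout(today: date, start: int, end: int, hot_partition_num: int):
    """Return (name, medium, cooldown) for each daily partition from start to end.

    Assumes hot_partition_num >= 1 and storage_medium = HDD.
    """
    # Oldest partition that still counts as hot: today and the
    # hot_partition_num - 1 days before it. Future partitions are hot too.
    first_hot = today - timedelta(days=hot_partition_num - 1)
    layout = []
    for offset in range(start, end + 1):
        day = today + timedelta(days=offset)
        name = "p" + day.strftime("%Y%m%d")
        if day >= first_hot:
            # Cooldown seen in the example: lower bound + hot_partition_num days.
            lower = datetime(day.year, day.month, day.day)
            layout.append((name, "SSD", lower + timedelta(days=hot_partition_num)))
        else:
            layout.append((name, "HDD", NEVER))
    return layout


for name, medium, cooldown in daily_layout(date(2021, 5, 20), -3, 3, 2):
    print(name, medium, cooldown)
```

With the configuration from the example, the sketch yields seven partitions: p20210517 and p20210518 on HDD, and p20210519 through p20210523 on SSD with the cooldown times listed above.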

dynamic_partition.storage_medium

  • Function: Specifies the final storage medium for dynamic partitions. Valid values are HDD (default) or SSD.
  • Notes:
    • When set to SSD, the hot_partition_num parameter is ignored.
    • In this case all partitions use SSD storage, and the cooldown time is uniformly set to 9999-12-31 23:59:59, meaning no migration occurs.

Usage Example

The following steps show how to create a table that supports tiered storage and verify the storage medium distribution of the partitions.

Step 1: Create a Tiered Storage Table

Goal: Create a table with SSD/HDD tiered storage enabled, where the 2 most recent partitions and the pre-created future partitions use SSD, and older partitions use HDD.

CREATE TABLE tiered_table (k DATE)
PARTITION BY RANGE(k)()
DISTRIBUTED BY HASH (k) BUCKETS 5
PROPERTIES
(
"dynamic_partition.storage_medium" = "hdd",
"dynamic_partition.enable" = "true",
"dynamic_partition.time_unit" = "DAY",
"dynamic_partition.hot_partition_num" = "2",
"dynamic_partition.end" = "3",
"dynamic_partition.prefix" = "p",
"dynamic_partition.buckets" = "5",
"dynamic_partition.create_history_partition" = "true",
"dynamic_partition.start" = "-3"
);

Step 2: Check the Partition Storage Medium

Goal: Confirm that partitions are assigned to SSD and HDD as expected.

SHOW PARTITIONS FROM tiered_table;

Expected output, assuming the table was created on 2021-05-20 (shown here in simplified form): 7 partitions in total, of which 5 use SSD and 2 use HDD.

p20210517: ["2021-05-17", "2021-05-18") storage_medium=HDD storage_cooldown_time=9999-12-31 23:59:59
p20210518: ["2021-05-18", "2021-05-19") storage_medium=HDD storage_cooldown_time=9999-12-31 23:59:59
p20210519: ["2021-05-19", "2021-05-20") storage_medium=SSD storage_cooldown_time=2021-05-21 00:00:00
p20210520: ["2021-05-20", "2021-05-21") storage_medium=SSD storage_cooldown_time=2021-05-22 00:00:00
p20210521: ["2021-05-21", "2021-05-22") storage_medium=SSD storage_cooldown_time=2021-05-23 00:00:00
p20210522: ["2021-05-22", "2021-05-23") storage_medium=SSD storage_cooldown_time=2021-05-24 00:00:00
p20210523: ["2021-05-23", "2021-05-24") storage_medium=SSD storage_cooldown_time=2021-05-25 00:00:00
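
For tables with many partitions, the same check can be scripted. The Python sketch below counts partitions per medium from rows that have already been fetched. It assumes each row is a dict with a "StorageMedium" key. That key name is an assumption here, so adjust it to whatever your client returns for SHOW PARTITIONS. The count_by_medium helper and the sample rows are hypothetical.

```python
from collections import Counter


def count_by_medium(rows):
    """Count partitions per storage medium.

    `rows` is a list of dicts, one per row of SHOW PARTITIONS output.
    The "StorageMedium" key is an assumed column name.
    """
    return Counter(row["StorageMedium"] for row in rows)


# Hand-written sample rows mirroring the expected output above.
rows = (
    [{"PartitionName": f"p202105{d}", "StorageMedium": "HDD"} for d in (17, 18)]
    + [{"PartitionName": f"p202105{d}", "StorageMedium": "SSD"} for d in range(19, 24)]
)

counts = count_by_medium(rows)
print(counts["SSD"], counts["HDD"])
```

For the sample rows, counts["SSD"] is 5 and counts["HDD"] is 2, which matches the expected distribution.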

FAQ

Q1: What if hot_partition_num does not take effect?

Verify that dynamic_partition.storage_medium = HDD is also set. The hot partition configuration only takes effect when the final medium is HDD.

Q2: Can I use SSD storage only?

Yes. Set dynamic_partition.storage_medium to SSD, and all partitions use SSD with no cooldown migration. In this case there is no need to configure hot_partition_num.

Q3: How is data migrated after the cooldown time is reached?

When a partition's storage_cooldown_time is reached, the system automatically migrates the partition data from SSD to HDD without manual intervention.

Q4: What is the difference between tiered storage and hot-cold data archiving (such as object storage)?

SSD/HDD tiered storage is used for data movement between different local disk media, suitable for short-term to mid-term hot-cold separation. To archive historical data to object storage (S3, HDFS, and so on), see the documentation on hot-cold tiered storage.

Troubleshooting

| Error Symptom | Possible Cause | Solution |
| --- | --- | --- |
| Partition creation fails | No SSD device under the storage path | Configure an SSD storage path on the BE node, or switch to HDD-only storage |
| hot_partition_num does not take effect | storage_medium = HDD is not set | Also configure dynamic_partition.storage_medium = HDD |
| All partitions are SSD, no cooldown to HDD | storage_medium is set to SSD | Change storage_medium to HDD and configure hot_partition_num |
| Data is not migrated to HDD as expected | storage_cooldown_time has not been reached | Wait for the cooldown time to be reached, or check that the time setting is correct |
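
The two configuration-related rows in the table can be checked before a table is created. The following Python sketch is a hypothetical helper, not part of Doris: it inspects a dict of table properties and returns warnings for the cases described above. The other two rows depend on cluster state (available SSD paths and the current time), so they are not covered.

```python
def check_tiering_properties(props):
    """Return warnings for a dict of dynamic partition properties (illustrative)."""
    warnings = []
    medium = props.get("dynamic_partition.storage_medium")
    hot_num = props.get("dynamic_partition.hot_partition_num")

    if medium is not None and medium.upper() == "SSD":
        # Matches "All partitions are SSD, no cooldown to HDD".
        warnings.append(
            "storage_medium is SSD: all partitions stay on SSD "
            "and hot_partition_num is ignored"
        )
    elif hot_num is not None and medium is None:
        # Matches "hot_partition_num does not take effect".
        warnings.append(
            "hot_partition_num is set without storage_medium: "
            "also set dynamic_partition.storage_medium = HDD"
        )
    return warnings


props = {
    "dynamic_partition.storage_medium": "hdd",
    "dynamic_partition.hot_partition_num": "2",
}
print(check_tiering_properties(props))  # [] (no warnings for this configuration)
```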