Skip to main content

Manual Partitioning

Manual partitioning is a partitioning method in which the user explicitly specifies the partition rules at table creation time. This article describes how to use manual partitioning in Doris, including the syntax, use cases, and best practices for Range partitioning, List partitioning, and NULL partitioning.

Partition column rules

Before using manual partitioning, you need to understand the general rules for partition columns:

  • Partition columns can be one or more columns, and they must be KEY columns.
  • Regardless of the partition column type, partition values must be enclosed in double quotes when written.
  • There is no theoretical upper limit on the number of partitions. However, the default limit is 4096 partitions per table. To exceed this limit, modify the FE configurations max_multi_partition_num and max_dynamic_partition_num.
  • When a table is created without partitioning, the system automatically generates a single partition with the same name as the table that covers the full value range. This partition is invisible to users and cannot be modified or deleted.
  • You cannot create partitions whose ranges overlap.

Range partitioning

Range partitioning divides data into different partitions based on the value range of the partition column. The partition column is typically a time column, which makes it easy to manage old and new data.

Supported column types: DATE, DATETIME, TIMESTAMPTZ, TINYINT, SMALLINT, INT, BIGINT, LARGEINT.

Range partitioning supports the following four forms, suitable for different scenarios:

FormUse case
FIXED RANGEWhen you need precise control over the upper and lower bounds of each partition
LESS THANWhen you only care about the upper bound of a partition and append partitions in chronological order
BATCH RANGEBatch creation of numeric or time partitions with equal step size
MULTI RANGEBatch creation with different step sizes for different time ranges

1. FIXED RANGE

Defines a left-closed, right-open interval for the partition.

Syntax:

PARTITION BY RANGE(col1[, col2, ...])
(
PARTITION partition_name1 VALUES [("k1-lower1", "k2-lower1", "k3-lower1",...), ("k1-upper1", "k2-upper1", "k3-upper1", ...)),
PARTITION partition_name2 VALUES [("k1-lower1-2", "k2-lower1-2", ...), ("k1-upper1-2", MAXVALUE, ))
)

Example:

PARTITION BY RANGE(`date`)
(
PARTITION `p201701` VALUES [("2017-01-01"), ("2017-02-01")),
PARTITION `p201702` VALUES [("2017-02-01"), ("2017-03-01")),
PARTITION `p201703` VALUES [("2017-03-01"), ("2017-04-01"))
)

2. LESS THAN

Defines only the upper bound of the partition; the lower bound is determined by the upper bound of the previous partition.

Syntax:

PARTITION BY RANGE(col1[, col2, ...])
(
PARTITION partition_name1 VALUES LESS THAN MAXVALUE | ("value1", "value2", ...),
PARTITION partition_name2 VALUES LESS THAN MAXVALUE | ("value1", "value2", ...)
)

Example:

PARTITION BY RANGE(`date`)
(
PARTITION `p201701` VALUES LESS THAN ("2017-02-01"),
PARTITION `p201702` VALUES LESS THAN ("2017-03-01"),
PARTITION `p201703` VALUES LESS THAN ("2017-04-01"),
PARTITION `p2018` VALUES [("2018-01-01"), ("2019-01-01")),
PARTITION `other` VALUES LESS THAN (MAXVALUE)
)

3. BATCH RANGE

Creates RANGE partitions for numeric and time types in batch, defining a left-closed, right-open interval for each partition with a specified step size.

Syntax:

PARTITION BY RANGE(int_col)
(
FROM (start_num) TO (end_num) INTERVAL interval_value
)

PARTITION BY RANGE(date_col)
(
FROM ("start_date") TO ("end_date") INTERVAL num YEAR | num MONTH | num WEEK | num DAY | 1 HOUR
)

Example:

PARTITION BY RANGE(age)
(
FROM (1) TO (100) INTERVAL 10
)

PARTITION BY RANGE(`date`)
(
FROM ("2000-11-14") TO ("2021-11-14") INTERVAL 2 YEAR
)

4. MULTI RANGE

Creates RANGE partitions in batch with left-closed, right-open intervals, allowing different step sizes within the same statement.

Example:

PARTITION BY RANGE(col)
(
FROM ("2000-11-14") TO ("2021-11-14") INTERVAL 1 YEAR,
FROM ("2021-11-14") TO ("2022-11-14") INTERVAL 1 MONTH,
FROM ("2022-11-14") TO ("2023-01-03") INTERVAL 1 WEEK,
FROM ("2023-01-03") TO ("2023-01-14") INTERVAL 1 DAY,
PARTITION p_20230114 VALUES [('2023-01-14'), ('2023-01-15'))
)

List partitioning

List partitioning divides data into different partitions based on the enumerated values of the partition column. A row is routed to a partition only if its value matches one of the enumerated values defined for that partition.

Supported column types: BOOLEAN, TINYINT, SMALLINT, INT, BIGINT, LARGEINT, DATE, DATETIME, TIMESTAMPTZ, CHAR, VARCHAR.

Syntax keyword: Use VALUES IN (...) to specify the enumerated values contained in each partition.

Single-column List partitioning

PARTITION BY LIST(city)
(
PARTITION `p_cn` VALUES IN ("Beijing", "Shanghai", "Hong Kong"),
PARTITION `p_usa` VALUES IN ("New York", "San Francisco"),
PARTITION `p_jp` VALUES IN ("Tokyo")
)

Multi-column List partitioning

PARTITION BY LIST(id, city)
(
PARTITION p1_city VALUES IN (("1", "Beijing"), ("1", "Shanghai")),
PARTITION p2_city VALUES IN (("2", "Beijing"), ("2", "Shanghai")),
PARTITION p3_city VALUES IN (("3", "Beijing"), ("3", "Shanghai"))
)

NULL partitioning

Partition columns must be NOT NULL by default. To use a nullable column as a partition column, set the following session variable:

SET allow_partition_column_nullable = true;

Different partition types handle NULL values as follows:

Partition typeNULL value handling
LIST partitionSupports a true NULL partition
RANGE partition (with LESS THAN)NULL values are placed in the smallest LESS THAN partition
RANGE partition (without LESS THAN)NULL values cannot be inserted

Scenario 1: LIST partition supports NULL

mysql> create table null_list(
-> k0 varchar null
-> )
-> partition by list (k0)
-> (
-> PARTITION pX values in ((NULL))
-> )
-> DISTRIBUTED BY HASH(`k0`) BUCKETS 1
-> properties("replication_num" = "1");
Query OK, 0 rows affected (0.11 sec)

mysql> insert into null_list values (null);
Query OK, 1 row affected (0.19 sec)

mysql> select * from null_list;
+------+
| k0 |
+------+
| NULL |
+------+
1 row in set (0.18 sec)

Scenario 2: RANGE partition routes NULL to the smallest LESS THAN partition

mysql> create table null_range(
-> k0 int null
-> )
-> partition by range (k0)
-> (
-> PARTITION p10 values less than (10),
-> PARTITION p100 values less than (100),
-> PARTITION pMAX values less than (maxvalue)
-> )
-> DISTRIBUTED BY HASH(`k0`) BUCKETS 1
-> properties("replication_num" = "1");
Query OK, 0 rows affected (0.12 sec)

mysql> insert into null_range values (null);
Query OK, 1 row affected (0.19 sec)

mysql> select * from null_range partition(p10);
+------+
| k0 |
+------+
| NULL |
+------+
1 row in set (0.18 sec)

Scenario 3: RANGE partition without LESS THAN cannot accept NULL

mysql> create table null_range2(
-> k0 int null
-> )
-> partition by range (k0)
-> (
-> PARTITION p200 values [("100"), ("200"))
-> )
-> DISTRIBUTED BY HASH(`k0`) BUCKETS 1
-> properties("replication_num" = "1");
Query OK, 0 rows affected (0.13 sec)

mysql> insert into null_range2 values (null);
ERROR 5025 (HY000): Insert has filtered data in strict mode, tracking_url=......