Storage Format V3
Doris Storage Format V3 is a major evolution from the Segment V2 format. Through metadata decoupling and encoding strategy optimization, it specifically improves performance for wide tables, complex data types (such as Variant), and cloud-native storage-compute separation scenarios.
Key Optimizations
External Column Meta
- Background: In Segment V2, metadata for all columns (ColumnMetaPB) is stored in the Footer of the Segment file. For wide tables with thousands of columns or auto-scaling Variant scenarios, the Footer can grow to several megabytes.
- Optimization: V3 decouples ColumnMetaPB from the Footer and stores it in a separate area within the file (External Column Meta Area).
- Benefits:
- Ultra-fast Metadata Loading: Significantly reduces Segment Footer size, speeding up initial file opening.
- On-demand Loading: Metadata can be loaded on demand from the independent area, reducing memory usage and improving cold start query performance on object storage (like S3/OSS).
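The idea of keeping the Footer small and fetching column metadata on demand can be sketched in Python as follows. This is a toy layout for illustration only, not the actual Doris on-disk format: the function names, the (offset, length) index, and the byte widths are all assumptions.

```python
import struct

def build_segment(column_metas):
    """Toy layout (assumed): [external meta area][footer index][4-byte footer length]."""
    area = bytearray()
    footer = bytearray()
    for name, meta in column_metas.items():
        encoded = name.encode("utf-8")
        # The footer keeps only a small pointer per column, not the metadata itself.
        footer += struct.pack("<H", len(encoded)) + encoded
        footer += struct.pack("<QI", len(area), len(meta))
        area += meta
    return bytes(area) + bytes(footer) + struct.pack("<I", len(footer))

def read_footer_index(segment):
    """Parse the small footer into {column name: (offset, length)}."""
    (footer_len,) = struct.unpack("<I", segment[-4:])
    footer = segment[-4 - footer_len:-4]
    index = {}
    pos = 0
    while pos < len(footer):
        (name_len,) = struct.unpack_from("<H", footer, pos)
        pos += 2
        name = footer[pos:pos + name_len].decode("utf-8")
        pos += name_len
        offset, length = struct.unpack_from("<QI", footer, pos)
        pos += 12  # 8-byte offset + 4-byte length
        index[name] = (offset, length)
    return index

def load_column_meta(segment, index, name):
    """Read the metadata bytes of a single column from the external area."""
    offset, length = index[name]
    return segment[offset:offset + length]
```

In this sketch, a query that touches one column out of a thousand parses the compact index and then reads a single metadata slice, rather than deserializing every column's metadata when the file is opened.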
Integer Type Plain Encoding
- Optimization: V3 defaults to PLAIN_ENCODING (raw binary storage) for numerical types (such as INT, BIGINT), instead of the traditional BitShuffle.
- Benefits: Combined with LZ4/ZSTD compression, PLAIN_ENCODING provides higher read throughput and lower CPU overhead. In modern high-speed IO environments, this "trading decompression for performance" strategy offers a clear advantage when scanning large volumes of data.
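As a rough illustration of plain encoding followed by general-purpose compression, the Python sketch below stores 64-bit integers as fixed-width little-endian bytes and compresses the page as a whole. The function names are hypothetical, the byte order is an assumption, and zlib stands in for LZ4/ZSTD only because it ships with the standard library.

```python
import struct
import zlib

def encode_plain_int64(values):
    """Plain encoding: each value is written as 8 raw little-endian bytes."""
    return struct.pack("<%dq" % len(values), *values)

def decode_plain_int64(page):
    """Decoding is a direct reinterpretation of the bytes, with no bit transform."""
    count = len(page) // 8
    return list(struct.unpack("<%dq" % count, page))

def write_page(values):
    # zlib is a stand-in here; the text above names LZ4/ZSTD as the compressors.
    return zlib.compress(encode_plain_int64(values))

def read_page(blob):
    return decode_plain_int64(zlib.decompress(blob))
```

The read path in this sketch is one decompression call followed by a fixed-width unpack, which is the kind of simple decode step the section credits for lower CPU overhead.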
Binary Plain Encoding V2
- Optimization: Introduces BINARY_PLAIN_ENCODING_V2, using a [length(varuint)][raw_data] streaming layout, replacing the old format that relied on trailing offset tables.
- Benefits: Eliminates large trailing offset tables, making data storage more compact and significantly reducing storage consumption for string and JSONB types.
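The difference between the two layouts can be sketched in Python. This is a minimal illustration under stated assumptions: the varuint is taken to be a LEB128-style encoding (7 payload bits per byte), and the offset-table variant assumes one 4-byte offset per value plus a 4-byte count. Neither is confirmed here as the exact Doris page format, and all function names are hypothetical.

```python
import struct

def encode_varuint(n):
    """LEB128-style unsigned varint (assumed encoding for the length prefix)."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def decode_varuint(buf, pos):
    """Return (value, next position) for a varuint starting at pos."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7

def encode_v2(values):
    """Streaming layout: [length(varuint)][raw_data] repeated per value."""
    out = bytearray()
    for v in values:
        out += encode_varuint(len(v))
        out += v
    return bytes(out)

def decode_v2(page):
    values = []
    pos = 0
    while pos < len(page):
        length, pos = decode_varuint(page, pos)
        values.append(page[pos:pos + length])
        pos += length
    return values

def encode_with_offset_table(values):
    """Assumed older shape: raw data, then a 4-byte offset per value, then a count."""
    data = bytearray()
    offsets = []
    for v in values:
        offsets.append(len(data))
        data += v
    trailer = struct.pack("<%dI" % len(offsets), *offsets)
    trailer += struct.pack("<I", len(offsets))
    return bytes(data) + trailer
```

Under these assumptions, a value shorter than 128 bytes costs one byte of length prefix instead of a four-byte offset entry, so the saving is largest for columns made up of many short strings.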
Design Philosophy
The design philosophy of V3 can be summarized as: "Metadata Decoupling, Encoding Simplification, and Streaming Layout". By reducing metadata processing bottlenecks and leveraging the high efficiency of modern CPUs in processing simple encodings, it achieves high-performance analysis under complex schemas.
Use Cases
- Wide Tables: Tables with more than 2000 columns or long column names.
- Semi-structured Data: Heavy use of VARIANT or JSON types.
- Tiered Storage/Cloud Native: Scenarios sensitive to object storage loading latency.
- High-performance Scanning: Analytical tasks with extreme requirements for scan throughput.
Usage
Enable When Creating a New Table
Specify storage_format as V3 in the PROPERTIES of the CREATE TABLE statement:
CREATE TABLE table_v3 (
    id BIGINT,
    data VARIANT
)
DISTRIBUTED BY HASH(id) BUCKETS 32
PROPERTIES (
    "storage_format" = "V3"
);