Blog/Release Notes

Apache Doris 2.1.8 just released

Apache Doris2025年1月24日

Dear Community, Apache Doris version 2.1.8 was officially released on January 24, 2025. This version introduces continuous upgrades and enhancements in several key areas, including Lakehouse, Asynchronous Materialized Views, Query Optimizer and Execution Engine, Storage Management, and more.

Behavior Changes

When querying a data source with case-insensitive table names (such as Hive) through External Catalog, in previous versions, you can use any case to query the table name, but in version 2.1.8, Doris's own table name case sensitivity policy will be strictly followed.
Add the environment variable SKIP_CHECK_ULIMIT to skip the ulimit value verification check within the BE process. This is only available to applications in the Docker quick start scenario. #45267
Add the enable_cooldown_replica_affinity session variable to control the selection of replica affinity for queries under cold - hot separation.
In FE, add the configurations restore_job_compressed_serialization and backup_job_compressed_serialization to solve the OOM problem of FE during backup and restore operations when the number of db tablets is extremely large. By default, these configurations are disabled, and once enabled, they cannot be downgraded.

New Features

The Arrow flight protocol supports accessing BE through a load - balancing device. #43281
Now lambda expressions support capturing external columns (#45186).#45186

Improvements

Lakehouse

Update the Hudi version to 0.15. And optimize the query planning performance of Hudi tables.
Optimize the read performance of MaxCompute partitioned tables. #45148
Support the session variable enable_text_validate_utf8, which can ignore the UTF8 encoding detection in CSV format. #45537
Optimize the performance of Parquet file lazy materialization under high - filtering - rate conditions. #46183

Asynchronous Materialized Views

Now it supports manually refreshing partitions that do not exist in an asynchronous materialized view #45290.
Optimize the performance of transparent rewrite planning #44786.

Query Optimizer

Improve the adaptive ability of runtime filters #42640.
Add the ability to generate original column filter conditions from filter conditions on max/min aggregate function columns #39252
Add the ability to extract single - side filter conditions from join predicates #38479.
Optimize the ability of predicate derivation on set operators to better generate filter predicates #39450.
Optimize the exception handling ability of statistic information collection and usage to avoid generating unexpected execution plans when collection exceptions occur. #43009 #43776 #43865 #42104 #42399 #41729

Query Execution Engine

Optimize the execution of queries with limit to end faster and avoid unnecessary data scanning #44255.

Storage Management

CCR supports more comprehensive operations, such as rename table, rename column, modify comment, drop view, drop rollup, etc.
Improve the accuracy of the broker load import progress and the performance when importing multiple compressed files.
Improve the routine load timeout strategy and thread - pool usage to prevent routine load timeout failures and impacts on queries.

Others

The Docker quick - start image supports starting without setting environment parameters. Add the environment variable SKIP_CHECK_ULIMIT to skip the start_be.sh script and the swap, max_map_count, ulimit - related verification checks within the BE process. This is only applicable to applications in the Docker quick - start scenario. #45269
Add the new LDAP configuration ldap_group_filter for custom group filtering. #43292
Optimize performance when using ranger. #41207
Fix the inaccurate statistics of scan bytes in the audit log. #45167
Now, the default values of columns can be correctly displayed in the COLUMNS system table. #44849
Now, the definition of views can be correctly displayed in the VIEWS system table. #45857
Now, the admin user cannot be deleted. #44751

Bug Fixes

Lakehouse

Hive

Fix the problem of being unable to query Hive views created by Spark. #43553
Fix the problem of being unable to correctly read some Hive Transaction tables. #45753
Fix the problem of incorrect partition pruning when Hive table partitions contain special characters. #42906

Iceberg

Fix the problem of being unable to create Iceberg tables in a Kerberos - authenticated environment. #43445
Fix the problem of inaccurate count(*) queries when there are dangling deletes in Iceberg tables in some cases. #44039
Fix the problem of query errors due to column name mismatches in Iceberg tables in some cases . #44470
Fix the problem of being unable to read Iceberg tables when their partitions are modified in some cases .#45367

Paimon

Fix the problem that the Paimon Catalog cannot access Alibaba Cloud OSS - HDFS. #42585

Hudi

Fix the problem of ineffective partition pruning in Hudi tables in some cases. #44669

JDBC

Fix the problem of being unable to obtain tables using the JDBC Catalog after enabling the case insensitive table name feature in some cases.

MaxCompute

Fix the problem of ineffective partition pruning in MaxCompute tables in some cases#44508.

Others

Fix the problem of FE memory leaks caused by EXPORT tasks in some cases.#44019
Fix the problem of being unable to access S3 object storage using the https protocol in some cases #44242.
Fix the problem of the inability to automatically refresh Kerberos authentication tickets in some cases #44916
Fix the problem of errors when reading Hadoop Block compressed format files in some cases. #45289
When querying ORC - formatted data, no longer push down CHAR - type predicates to avoid possible result errors. #45484

Asynchronous Materialized Views

Fix the problem that when there is a CTE in the materialized view definition, it cannot be refreshed #44857.
Fix the problem that when columns are added to the base table, the asynchronous materialized view cannot hit the transparent rewrite. #44867
Fix the problem that when the same filter predicate is included in different positions in a query, the transparent rewrite fails. #44575
Fix the problem that when column aliases are used in filter predicates or join predicates, the transparent rewrite cannot be performed. #44779

Inverted Index

Fix the problem of abnormal handling of inverted index compaction. #45773
Fix the problem that inverted index construction fails due to lock - waiting timeout. #43589
Fix the problem of inverted index write crashes in abnormal situations. #46075
Fix the null - pointer problem of the match function with special parameters. #45774
Fix problems related to the variant inverted index and disable the use of the index v1 format for variants #43971 #45179
Fix the problem of crashes when setting gram_size = 65535 for the ngram bloomfilter index #43654
Fix the problem of incorrect calculation of DATE and DATETIME for the bloomfilter index #43622
Fix the problem that dropping a column does not automatically drop the bloomfilter index #44478
Reduce the memory footprint when writing the bloomfilter index #46047

Semi-Structure Data

Optimize memory usage and reduce the memory consumption of the variant data type #43349 #44585 #45734
Optimize the performance of variant schema copy. #45731
Do not use variant as a key when automatically inferring tablet keys. #44736
Fix the problem of changing variant from NOT NULL to NULL #45734
Fix the problem of incorrect type inference of lambda functions. #45798
Fix the coredump problem at the boundary conditions of the ipv6_cidr_to_range function #46252

Query Optimizer

Fix the potential deadlock problem caused by mutual exclusion of table read locks and optimize the lock - using logic #45045 #43376 #44164 #44967 #45995.
Fix the problem that the SQL Cache function incorrectly uses constant folding, resulting in incorrect results when using functions containing time formats . #44631
Fix the problem of incorrect optimization of comparison expressions in edge cases, which may lead to incorrect results. #44054 #44725 #44922 #45735 #45868
Fix the problem of incorrect audit logs for high - concurrent point queries #43345 #44588
Fix the problem of continuous error reporting after an exception occurs in high - concurrent point queries #44582
Fix the problem of incorrectly prepared statements for some fields.#45732

Query Execution Engine

Fix the problem of incorrect results of regular expressions and like functions for special characters. #44547
Fix the problem that the SQL Cache may have incorrect results when switching databases. #44782
Fix the problem of incorrect results of the cut_ipv6 function. #43921
Fix the problem of casting from numeric types to bool types. #46275
Fix a series of problems related to arrow flight. #45661 #45023 #43960 #43929
Fix the problem of incorrect results in some cases when the hash table of hash join exceeds 4G. #46461
Fix the overflow problem of the convert_to function for Chinese characters. #46505

Storage Management

Fix the problem that high - concurrent DDL may cause FE startup failure.
Fix the problem that auto - increment columns may have duplicate values.
Fix the problem that routine load cannot use the newly expanded BE during expansion.

Permission Management

Fix the problem of frequent access to the Ranger service when using Ranger as the authentication plugin #45645.

Others

Fix the potential memory leak problem when enable_jvm_monitor=true is enabled on the BE side #44311.

Behavior Changes​

New Features​

Improvements​

Lakehouse​

Asynchronous Materialized Views​

Query Optimizer​

Query Execution Engine​

Storage Management​

Others​

Bug Fixes​

Lakehouse​

Hive​

Iceberg​

Paimon​

Hudi​

JDBC​

MaxCompute​

Others​

Asynchronous Materialized Views​

Inverted Index​

Semi-Structure Data​

Query Optimizer​

Query Execution Engine​

Storage Management​

Permission Management​

Others​

Behavior Changes

New Features

Improvements

Lakehouse

Asynchronous Materialized Views

Query Optimizer

Query Execution Engine

Storage Management

Others

Bug Fixes

Lakehouse

Hive

Iceberg

Paimon

Hudi

JDBC

MaxCompute

Others

Asynchronous Materialized Views

Inverted Index

Semi-Structure Data

Query Optimizer

Query Execution Engine

Storage Management

Permission Management

Others