Release 2.1.7
Dear community, Apache Doris version 2.1.7 was officially released on November 10, 2024. This version brings continuous upgrades and improvements to the Lakehouse, Async Materialized Views, and Semi-Structured Data Management. Additionally, several fixes have been implemented in areas such as the Query Optimizer and Permission Management.
Quick Download: https://doris.apache.org/download/
GitHub Release: https://github.com/apache/doris/releases
Behavior changes
- The following global variables will be forcibly set to the following default values:
- enable_nereids_dml: true
- enable_nereids_dml_with_pipeline: true
- enable_nereids_planner: true
- enable_fallback_to_original_planner: true
- enable_pipeline_x_engine: true
- New columns have been added to the audit log. #42262
- For more information, please refer to docs
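After upgrading, it can be useful to confirm that a cluster actually reports the forced defaults listed above. The following is a minimal sketch, not an official tool: the helper name `find_mismatches` is hypothetical, and it assumes the rows come from a `SHOW VARIABLES` query issued through any MySQL-protocol client, with the variable name and value in the first two columns.

```python
# Expected values copied from the behavior-change list above.
EXPECTED = {
    "enable_nereids_dml": "true",
    "enable_nereids_dml_with_pipeline": "true",
    "enable_nereids_planner": "true",
    "enable_fallback_to_original_planner": "true",
    "enable_pipeline_x_engine": "true",
}


def find_mismatches(rows):
    """Return {variable: actual_value} for every variable that differs.

    `rows` is an iterable of tuples whose first two items are the variable
    name and its value (assumption: this matches the SHOW VARIABLES output).
    A variable that is absent from `rows` is reported with the value None.
    """
    actual = {str(row[0]).lower(): str(row[1]).lower() for row in rows}
    return {
        name: actual.get(name)
        for name, wanted in EXPECTED.items()
        if actual.get(name) != wanted
    }
```

An empty result means all five variables match the values in these notes; fetching the rows is left to whichever client is already in use.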
New features
Async Materialized View
- Asynchronous materialized views now have a use_for_rewrite property that controls whether they participate in transparent rewriting. #40332
Query Execution
- The list of changed session variables is now output in the Profile. #41016
- Support for the trim_in, ltrim_in, and rtrim_in functions has been added. #42641
- Support for several URL functions (top_level_domain, first_significant_subdomain, cut_to_first_significant_subdomain) has been added. #42916
- The bit_set function has been added. #42916
- The count_substrings function has been added. #42055
- The translate and url_encode functions have been added. #41051
- The normal_cdf, to_iso8601, and from_iso8601_date functions have been added. #40695
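For readers unfamiliar with two of the new functions, the sketch below shows in plain Python what normal_cdf and count_substrings are generally understood to compute. It is not Doris's implementation. The argument order (mean, standard deviation, value), the non-overlapping counting rule, and the handling of an empty pattern are all assumptions made for illustration; the function documentation is the authority on exact behavior.

```python
import math


def normal_cdf(mean: float, sd: float, x: float) -> float:
    """P(X <= x) for X ~ Normal(mean, sd). Argument order is an assumption."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    # Standard identity: Phi(z) = (1 + erf(z / sqrt(2))) / 2
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def count_substrings(text: str, pattern: str) -> int:
    """Count non-overlapping occurrences of pattern (assumed semantics)."""
    if not pattern:
        return 0  # assumption: an empty pattern matches nothing
    return text.count(pattern)
```

Both helpers use only the Python standard library, so they can serve as a quick cross-check against query results on small inputs.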
Storage Management
- The information_schema.table_options and table_properties system tables have been added, supporting the querying of attributes set during table creation. #34384
- Support for bitmap_empty as a default value has been implemented. #40364
- A new session variable require_sequence_in_insert has been introduced to control whether a sequence column must be provided when performing INSERT INTO SELECT writes to a unique key table. #41655
Others
- Flame graphs can now be generated on the BE WebUI page. #41044
Improvements
Lakehouse
- Support for writing data to Hive text format tables. #40537
- For more information, please refer to docs
- Access MaxCompute data using MaxCompute Open Storage API. #41610
- For more information, please refer to docs
- Support for Paimon DLF Catalog. #41694
- For more information, please refer to docs
- Added table$partitions syntax to directly query Hive partition information. #41230
- For more information, please refer to docs
- Support for reading Parquet files in brotli compression format. #42162
- Support for reading DECIMAL 256 types in Parquet files. #42241
- Support for reading Hive tables in OpenCsvSerde format. #42939
Async Materialized View
- Refined the granularity of lock holding during the build process for asynchronous materialized views. #40402 #41010
Query Optimizer
- Improved the accuracy of statistic information collection and usage in extreme cases to enhance planning stability. #40457
- Runtime filters can now be generated in more scenarios to improve query performance. #40815
- Enhanced constant folding capabilities for numerical, date, and string functions to boost query performance. #40820
- Optimized the column pruning algorithm to enhance query performance. #41548
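As background on the constant folding item above, the following is a generic illustration of the technique on a toy expression tree. It is not the Doris optimizer: the `Const`, `Column`, and `Call` classes and the small `FOLDABLE` table are hypothetical names invented for this sketch.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Const:
    value: Union[int, float, str]


@dataclass(frozen=True)
class Column:
    name: str


@dataclass(frozen=True)
class Call:
    func: str
    args: tuple


# Pure functions that are safe to evaluate at planning time (illustrative set).
FOLDABLE = {
    "add": lambda a, b: a + b,
    "upper": lambda s: s.upper(),
    "concat": lambda *parts: "".join(parts),
}


def fold(expr):
    """Replace calls whose arguments are all constants with their result."""
    if not isinstance(expr, Call):
        return expr  # constants and column references are left untouched
    args = tuple(fold(arg) for arg in expr.args)  # fold children first
    if expr.func in FOLDABLE and all(isinstance(a, Const) for a in args):
        return Const(FOLDABLE[expr.func](*(a.value for a in args)))
    return Call(expr.func, args)
```

In this sketch, an expression such as `concat(upper('ab'), c)` is rewritten to `concat('AB', c)`: the constant part is computed once during planning, while the column reference is left for execution.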
Query Execution
- Supported parallel preparation to reduce the time consumed by short queries. #40270
- Corrected the names of some counters in the profile to match the audit logs. #41993
- Added new local shuffle rules to speed up certain queries. #40637
Storage Management
- The SHOW PARTITIONS command now supports displaying the commit version. #28274
- Checked for unreasonable partition expressions when creating tables. #40158
- Optimized the scheduling logic when encountering EOF in Routine Load. #40509
- Made Routine Load aware of schema changes. #40508
- Improved the timeout logic for Routine Load tasks. #41135
Others
- Allowed closing the built-in service port of BRPC via BE configuration. #41047
- Fixed issues with missing fields and duplicate records in audit logs. #43015
Bug fixes
Lakehouse
- Fixed the inconsistency in the behavior of INSERT OVERWRITE with Hive. #39840
- Cleaned up temporarily created folders to address the issue of too many empty folders on HDFS. #40424
- Resolved memory leaks in FE caused by using the JDBC Catalog in some cases. #40923
- Resolved memory leaks in BE caused by using the JDBC Catalog in some cases. #41266
- Fixed errors in reading Snappy compressed formats in certain scenarios. #40862
- Addressed potential FileSystem leaks on the FE side in certain scenarios. #41108
- Resolved issues where using EXPLAIN VERBOSE to view external table execution plans could cause null pointer exceptions in some cases. #41231
- Fixed the inability to read tables in Paimon parquet format. #41487
- Addressed performance issues introduced by compatibility changes in the JDBC Oracle Catalog. #41407
- Disabled predicate pushing down after implicit conversion to resolve incorrect query results in some cases with JDBC Catalog. #42242
- Fixed issues with case-sensitive access to table names in the External Catalog. #42261
Async Materialized View
- Fixed the issue where user-specified start times were not effective. #39573
- Resolved the issue of nested materialized views not refreshing. #40433
- Fixed the issue where materialized views might not refresh after the base table was deleted and recreated. #41762
- Addressed issues where partition compensation rewrites could lead to incorrect results. #40803
- Fixed potential errors in rewrite results when sql_select_limit was set. #40106
Semi-Structured Data Management
- Fixed the issue of index file handle leaks. #41915
- Addressed inaccuracies in the count() function of inverted indexes in special cases. #41127
- Fixed exceptions with variant when light schema change was not enabled. #40908
- Resolved memory leaks when variant returns arrays. #41339
Query Optimizer
- Corrected potential errors in nullable calculations for filter conditions during external table queries, which could lead to execution exceptions. #41014
- Fixed potential errors in optimizing range comparison expressions. #41356
Query Execution
- Fixed the issue where the match_regexp function could not correctly handle empty strings. #39503
- Resolved issues where the scanner thread pool could become stuck in high-concurrency scenarios. #40495
- Fixed errors in the results of the date_floor function. #41948
- Addressed incorrect cancel messages in some scenarios. #41798
- Fixed issues with excessive warning logs printed by arrow flight. #41770
- Resolved issues where runtime filters failed to send in some scenarios. #41698
- Fixed problems where some system table queries could not end normally or became stuck. #41592
- Addressed incorrect results from window functions. #40761
- Fixed issues where the encrypt and decrypt functions caused the BE to core dump. #40726
- Resolved errors in the results of the conv function. #40530
Storage Management
- Fixed import failures when Memtable migration was used in multi-replica scenarios with machine crashes. #38003
- Addressed inaccurate memory statistics in the Memtable flush phase during imports. #39536
- Fixed fault tolerance issues with Memtable migration in multi-replica scenarios. #40477
- Resolved inaccurate bvar statistics with Memtable migration. #40985
- Fixed inaccurate progress reporting for S3 loads. #40987
Permissions
- Fixed permission issues related to show columns, show sync, and show data from db.table. #39726
Others
- Fixed the issue where the audit log plugin for version 2.0 could not be used in version 2.1. #41400