New Features
Query & Execution
- Support function ARRAY_CROSS_PRODUCT (#64031)
- Add exponential_moving_average aggregate function (#63499)
- Support murmur_hash3_128 function (#63196)
- Add datasketches HLL sketch aggregate functions (#63143)
- Skip collecting stats for long string columns (#62686)
- Add information_schema role mappings table (#62077)
- Add FE constant folding for cosine_similarity and standardize test patterns (#60403)
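For readers unfamiliar with what folding cosine_similarity involves: when both arguments are constant arrays, the planner can compute the result once instead of per row. The following is a minimal Python sketch of the standard cosine similarity formula only, not the FE implementation. The function name and the zero-vector behavior are assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Standard cosine similarity: dot(a, b) / (|a| * |b|)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption for this sketch: a zero vector yields 0.0.
        # The real function may return NULL or raise an error instead.
        return 0.0
    return dot / (norm_a * norm_b)
```

Orthogonal vectors such as `[1, 0]` and `[0, 1]` give a similarity of 0, and parallel vectors give a value of approximately 1.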
Cloud Native
- Show compute group for MTMV refresh task (#63206)
Security & Authentication
- Integrate OIDC authentication, MySQL login bridge, and role mapping (#61819)
Improvement
Query & Execution
- Use real elapsed time to compute workload group metrics refresh interval (#63537)
- Support TIMESTAMPTZ in multiple aggregate and array functions (#62756)
Storage & Compaction
- Enable packed file and empty rowset optimization by default (#63475)
- Support configurable S3 credentials providers (#62788)
- Add enable_recycler config to skip recycler dynamically (#63286)
Bugfix
Index & Search
- Fix reading packed inverted index file on file cache miss (#64383)
- Fix ANN IVF/PQ recall, avoid init-time large ANN build-buffer reservation, and skip ANN index bui... (#64082)
- Guard LogicalView.computeOutput() against schema drift (IndexOutOfBoundsException) (#64007)
- Fix ANN range search state leakage and incorrect slot index tracking (#63666)
- Clamp variant_sparse_hash_shard_count to >=1 in SHOW CREATE output (#63661)
- Bind Variant search to nested indexes (#63660)
- Reject Lucene-syntax SEARCH on columns without inverted index (#63637)
- Preserve operative slots when deep copying logical relations (#63315)
- Preserve variant subfields in view definitions to fix select view result wrong when view select h... (#62907)
- Fix tokenize function incorrect result when first argument is const (#62699)
- Allow MATCH on aliased variant subcolumns (#63772)
- Improve AI function performance (#62494)
Query & Execution
- Reject TopN in correlated scalar subquery (#64251)
- Optimize floating fmod fast path (#64161)
- Make role-mapping keywords RULE/CEL/MAPPING non-reserved (#64104)
- Prevent unsafe runtime filter pushdown through outer joins (#64102)
- Fix datediff folding for zero date (#64084)
- Align convert_tz folding with BE DST handling (#64029)
- Merge struct_element into element_at (#64027)
- Fix TopN runtime filter activation (#63969)
- Preserve negative zero sign in SIGNBIT constant folding (#63954)
- Fix assert row join pushdown alias handling (#63892)
- Fix array subscript on pruned variant subpath (#63891)
- Align COM_RESET_CONNECTION behavior with MySQL (#63884)
- Preserve NaN in numeric constant folding (#63870)
- Avoid potential OOM when reading large snapshot splits (#63833)
- Cast variant subcolumn as json in variant_hirachinal for stable output (#63828)
- Fix changed variable output in show variables (#63734)
- Fix computeDestIdToInstanceId picking wrong ExchangeNode for multi-input fragments (#63615)
- Avoid unsigned underflow in JSON modify path (#63579)
- Handle legacy DecimalV2 segments with missing precision/frac (#63569)
- Preserve TIMESTAMPTZ values in sparse path (#63522)
- Align legacy literal compareLiteral with Nereids ComparableLiteral semantics (#63481)
- Reject COUNT DISTINCT on variant arguments (#63479)
- Compare JSON numeric values by value (#63396)
- Remove unsafe JsonbWriter key overload (#63355)
- Prevent invalid alias rewrite in view definitions (#63353)
- Reject non-positive topn count argument (#63350)
- Fix json contains duplicate array candidates (#63301)
- Reject super wildcard path in json keys (#63300)
- Report TIMESTAMPTZ as string to MySQL clients (#63292)
- Add null reject compensation for join rewrite (#63268)
- Reject lone UTF-16 surrogates in JSONB literals (RFC 8259 §8.2) (#63255)
- Fix alias function with cast outermost expr and reject illegal expressions (#63254)
- Support TIMESTAMPTZ in TopN runtime predicate (#63220)
- Reject JSONB and variant distribution columns (#63211)
- Move the pruning of predicates that are always true after partition pruning into the PlanPostProc... (#63111)
- Fix coalesce function incorrectly returning NULL (#63092)
- Keep first duplicate Variant JSON path (#63082)
- Avoid unioning query-unused MV partitions (#63081)
- Fix integer typing and prefer Variable.realExpression for argument/type resolution (#62524)
- Fix redundant aggregation in agg-union query plan (#62231)
- Pass ConnectContext to canUseNereidsDistributePlanner method instead of calling ConnectContext.get() (#60529)
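Several entries above concern floating-point edge cases in constant folding, such as the sign of negative zero in SIGNBIT. As a rough illustration of why this is easy to get wrong, the Python sketch below contrasts a comparison-based sign check with one that reads the actual sign bit. Both helper names are hypothetical and this is not the Doris code.

```python
import math


def naive_signbit(x: float) -> bool:
    # Comparison-based check: -0.0 < 0 is False, so the sign is lost.
    return x < 0


def fold_signbit(x: float) -> bool:
    # copysign transfers the sign bit of x onto 1.0, so -0.0 maps to -1.0.
    return math.copysign(1.0, x) < 0
```

Because `-0.0 == 0.0` is true under IEEE 754 comparison, any folding step that normalizes values through comparisons can silently drop the sign, which is why a bit-level check is needed.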
Storage & Compaction
- Add from-to cdc WAL-search timeout and stale-reader release (#64013)
- Validate recycle rowset key state during commit rowset (#63985)
- Reduce sparse variant parse memory (#63970)
- Avoid retaining full segment key bounds buffers (#63968)
- Release packed file writer buffer after flush (#63967)
- Normalize SC rowset graph before delete bitmap capture (#63960)
- Support cdc_client JVM opts and adopt externally-managed cdc_client (#63898)
- Remove the udf cache expiration_time property (#63897)
- Drain txn lazy committer workers before destruction (#63876)
- Filter nereidsPrunedTabletIds per partition in distributionPrune (#63851)
- Keep prefetch reader alive for async tasks (#63796)
- Avoid retrying object storage SlowDown errors (#63776)
- Remove single replica compaction (#63771)
- Cache cluster id per query and drop redundant locks on getBackendId hot path (#63636)
- Fix meta tool build on master (#63540)
- Skip stale tablet cache check for STOP_TOKEN (#63520)
- Preserve labels for histogram metrics to fix wrong metric name for prometheus (#63485)
- Normalize default HDFS paths in LocationPath (#63476)
- Truncate segment key bounds before storing segment stats (#63469)
- Fill schema change version holes before running (#63443)
- Disable dict encoding in row store columns (#63438)
- Remove pure attribute from assert_cast (#63417)
- Fix clear_file_cache right after reboot causing file cache size percent overflow (#63410)
- Verify DML and 3-replica create table when one BE is down (#63401)
- Fix pre-aggregation context leakage across join branches (#63357)
- Clean empty v3 cache dirs (#63344)
- Avoid concurrent tablet stat iteration failures (#63298)
- Allow SHOW TABLET without a selected database (#63280)
- Restore split-bound Java types when reading FE-persisted CDC offset (#63219)
- Support storage vault for clone instance (#63217)
- Delay overwrite partition routing until incremental open (#63209)
- Fix simple aggregate cache after partition recycle (#63175)
- Enhance OOM error message for statistics analyze tasks (#63172)
- Fix incorrect memory availability check in RowSourceBuffer during vertical compaction (#63152)
- Async chunk splitting for cdc source job (#63079)
- Replace std::mutex/std::lock_guard with annotated wrappers for thread safety analysis (#63070)
- Avoid NPE for force-finished publish task (#63069)
- Add SubQueue abstraction and thread-safety annotations to DataQueue (#62947)
- Skip wait for async rowset warmup (#62764)
- Cache version and get tablet stats actively for RestoreJob (#62704)
- Support storage vault for create/list snapshot (#62523)
- Infer null-reject from INNER JoinEdge for multi-hop outer join MV rewrite (#62492)
- Avoid crash when late holder cleanup sees removed cache cell (#62437)
- Fix PartitionRebalancer generating invalid moves to BEs without required storage medium (#62206)
- Add system rate limit for meta-service (#61516)
- Replace Tablet references with tabletId in CloudTabletRebalancer (#61233)
- Add async LRU update mechanism and fix partial hit in cache reader (#61083)
- Avoid duplicated FileCache counter accumulation in NewOlapScanner (#61072)
- Display partition cached version in SHOW PROC (#60807)
- Avoid false tablet diagnosis alarms in cloud mode (#60805)
Load
- Fix nullable date literal binding and date *_diff folding (#64127)
- Keep load row metrics monotonic for auto partition (#64109)
- Fix postgres cdc multi-table publication data loss and binlog duplicate key (#64075)
- Match STRUCT sub-fields by name when loading JSON (#64011)
- Change lzo download link (#63785)
- Fix postgres historical-date timestamp handling in cdc-client (#63618)
- Add per-job routine load metrics (#63576)
- Fix int overflow in BeIdComparator causing stream load failure (#63565)
- Add txn write amplification brpc metrics for sub txn load (#63545)
- Enforce explicit compute group form for workload DDLs (#63505)
- Support user-specified mysql server_id with per-reader assignment (#63490)
- Misc fixes for typo/log/validation/visibility (#63480)
- Avoid NPE on cross-table DML during snapshot chunk read (#63435)
- Keep isCanceled set when cancel runs on terminal task (#63427)
- Drop neighbour-table rows leaked by JDBC LIKE wildcards in JdbcPostgreSQLClient (#63402)
- Optimize row-store memtable flush memory in the row-store scenario (#63342)
- Fix broken pipe risk on stream load redirect with unconsumed request body (#63332)
- Pin esdk-obs-java-bundle to 3.21.11 to fix version range resolution failure (#63278)
- Add per-job lag metric to streaming insert jobs (#63194)
- Fix NPE in routine load Kafka meta request (#63180)
- Remove dead code across core types and utilities (#62994)
- CancelTaskById should not be blocked by unrelated streaming jobs (#62940)
- Recompute derived fields after replay and ALTER (#62936)
- Persist cdc_stream TVF offset across FE checkpoint (#62902)
- Avoid mutating shared variant columns (#64092)
- Select txn insert backend from current cluster (#63634)
- Fix host mismatch when starting FE in metadata_failure_recovery (#62748)
- Deduplicate pending one-shot warm up jobs (#62384)
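The BeIdComparator entry above mentions an int overflow. The notes do not describe the cause, so the sketch below only illustrates one common way comparators overflow: narrowing the difference of two 64-bit IDs to 32 bits. Whether this matches the actual bug is an assumption, and all names here are hypothetical. Python integers do not overflow, so `to_int32` simulates the narrowing explicitly.

```python
def to_int32(n: int) -> int:
    """Keep the low 32 bits of n and reinterpret them as a signed value."""
    n &= 0xFFFFFFFF
    return n - 0x1_0000_0000 if n >= 0x8000_0000 else n


def compare_by_subtraction(a: int, b: int) -> int:
    # Fragile pattern: the difference is narrowed to 32 bits and can flip sign.
    return to_int32(a - b)


def compare_safely(a: int, b: int) -> int:
    # Robust pattern: derive -1, 0, or 1 from direct comparisons.
    return (a > b) - (a < b)
```

With `a = 3_000_000_000` and `b = 1`, the subtraction-based comparator returns a negative number even though `a > b`, which breaks the ordering contract that sort routines rely on.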
Lakehouse
- Map Iceberg varint type to unsupported type (#64331)
- Fix Iceberg write error on special partitions (#64225)
- Fix failure to get the format type of migrated Iceberg tables (#64134)
- Use object store path for data location (#64028)
- Reject row-level DML on Iceberg COW tables (#63950)
- Fix incorrect COUNT DISTINCT result with Iceberg v3 row lineage (#63826)
- Fix DCHECK in LocalExchangeSharedState::sub_total_mem_usage (#63742)
- Add missing Iceberg field IDs for position delete files (#63483)
- Fill Hive meta cache when loading row count for queries (#63470)
- Fix iceberg sink writer with spill report error (#62899)
- Support IAM role for REST and S3Table (#60498)
Security & Authentication
- Bound length in MysqlProto.readLenEncodedString (#63604)
- Include HDFS connection in file handle cache key (#63516)
- Fix arrow flight client ip auth (#63506)
- Preserve narrowing datetimev2 casts in simplify in predicate (#63343)
- Fix core dump caused by arrow::Status inline static empty msg (#63191)
- Support multi-root auth plugin loading and normalize OIDC access token auth (#62159)
- Improve masking of user's password for ALTER USER and CREATE USER commands in audit logs (#62141)
- Improve LDAP authentication resiliency and diagnostics (#61673)
- Integrate authentication chain and simplify fallback config (#61362)
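The MysqlProto.readLenEncodedString entry above concerns bounding a length read from the wire. As a hedged illustration of the general idea, the Python sketch below parses a MySQL length-encoded string and rejects lengths that exceed a cap or the remaining buffer. It is not the Doris implementation. The function names and the `max_len` default are assumptions, while the prefix bytes (0xFC, 0xFD, 0xFE) follow the MySQL protocol's length-encoded integer format.

```python
def read_len_encoded_int(buf: bytes, pos: int):
    """Return (value, next_pos) for a MySQL length-encoded integer."""
    if pos >= len(buf):
        raise ValueError("buffer exhausted")
    first = buf[pos]
    if first < 0xFB:
        return first, pos + 1  # single-byte value
    widths = {0xFC: 2, 0xFD: 3, 0xFE: 8}
    if first not in widths:
        # 0xFB marks NULL and 0xFF is not a valid length prefix
        raise ValueError("not a length prefix")
    end = pos + 1 + widths[first]
    if end > len(buf):
        raise ValueError("truncated length")
    return int.from_bytes(buf[pos + 1:end], "little"), end


def read_len_encoded_string(buf: bytes, pos: int = 0, max_len: int = 1 << 20):
    """Return (payload, next_pos), refusing oversized or truncated strings."""
    length, pos = read_len_encoded_int(buf, pos)
    # Check the declared length before slicing or allocating anything.
    if length > max_len or length > len(buf) - pos:
        raise ValueError("declared length out of bounds")
    return buf[pos:pos + length], pos + length
```

Validating the declared length before using it means a client cannot trigger a huge allocation or an out-of-range read by sending a large length prefix with little or no payload.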