The Memory Tracker records the memory usage of the Doris BE process, including the memory used in the life cycle of tasks such as query, import, Compaction, and Schema Change, as well as various caches for memory control and analysis.
Each query, import and other tasks in the system will create its own Memory Tracker when it is initialized, and put the Memory Tracker into TLS (Thread Local Storage) during execution, and each memory application and release of the BE process will be in the Mem Hook Consume the Memory Tracker in the middle, and display it after the final summary.
For detailed design and implementation, please refer to: https://cwiki.apache.org/confluence/display/DORIS/DSIP-002%3A+Refactor+memory+tracker+on+BE https://shimo.im/docs/DT6JXDRkdTvdyV3G
The real-time memory statistics results can be viewed through Doris BE's Web page http://ip:http_port/mem_tracker.
For the memory statistics results of historical queries, you can view the
peakMemoryBytes of each query in
fe/log/fe.audit.log, or search
Deregister query/load memory tracker, queryId in
be/log/be.INFO `View memory peaks per query on a single BE.
- Type: Divide the memory used by Doris BE into the following categories
- process: The total memory of the process, the sum of all other types.
- global: Global Memory Tracker with the same life cycle and process, such as each Cache, Tablet Manager, Storage Engine, etc.
- query: the in-memory sum of all queries.
- load: Sum of all imported memory.
- tc/jemalloc_cache: The general memory allocator TCMalloc or Jemalloc cache, you can view the original profile of the memory allocator in real time at http://ip:http_port/memz.
- compaction, schema_change, consistency, batch_load, clone: corresponding to the memory sum of all Compaction, Schema Change, Consistency, Batch Load, and Clone tasks respectively.
- Current Consumption(Bytes): current memory value, unit B.
- Current Consumption(Normalize): .G.M.K formatted output of the current memory value.
- Peak Consumption (Bytes): The peak value of the memory after the BE process is started, the unit is B, and it will be reset after the BE restarts.
- Peak Consumption(Normalize): The .G.M.K formatted output of the memory peak value after the BE process starts, and resets after the BE restarts.
- Label: Memory Tracker name
- Parent Label: It is used to indicate the parent-child relationship between two Memory Trackers. The memory recorded by the Child Tracker is a subset of the Parent Tracker. There may be intersections between the memories recorded by different Trackers with the same Parent.
Orphan: Tracker consumed by default. Memory that does not specify a tracker will be recorded in Orphan by default. In addition to the Child Tracker subdivided below, Orphan also includes some memory that is inconvenient to accurately subdivide and count, including BRPC.
- LoadChannelMgr: The sum of the memory of all imported Load Channel stages, used to write the scanned data to the Segment file on disk, a subset of Orphan.
- StorageEngine:, the memory consumed by the storage engine during loading the data directory, a subset of Orphan.
- SegCompaction: The memory sum of all SegCompaction tasks, a subset of Orphan.
- SegmentMeta: memory use by segment meta data such as footer or index page, a subset of Orphan.
- TabletManager: The memory consumed by the storage engine get, add, and delte Tablet, a subset of Orphan.
- BufferAllocator: Only used for memory multiplexing in the non-vectorized Partitioned Agg process, a subset of Orphan.
DataPageCache: Used to cache data Pages to speed up Scan.
IndexPageCache: The index used to cache the data Page, used to speed up Scan.
SegmentCache: Used to cache opened Segments, such as index information.
DiskIO: Used to cache Disk IO data, only used in non-vectorization.
ChunkAllocator: Used to cache power-of-2 memory blocks, and reuse memory at the application layer.
LastestSuccessChannelCache: Used to cache the LoadChannel of the import receiver.
DeleteBitmap AggCache: Gets aggregated delete_bitmap on rowset_id and version.
- Limit: The upper limit of memory used by a single query,
show session variablesto view and modify
- Label: The label naming rule of the Tracker for a single query is
- Parent Label: Parent is the Tracker record of
Query#Id=xxxto query the memory used by different operators during execution.
- Limit: The import is divided into two stages: Fragment Scan and Load Channel to write Segment to disk. The upper memory limit of the Scan phase can be viewed and modified through
show session variables; the segment write disk phase does not have a separate memory upper limit for each import, but the total upper limit of all imports, corresponding to
- Label: The label naming rule of a single import Scan stage Tracker is
Load#Id=xxx; the Label naming rule of a single import Segment write disk stage Tracker is
- Parent Label: Parent is the Tracker of
Load#Id=xxx, which records the memory used by different operators during the import Scan stage; Parent is the Tracker of
LoadChannelMgrTrackerSet, which records the Insert and The memory used by the Flush disk process is associated with the last
loadIDof the Label to write to the disk stage Tracker of the Segment.