
Continuous Load Overview

Overview

Doris can continuously load data from multiple data sources into Doris tables through a Streaming Job. Once a job is submitted, Doris keeps it running, reading data from the source in real time and writing it into Doris tables.

Continuous Load supports the following data sources and import modes:

| Data Source | Single-table Import | Multi-table Import | Prerequisites |
| --- | --- | --- | --- |
| MySQL | MySQL Single-table | MySQL Multi-table | Amazon RDS MySQL · Amazon Aurora MySQL |
| PostgreSQL | PostgreSQL Single-table | PostgreSQL Multi-table | Amazon RDS PostgreSQL · Amazon Aurora PostgreSQL |
| S3 | S3 Continuous Load | - | - |

Tip:
  • Single-table Import: Uses CDC Stream TVF or S3 TVF to continuously load data into a specific Doris table, supporting flexible column mapping and data transformation.
  • Multi-table Import: Uses native multi-table CDC capability to continuously synchronize full and incremental data from multiple source tables into Doris, automatically creating downstream tables on first sync.

Common Operations

Check Import Status

select * from jobs("type"="insert") where ExecuteType = "STREAMING";

| Column | Description |
| --- | --- |
| ID | Job ID |
| NAME | Job name |
| Definer | Job definer |
| ExecuteType | Job type: ONE_TIME/RECURRING/STREAMING/MANUAL |
| RecurringStrategy | Recurring strategy, empty for Streaming |
| Status | Job status |
| ExecuteSql | Job's Insert SQL statement |
| CreateTime | Job creation time |
| SucceedTaskCount | Number of successful tasks |
| FailedTaskCount | Number of failed tasks |
| CanceledTaskCount | Number of canceled tasks |
| Comment | Job comment |
| Properties | Job properties |
| CurrentOffset | Current offset, only for Streaming jobs |
| EndOffset | Max end offset from source, only for Streaming jobs |
| LoadStatistic | Job statistics |
| ErrorMsg | Job error message |
| JobRuntimeMsg | Job runtime info |
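
When only progress and errors matter, the wide result can be narrowed to a few of the columns listed above. The following is a minimal sketch; it assumes the columns can be selected by the names shown in the table, which may differ slightly between versions.

```sql
-- Show progress and error details for Streaming jobs only.
-- Column names are taken from the table above; adjust them if your
-- version reports different names.
select ID, NAME, Status, CurrentOffset, EndOffset, ErrorMsg
from jobs("type"="insert")
where ExecuteType = "STREAMING";
```

Comparing CurrentOffset with EndOffset can give a rough indication of whether a job has caught up with its source.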

Check Task Status

select * from tasks("type"="insert") where jobId='<job_id>';

| Column | Description |
| --- | --- |
| TaskId | Task ID |
| JobID | Job ID |
| JobName | Job name |
| Label | Task label |
| Status | Task status |
| ErrorMsg | Task error message |
| CreateTime | Task creation time |
| StartTime | Task start time |
| FinishTime | Task finish time |
| LoadStatistic | Task statistics |
| User | Task executor |
| RunningOffset | Current offset, only for Streaming jobs |
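
For troubleshooting, the most recent tasks of a job are usually the ones of interest. The sketch below assumes that the tasks function accepts ordinary ORDER BY and LIMIT clauses and that the column names match the table above; `<job_id>` remains a placeholder.

```sql
-- List the ten most recently created tasks of one job, newest first.
-- '<job_id>' is a placeholder, as in the query above.
select TaskId, Status, ErrorMsg, StartTime, FinishTime, RunningOffset
from tasks("type"="insert")
where jobId = '<job_id>'
order by CreateTime desc
limit 10;
```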

Pause Import Job

PAUSE JOB WHERE jobName = <job_name>;

Resume Import Job

RESUME JOB WHERE jobName = <job_name>;

Delete Import Job

DROP JOB WHERE jobName = <job_name>;
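
The statements above can be combined into a short maintenance sequence. This is a sketch under two assumptions: `my_streaming_job` is a hypothetical job name, and the name is passed as a quoted string literal, mirroring how `<job_id>` is quoted in the task query.

```sql
-- 'my_streaming_job' is a hypothetical job name used only for illustration.
PAUSE JOB WHERE jobName = 'my_streaming_job';

-- Confirm the state change before working on the source or target.
select NAME, Status
from jobs("type"="insert")
where ExecuteType = "STREAMING";

-- Continue loading once the maintenance is done.
RESUME JOB WHERE jobName = 'my_streaming_job';
```

DROP JOB is left out of the sequence because, unlike pausing, it removes the job.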

Common Parameters

FE Configuration Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| max_streaming_job_num | 1024 | Maximum number of Streaming jobs |
| job_streaming_task_exec_thread_num | 10 | Number of threads for StreamingTask |
| max_streaming_task_show_count | 100 | Max number of StreamingTask records in memory |
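
To see the values currently in effect, the FE configuration can be listed from a SQL client. This sketch assumes the three parameters appear in the standard FE configuration listing; the LIKE pattern is only a convenient filter on their names.

```sql
-- List FE configuration items whose names contain "streaming".
ADMIN SHOW FRONTEND CONFIG LIKE '%streaming%';
```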

General Job Import Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| max_interval | 10s | Idle scheduling interval when no new data |
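
As an illustration of where a job-level parameter might be set, here is a hedged sketch of an S3 single-table job. The `CREATE JOB ... ON STREAMING` form and the `PROPERTIES` clause are assumptions not shown on this page, and the job name, target table, bucket path, and connection values are all hypothetical; check the data-source page for the exact syntax before relying on it.

```sql
-- Sketch only: the CREATE JOB ... ON STREAMING syntax is assumed here,
-- and every name and value below is a placeholder.
CREATE JOB my_streaming_job
ON STREAMING
PROPERTIES (
    "max_interval" = "10s"  -- idle scheduling interval; 10s is the listed default
)
DO
INSERT INTO db1.tbl1
SELECT * FROM S3(
    "uri" = "s3://my-bucket/demo/*.csv",
    "format" = "csv",
    "s3.endpoint" = "<endpoint>",
    "s3.region" = "<region>",
    "s3.access_key" = "<access_key>",
    "s3.secret_key" = "<secret_key>"
);
```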