Continuous Load Overview
Overview
Doris supports continuously loading data from multiple data sources into Doris tables via Streaming Job. After submitting a Job, Doris continuously runs the import job, reading data from the source in real time and writing it into Doris tables.
Continuous Load supports the following data sources and import modes:
| Data Source | Single-table Import | Multi-table Import | Prerequisites |
|---|---|---|---|
| MySQL | MySQL Single-table | MySQL Multi-table | Amazon RDS MySQL · Amazon Aurora MySQL |
| PostgreSQL | PostgreSQL Single-table | PostgreSQL Multi-table | Amazon RDS PostgreSQL · Amazon Aurora PostgreSQL |
| S3 | S3 Continuous Load | - | - |
tip
- Single-table Import: Uses CDC Stream TVF or S3 TVF to continuously load data into a specific Doris table, supporting flexible column mapping and data transformation.
- Multi-table Import: Uses native multi-table CDC capability to continuously synchronize full and incremental data from multiple source tables into Doris, automatically creating downstream tables on first sync.
Common Operations
Check Import Status
select * from jobs("type"="insert") where ExecuteType = "STREAMING";
| Column | Description |
|---|---|
| ID | Job ID |
| NAME | Job name |
| Definer | Job definer |
| ExecuteType | Job type: ONE_TIME/RECURRING/STREAMING/MANUAL |
| RecurringStrategy | Recurring strategy, empty for Streaming |
| Status | Job status |
| ExecuteSql | Job's Insert SQL statement |
| CreateTime | Job creation time |
| SucceedTaskCount | Number of successful tasks |
| FailedTaskCount | Number of failed tasks |
| CanceledTaskCount | Number of canceled tasks |
| Comment | Job comment |
| Properties | Job properties |
| CurrentOffset | Current offset, only for Streaming jobs |
| EndOffset | Max end offset from source, only for Streaming jobs |
| LoadStatistic | Job statistics |
| ErrorMsg | Job error message |
| JobRuntimeMsg | Job runtime info |
Check Task Status
select * from tasks("type"="insert") where jobId='<job_id>';
| Column | Description |
|---|---|
| TaskId | Task ID |
| JobID | Job ID |
| JobName | Job name |
| Label | Task label |
| Status | Task status |
| ErrorMsg | Task error message |
| CreateTime | Task creation time |
| StartTime | Task start time |
| FinishTime | Task finish time |
| LoadStatistic | Task statistics |
| User | Task executor |
| RunningOffset | Current offset, only for Streaming jobs |
Pause Import Job
PAUSE JOB WHERE jobname = <job_name>;
Resume Import Job
RESUME JOB WHERE jobName = <job_name>;
Delete Import Job
DROP JOB WHERE jobName = <job_name>;
Common Parameters
FE Configuration Parameters
| Parameter | Default | Description |
|---|---|---|
| max_streaming_job_num | 1024 | Maximum number of Streaming jobs |
| job_streaming_task_exec_thread_num | 10 | Number of threads for StreamingTask |
| max_streaming_task_show_count | 100 | Max number of StreamingTask records in memory |
General Job Import Parameters
| Parameter | Default | Description |
|---|---|---|
| max_interval | 10s | Idle scheduling interval when no new data |