CREATE SYNC JOB
Descriptionβ
The data synchronization (Sync Job) function supports users to submit a resident data synchronization job, and incrementally synchronizes the CDC (Change Data Capture) of the user's data update operation in the Mysql database by reading the Binlog log from the specified remote address. Features.
Currently, the data synchronization job only supports connecting to Canal, obtaining the parsed Binlog data from the Canal Server and importing it into Doris.
Users can view the data synchronization job status through SHOW SYNC JOB.
grammar:
CREATE SYNC [db.]job_name
(
channel_desc,
channel_desc
...
)
binlog_desc
-
job_name
The synchronization job name is the unique identifier of the job in the current database. Only one job with the same
job_name
can be running. -
channel_desc
The data channel under the job is used to describe the mapping relationship between the mysql source table and the doris target table.
grammar:
FROM mysql_db.src_tbl INTO des_tbl
[columns_mapping]-
mysql_db.src_tbl
Specify the database and source table on the mysql side.
-
des_tbl
Specify the target table on the doris side. Only unique tables are supported, and the batch delete function of the table needs to be enabled (see the 'batch delete function' of help alter table for how to enable it).
-
column_mapping
Specifies the mapping relationship between the columns of the mysql source table and the doris target table. If not specified, FE will default the columns of the source table and the target table to one-to-one correspondence in order.
The form col_name = expr is not supported for columns.
Example:
Suppose the target table column is (k1, k2, v1),
Change the order of columns k1 and k2
(k2, k1, v1)
Ignore the fourth column of the source data
(k2, k1, v1, dummy_column)
-
-
binlog_desc
Used to describe the remote data source, currently only one canal is supported.
grammar:
FROM BINLOG
(
"key1" = "value1",
"key2" = "value2"
)-
The properties corresponding to the Canal data source, prefixed with
canal.
- canal.server.ip: address of canal server
- canal.server.port: the port of the canal server
- canal.destination: the identity of the instance
- canal.batchSize: The maximum batch size obtained, the default is 8192
- canal.username: username of instance
- canal.password: the password of the instance
- canal.debug: optional, when set to true, the batch and details of each row of data will be printed out
-
Exampleβ
-
Simply create a data synchronization job named
job1
fortest_tbl
oftest_db
, connect to the local Canal server, corresponding to the Mysql source tablemysql_db1.tbl1
.CREATE SYNC `test_db`.`job1`
(
FROM `mysql_db1`.`tbl1` INTO `test_tbl`
)
FROM BINLOG
(
"type" = "canal",
"canal.server.ip" = "127.0.0.1",
"canal.server.port" = "11111",
"canal.destination" = "example",
"canal.username" = "",
"canal.password" = ""
); -
Create a data synchronization job named
job1
for multiple tables oftest_db
, corresponding to multiple Mysql source tables one-to-one, and explicitly specify the column mapping.CREATE SYNC `test_db`.`job1`
(
FROM `mysql_db`.`t1` INTO `test1` (k1, k2, v1) ,
FROM `mysql_db`.`t2` INTO `test2` (k3, k4, v2)
)
FROM BINLOG
(
"type" = "canal",
"canal.server.ip" = "xx.xxx.xxx.xx",
"canal.server.port" = "12111",
"canal.destination" = "example",
"canal.username" = "username",
"canal.password" = "password"
);
Keywordsβ
CREATE, SYNC, JOB