Import transactions and atomicity
All import operations in Doris are atomic: the data in an import job either all succeeds or all fails. An import never succeeds for only part of the data.
With BROKER LOAD, multiple tables can also be imported atomically in a single job.
For materialized views attached to a table, atomicity and consistency with the base table are also guaranteed.
Label mechanism
An import job in Doris can be given a Label. The Label is usually a user-defined string that carries some business meaning.
The main purpose of a Label is to uniquely identify an import task and to ensure that the same Label is imported successfully only once.
The Label mechanism ensures that imported data is neither lost nor duplicated. If the upstream data source guarantees At-Least-Once semantics, then combining it with the Doris Label mechanism yields Exactly-Once semantics.
A Label is unique within a database. Labels are retained for 3 days by default: after 3 days, the Label of a completed job is cleaned up automatically and can then be reused.
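The interaction between Labels, retries, and the retention period can be illustrated with a small in-memory model. This is only a sketch of the idea, not Doris code: the class LabelRegistry, its load method, and the now parameter are hypothetical names introduced for illustration, and the model assumes that only successfully completed jobs occupy a Label.

```python
import time


class LabelRegistry:
    """Toy model of per-database Label bookkeeping (hypothetical, not Doris code)."""

    def __init__(self, retention_seconds=3 * 24 * 3600):
        # Default retention mirrors the 3-day default described in the text.
        self.retention_seconds = retention_seconds
        self._finished = {}  # label -> completion timestamp
        self.rows = []       # stands in for the table contents

    def load(self, label, batch, now=None):
        """Import `batch` under `label`; return False if the Label was already used."""
        now = time.time() if now is None else now
        # Forget Labels whose retention period has expired, so they can be reused.
        self._finished = {
            used: done_at
            for used, done_at in self._finished.items()
            if now - done_at < self.retention_seconds
        }
        if label in self._finished:
            return False  # duplicate delivery: nothing is imported again
        self.rows.extend(batch)  # the whole batch is applied in one step
        self._finished[label] = now
        return True
```

In this model, an upstream source that delivers the same batch more than once (At-Least-Once) still leaves exactly one copy in rows, as long as every redelivery reuses the same Label within the retention period.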
Best Practices
Labels are usually formatted as business logic + time, such as my_business1_20220330_125000.
This Label usually represents a batch of data generated by the business my_business1 at 2022-03-30 12:50:00. With this convention, the business can query the import task status by Label to find out whether the batch for that point in time has been imported successfully. If it has not, the import can be retried with the same Label.
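The naming convention and the retry pattern above can be sketched as follows. The helper names make_label and load_with_retry are hypothetical, and submit stands for whatever function your application uses to start an import and report success; it is an assumption of this sketch, not a Doris API.

```python
from datetime import datetime


def make_label(business, batch_time):
    """Build a Label of the form <business>_<YYYYMMDD>_<HHMMSS>."""
    return f"{business}_{batch_time:%Y%m%d_%H%M%S}"


def load_with_retry(submit, label, max_attempts=3):
    """Call submit(label) until it returns True, reusing the same Label each time.

    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        if submit(label):
            return attempt
    raise RuntimeError(f"import {label} failed after {max_attempts} attempts")
```

For example, make_label("my_business1", datetime(2022, 3, 30, 12, 50, 0)) produces my_business1_20220330_125000. Because every retry reuses that Label, a batch that actually succeeded on an earlier attempt is not imported a second time.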