Skip to main content

local file

Doris provides multiple ways to load data from local sources:

1. Stream Load

Load local files or data streams into Doris via HTTP protocol. Supports CSV, JSON, Parquet, and ORC formats. For more information, refer to the Stream Load documentation.

2. Streamloader Tool

The Streamloader tool is a dedicated client tool for loading data into the Doris database, based on Stream Load. It can provide multi-file and concurrent load capabilities, reducing the time required for loading large volumes of data. For more documentation, refer to Streamloader.

3. MySQL Load

Doris is compatible with MySQL protocol and supports using the standard LOAD DATA syntax to load local files, suitable for loading CSV files.

Using Stream Load to Load Data

Step 1: Prepare Data

Create a CSV file named streamload_example.csv with the following content:

1,Emily,25
2,Benjamin,35
3,Olivia,28
4,Alexander,60
5,Ava,17
6,William,69
7,Sophia,32
8,James,64
9,Emma,37
10,Liam,64

Step 2: Create Table

Create a table in Doris with the following syntax:

CREATE TABLE testdb.test_streamload(
user_id BIGINT NOT NULL COMMENT "User ID",
name VARCHAR(20) COMMENT "User Name",
age INT COMMENT "User Age"
)
DUPLICATE KEY(user_id)
DISTRIBUTED BY HASH(user_id) BUCKETS 10;

Step 3: Load Data

Submit a Stream Load job using curl:

curl --location-trusted -u <doris_user>:<doris_password> \
-H "column_separator:," \
-H "columns:user_id,name,age" \
-T streamload_example.csv \
-XPUT http://<fe_ip>:<fe_http_port>/api/testdb/test_streamload/_stream_load

Example of load result:

{
"TxnId": 3,
"Status": "Success",
"NumberTotalRows": 10,
"NumberLoadedRows": 10
}

Step 4: Check Loaded Data

mysql> SELECT COUNT(*) FROM testdb.test_streamload;
+----------+
| count(*) |
+----------+
| 10 |
+----------+

Using Streamloader Tool to Load Data

Step 1: Prepare Data

Create a CSV file named streamloader_example.csv with the same content as above.

Step 2: Create Table

Create the table in Doris with the same syntax as above.

Step 3: Load Data

Use the Streamloader tool to load data:

doris-streamloader --source_file="streamloader_example.csv" --url="http://localhost:8330" --header="column_separator:," --db="testdb" --table="test_streamload"

Example of load result:

Load Result: {
"Status": "Success",
"TotalRows": 10,
"LoadedRows": 10
}

Step 4: Check Loaded Data

mysql> SELECT COUNT(*) FROM testdb.test_streamload;
+----------+
| count(*) |
+----------+
| 10 |
+----------+

Using MySQL Load to Load Data

Step 1: Prepare Data

Create a file named client_local.csv with the following sample data:

1,10
2,20
3,30
4,40
5,50
6,60

Step 2: Connect to MySQL Client

mysql --local-infile -h <fe_ip> -P <fe_query_port> -u root -D testdb

Step 3: Load Data

Execute the MySQL Load command:

LOAD DATA LOCAL
INFILE 'client_local.csv'
INTO TABLE testdb.t1
COLUMNS TERMINATED BY ','
LINES TERMINATED BY '\n';

Step 4: Check Loaded Data

If the load is successful, the result will be displayed as follows:

Query OK, 6 row affected (0.17 sec)
Records: 6 Deleted: 0 Skipped: 0 Warnings: 0