MAP | Complex Types
MAP<K, V> represents a MAP whose elements are of type K, V, and is commonly used to store semi-structured key-value pair data. See MAP data type for details.
This document focuses on how to write data into MAP-type columns, and provides end-to-end examples for two common data sources:
- CSV format load: suitable when the upstream data is delimited text (watch out for conflicts between the column delimiter and commas inside the MAP).
- JSON format load: suitable when the upstream data is a JSON array or JSON Lines.
Both load methods use the Stream Load interface. The general steps are:
- Prepare the data file.
- Create the table in the database.
- Call Stream Load with
curlto load the data. - Query the results to verify.
CSV format load
Step 1: Prepare the data
Create the following CSV file test_map.csv. The column delimiter uses | instead of a comma, so that it does not conflict with the commas inside the MAP.
1|{"Emily":101,"age":25}
2|{"Benjamin":102}
3|{}
4|null
The meaning of each line in the file is as follows:
| Line | Meaning |
|---|---|
1|{"Emily":101,"age":25} | A MAP containing two key-value pairs |
2|{"Benjamin":102} | A MAP containing one key-value pair |
3|{} | An empty MAP |
4|null | A NULL value |
Step 2: Create the table in the database
CREATE TABLE map_test (
id INT NOT NULL,
c_map MAP<STRING, INT> NULL
)
DUPLICATE KEY(id)
DISTRIBUTED BY HASH(id) BUCKETS 1
PROPERTIES (
"replication_allocation" = "tag.location.default: 1"
);
Step 3: Load the data
Load the CSV file with Stream Load. The meanings of the key headers are listed in the table below.
curl --location-trusted \
-u "root":"" \
-H "column_separator:|" \
-H "columns: id, c_map" \
-T "test_map.csv" \
http://localhost:8040/api/testdb/map_test/_stream_load
| Header | Purpose |
|---|---|
| `column_separator: | ` |
columns: id, c_map | Declares the mapping between file columns and table columns |
Step 4: Verify the loaded data
mysql> SELECT * FROM map_test;
+------+-------------------------+
| id | c_map |
+------+-------------------------+
| 1 | {"Emily":101, "age":25} |
| 2 | {"Benjamin":102} |
| 3 | {} |
| 4 | NULL |
+------+-------------------------+
4 rows in set (0.01 sec)
JSON format load
Step 1: Prepare the data
Create the following JSON file test_map.json. The whole file is a JSON array, and each element corresponds to one row of data.
[
{"id":1, "c_map":{"Emily":101, "age":25}},
{"id":2, "c_map":{"Benjamin":102}},
{"id":3, "c_map":{}},
{"id":4, "c_map":null}
]
Step 2: Create the table in the database
CREATE TABLE map_test (
id INT NOT NULL,
c_map MAP<STRING, INT> NULL
)
DUPLICATE KEY(id)
DISTRIBUTED BY HASH(id) BUCKETS 1
PROPERTIES (
"replication_allocation" = "tag.location.default: 1"
);
Step 3: Load the data
Load the JSON file with Stream Load. The meanings of the key headers are listed in the table below.
curl --location-trusted \
-u "root":"" \
-H "format:json" \
-H "columns: id, c_map" \
-H "strip_outer_array:true" \
-T "test_map.json" \
http://localhost:8040/api/testdb/map_test/_stream_load
| Header | Purpose |
|---|---|
format:json | Declares the data format as JSON |
strip_outer_array:true | When the input is a JSON array, parses each array element as one row |
columns: id, c_map | Declares the mapping between JSON fields and table columns |
Step 4: Verify the loaded data
mysql> SELECT * FROM map_test;
+------+-------------------------+
| id | c_map |
+------+-------------------------+
| 1 | {"Emily":101, "age":25} |
| 2 | {"Benjamin":102} |
| 3 | {} |
| 4 | NULL |
+------+-------------------------+
4 rows in set (0.01 sec)
FAQ
Q1: Why not use a comma as the column delimiter in CSV?
The MAP literal itself uses commas to separate multiple key-value pairs (for example, {"Emily":101,"age":25}). Using a comma as the column delimiter as well would cause parsing ambiguity. It is recommended to use characters that do not appear inside the MAP content, such as | or \t.
Q2: Why is strip_outer_array:true needed when loading a JSON array?
When the outermost layer of the JSON file is an array (as in the example in this document), you must set strip_outer_array:true so that Doris treats each element in the array as one row of data. If the file is already JSON Lines (one object per line), this parameter is not required.
Q3: How do you represent an empty MAP versus NULL?
- Empty MAP: write as
{}in CSV, and as{}in JSON. - NULL value: write as
nullin CSV, and asnullin JSON. The target column must allowNULL.