Skip to main content

MAP | Complex Types

MAP<K, V> represents a MAP whose elements are of type K, V, and is commonly used to store semi-structured key-value pair data. See MAP data type for details.

This document focuses on how to write data into MAP-type columns, and provides end-to-end examples for two common data sources:

  • CSV format load: suitable when the upstream data is delimited text (watch out for conflicts between the column delimiter and commas inside the MAP).
  • JSON format load: suitable when the upstream data is a JSON array or JSON Lines.

Both load methods use the Stream Load interface. The general steps are:

  1. Prepare the data file.
  2. Create the table in the database.
  3. Call Stream Load with curl to load the data.
  4. Query the results to verify.

CSV format load

Step 1: Prepare the data

Create the following CSV file test_map.csv. The column delimiter uses | instead of a comma, so that it does not conflict with the commas inside the MAP.

1|{"Emily":101,"age":25}
2|{"Benjamin":102}
3|{}
4|null

The meaning of each line in the file is as follows:

LineMeaning
1|{"Emily":101,"age":25}A MAP containing two key-value pairs
2|{"Benjamin":102}A MAP containing one key-value pair
3|{}An empty MAP
4|nullA NULL value

Step 2: Create the table in the database

CREATE TABLE map_test (
id INT NOT NULL,
c_map MAP<STRING, INT> NULL
)
DUPLICATE KEY(id)
DISTRIBUTED BY HASH(id) BUCKETS 1
PROPERTIES (
"replication_allocation" = "tag.location.default: 1"
);

Step 3: Load the data

Load the CSV file with Stream Load. The meanings of the key headers are listed in the table below.

curl --location-trusted \
-u "root":"" \
-H "column_separator:|" \
-H "columns: id, c_map" \
-T "test_map.csv" \
http://localhost:8040/api/testdb/map_test/_stream_load
HeaderPurpose
`column_separator:`
columns: id, c_mapDeclares the mapping between file columns and table columns

Step 4: Verify the loaded data

mysql> SELECT * FROM map_test;
+------+-------------------------+
| id | c_map |
+------+-------------------------+
| 1 | {"Emily":101, "age":25} |
| 2 | {"Benjamin":102} |
| 3 | {} |
| 4 | NULL |
+------+-------------------------+
4 rows in set (0.01 sec)

JSON format load

Step 1: Prepare the data

Create the following JSON file test_map.json. The whole file is a JSON array, and each element corresponds to one row of data.

[
{"id":1, "c_map":{"Emily":101, "age":25}},
{"id":2, "c_map":{"Benjamin":102}},
{"id":3, "c_map":{}},
{"id":4, "c_map":null}
]

Step 2: Create the table in the database

CREATE TABLE map_test (
id INT NOT NULL,
c_map MAP<STRING, INT> NULL
)
DUPLICATE KEY(id)
DISTRIBUTED BY HASH(id) BUCKETS 1
PROPERTIES (
"replication_allocation" = "tag.location.default: 1"
);

Step 3: Load the data

Load the JSON file with Stream Load. The meanings of the key headers are listed in the table below.

curl --location-trusted \
-u "root":"" \
-H "format:json" \
-H "columns: id, c_map" \
-H "strip_outer_array:true" \
-T "test_map.json" \
http://localhost:8040/api/testdb/map_test/_stream_load
HeaderPurpose
format:jsonDeclares the data format as JSON
strip_outer_array:trueWhen the input is a JSON array, parses each array element as one row
columns: id, c_mapDeclares the mapping between JSON fields and table columns

Step 4: Verify the loaded data

mysql> SELECT * FROM map_test;
+------+-------------------------+
| id | c_map |
+------+-------------------------+
| 1 | {"Emily":101, "age":25} |
| 2 | {"Benjamin":102} |
| 3 | {} |
| 4 | NULL |
+------+-------------------------+
4 rows in set (0.01 sec)

FAQ

Q1: Why not use a comma as the column delimiter in CSV?

The MAP literal itself uses commas to separate multiple key-value pairs (for example, {"Emily":101,"age":25}). Using a comma as the column delimiter as well would cause parsing ambiguity. It is recommended to use characters that do not appear inside the MAP content, such as | or \t.

Q2: Why is strip_outer_array:true needed when loading a JSON array?

When the outermost layer of the JSON file is an array (as in the example in this document), you must set strip_outer_array:true so that Doris treats each element in the array as one row of data. If the file is already JSON Lines (one object per line), this parameter is not required.

Q3: How do you represent an empty MAP versus NULL?

  • Empty MAP: write as {} in CSV, and as {} in JSON.
  • NULL value: write as null in CSV, and as null in JSON. The target column must allow NULL.