MAP | Complex Types

MAP<K, V> represents a MAP whose elements are of type K, V, and is commonly used to store semi-structured key-value pair data. See MAP data type for details.

This document focuses on how to write data into MAP-type columns, and provides end-to-end examples for two common data sources:

CSV format load: suitable when the upstream data is delimited text (watch out for conflicts between the column delimiter and commas inside the MAP).
JSON format load: suitable when the upstream data is a JSON array or JSON Lines.

Both load methods use the Stream Load interface. The general steps are:

Prepare the data file.
Create the table in the database.
Call Stream Load with curl to load the data.
Query the results to verify.

CSV format load

Step 1: Prepare the data

Create the following CSV file test_map.csv. The column delimiter uses | instead of a comma, so that it does not conflict with the commas inside the MAP.

{"Emily":101,"age":25}
{"Benjamin":102}
{}
null

The meaning of each line in the file is as follows:

Line	Meaning
`1\|{"Emily":101,"age":25}`	A MAP containing two key-value pairs
`2\|{"Benjamin":102}`	A MAP containing one key-value pair
`3\|{}`	An empty MAP
`4\|null`	A NULL value

Step 2: Create the table in the database

CREATE TABLE map_test (
    id       INT                 NOT NULL,
    c_map    MAP<STRING, INT>    NULL
)
DUPLICATE KEY(id)
DISTRIBUTED BY HASH(id) BUCKETS 1
PROPERTIES (
    "replication_allocation" = "tag.location.default: 1"
);

Step 3: Load the data

Load the CSV file with Stream Load. The meanings of the key headers are listed in the table below.

curl --location-trusted \
        -u "root":"" \
        -H "column_separator:|" \
        -H "columns: id, c_map" \
        -T "test_map.csv" \
        http://localhost:8040/api/testdb/map_test/_stream_load

Header	Purpose
`column_separator:	`
`columns: id, c_map`	Declares the mapping between file columns and table columns

Step 4: Verify the loaded data

mysql> SELECT * FROM map_test;
+------+-------------------------+
| id   | c_map                   |
+------+-------------------------+
|    1 | {"Emily":101, "age":25} |
|    2 | {"Benjamin":102}        |
|    3 | {}                      |
|    4 | NULL                    |
+------+-------------------------+
4 rows in set (0.01 sec)

JSON format load

Step 1: Prepare the data

Create the following JSON file test_map.json. The whole file is a JSON array, and each element corresponds to one row of data.

[
    {"id":1, "c_map":{"Emily":101, "age":25}},
    {"id":2, "c_map":{"Benjamin":102}},
    {"id":3, "c_map":{}},
    {"id":4, "c_map":null}
]

Step 2: Create the table in the database

CREATE TABLE map_test (
    id       INT                 NOT NULL,
    c_map    MAP<STRING, INT>    NULL
)
DUPLICATE KEY(id)
DISTRIBUTED BY HASH(id) BUCKETS 1
PROPERTIES (
    "replication_allocation" = "tag.location.default: 1"
);

Step 3: Load the data

Load the JSON file with Stream Load. The meanings of the key headers are listed in the table below.

curl --location-trusted \
        -u "root":"" \
        -H "format:json" \
        -H "columns: id, c_map" \
        -H "strip_outer_array:true" \
        -T "test_map.json" \
        http://localhost:8040/api/testdb/map_test/_stream_load

Header	Purpose
`format:json`	Declares the data format as JSON
`strip_outer_array:true`	When the input is a JSON array, parses each array element as one row
`columns: id, c_map`	Declares the mapping between JSON fields and table columns

Step 4: Verify the loaded data

mysql> SELECT * FROM map_test;
+------+-------------------------+
| id   | c_map                   |
+------+-------------------------+
|    1 | {"Emily":101, "age":25} |
|    2 | {"Benjamin":102}        |
|    3 | {}                      |
|    4 | NULL                    |
+------+-------------------------+
4 rows in set (0.01 sec)

FAQ

Q1: Why not use a comma as the column delimiter in CSV?

The MAP literal itself uses commas to separate multiple key-value pairs (for example, {"Emily":101,"age":25}). Using a comma as the column delimiter as well would cause parsing ambiguity. It is recommended to use characters that do not appear inside the MAP content, such as | or \t.

Q2: Why is strip_outer_array:true needed when loading a JSON array?

When the outermost layer of the JSON file is an array (as in the example in this document), you must set strip_outer_array:true so that Doris treats each element in the array as one row of data. If the file is already JSON Lines (one object per line), this parameter is not required.

Q3: How do you represent an empty MAP versus NULL?

Empty MAP: write as {} in CSV, and as {} in JSON.
NULL value: write as null in CSV, and as null in JSON. The target column must allow NULL.

CSV format load​

Step 1: Prepare the data​

Step 2: Create the table in the database​

Step 3: Load the data​

Step 4: Verify the loaded data​

JSON format load​

Step 1: Prepare the data​

Step 2: Create the table in the database​

Step 3: Load the data​

Step 4: Verify the loaded data​

FAQ​

CSV format load

Step 1: Prepare the data

Step 2: Create the table in the database

Step 3: Load the data

Step 4: Verify the loaded data

JSON format load

Step 1: Prepare the data

Step 2: Create the table in the database

Step 3: Load the data

Step 4: Verify the loaded data

FAQ