Alibaba Cloud MaxCompute

MaxCompute is an enterprise-level SaaS (Software as a Service) cloud data warehouse on Alibaba Cloud.

For an introduction, see What is MaxCompute in the Alibaba Cloud documentation.

Connect to MaxCompute

Example

-- 1. Create Catalog.
CREATE CATALOG mc PROPERTIES (
"type" = "max_compute",
"mc.default.project" = "xxx",
"mc.access_key" = "xxxx",
"mc.secret_key" = "xxx",
"mc.endpoint" = "http://service.cn-beijing-vpc.MaxCompute.aliyun-inc.com/api"
);

-- 2. Switch to the newly created Catalog.
SWITCH mc;

-- The following steps work the same way as in MySQL.

-- 3. View all databases under this Catalog.
SHOW DATABASES;

-- 4. Use a database. Here, xxx is any database from the results shown in step 3.
USE xxx;

-- 5. View all tables under this database.
SHOW TABLES;

-- 6. Perform SQL queries.
SELECT * FROM tb LIMIT 10;

Basic properties of creating Catalog

| Parameter | Description |
| --- | --- |
| type | Fixed as max_compute. |
| mc.default.project | The name of the MaxCompute project you want to access. You can create and manage projects in the MaxCompute project list. |
| mc.access_key | AccessKey. You can create and manage it in the Alibaba Cloud console. |
| mc.secret_key | SecretKey. You can create and manage it in the Alibaba Cloud console. |
| mc.endpoint | The endpoint of the region where MaxCompute is enabled. Please refer to How to obtain Endpoint and Quota below for configuration. |

Optional properties of creating Catalog

| Parameter | Default value | Description |
| --- | --- | --- |
| mc.quota | pay-as-you-go | Quota name. Please refer to How to obtain Endpoint and Quota below for configuration. |
| mc.split_strategy | byte_size | How splits are divided. Set it to byte_size to divide by byte size, or to row_count to divide by row count. |
| mc.split_byte_size | 268435456 | The file size read by each split, in bytes. The default is 256 MB. It takes effect only when "mc.split_strategy" = "byte_size". |
| mc.split_row_count | 1048576 | The number of rows read by each split. It takes effect only when "mc.split_strategy" = "row_count". |
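
As a rough illustration of how the optional properties combine with the basic ones, the statement below sketches a catalog that divides splits by row count instead of by byte size. The catalog name mc_by_rows is hypothetical, the xxx values are placeholders as in the example above, and the row count is simply the documented default.

```sql
-- Sketch only: mc_by_rows and the xxx values are placeholders, not real settings.
-- "mc.split_strategy" switches from the default byte_size to row_count, and
-- "mc.split_row_count" takes effect only under the row_count strategy.
CREATE CATALOG mc_by_rows PROPERTIES (
    "type" = "max_compute",
    "mc.default.project" = "xxx",
    "mc.access_key" = "xxxx",
    "mc.secret_key" = "xxx",
    "mc.endpoint" = "http://service.cn-beijing-vpc.MaxCompute.aliyun-inc.com/api",
    "mc.split_strategy" = "row_count",
    "mc.split_row_count" = "1048576"
);
```

With this configuration, mc.split_byte_size would be ignored, because it applies only to the byte_size strategy.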

Column type mapping

| MaxCompute | Doris | Remarks |
| --- | --- | --- |
| TINYINT | TINYINT | |
| SMALLINT | SMALLINT | |
| INT | INT | |
| BIGINT | BIGINT | |
| BINARY | Not supported | |
| FLOAT | FLOAT | |
| DOUBLE | DOUBLE | |
| DECIMAL(precision,scale) | DECIMAL(precision,scale) | |
| VARCHAR(n) | VARCHAR(n) | |
| CHAR(n) | CHAR(n) | |
| STRING | STRING | |
| DATE | DATE | |
| DATETIME | DATETIME(3) | You can specify the time zone by SET [global] time_zone = 'Asia/Shanghai'. |
| TIMESTAMP | Not supported | |
| TIMESTAMP_NTZ | DATETIME(6) | The precision of TIMESTAMP_NTZ in MaxCompute is 9. The maximum precision of DATETIME in Doris is only 6. Therefore, when reading data, the extra parts will be directly truncated. |
| BOOLEAN | BOOLEAN | |
| ARRAY | ARRAY | |
| MAP | MAP | |
| STRUCT | STRUCT | |
| JSON | Not supported | |
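
To check how a particular table's columns were mapped, and to apply the time zone setting mentioned in the DATETIME remark, something like the following can be run after switching to the catalog. This is a minimal sketch: tb is the placeholder table from the example above, and event_time is a hypothetical DATETIME column.

```sql
-- Sketch only: tb and event_time are placeholder names.
-- Show the Doris column types that the MaxCompute columns were mapped to.
DESC tb;

-- Set the session time zone before reading DATETIME values.
SET time_zone = 'Asia/Shanghai';
SELECT event_time FROM tb LIMIT 10;
```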

Usage notes

  1. Since version 2.1.7, the MaxCompute Catalog is built on the Open Storage SDK.
  2. The Open Storage SDK has certain limitations. Please refer to the Usage limitations section in this document.
  3. A Project in MaxCompute is equivalent to a Database in Doris.
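
Because a MaxCompute project is treated as a Doris database, a table can presumably be addressed with the usual catalog.database.table form without switching catalogs first. In the sketch below, mc is the catalog from the example above, while my_project and tb are hypothetical project and table names.

```sql
-- Sketch only: my_project and tb are placeholder names.
-- The MaxCompute project name takes the place of the database name.
SELECT * FROM mc.my_project.tb LIMIT 10;
```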

How to obtain Endpoint and Quota

  1. If you use a dedicated resource group of the data transmission service, please refer to the Use exclusive data service resource groups section in this document, and enable the corresponding permissions in step 2, Authorization. In the Quota management list, view and copy the corresponding QuotaName, and specify "mc.quota" = "QuotaName". In this case, you can access MaxCompute through either a VPC or the public network. VPC bandwidth is guaranteed, whereas public network bandwidth is limited.

  2. If you use pay-as-you-go, please refer to the Using open storage (pay-as-you-go) section in this document to turn on the open storage (Storage API) switch and grant permissions to the user that the AK and SK belong to. In this case, mc.quota takes its default value, pay-as-you-go, and you do not need to set it explicitly. With pay-as-you-go, you can access MaxCompute only through a VPC.

  3. After completing step 1 or 2, you know which network you will use to access MaxCompute. Next, configure mc.endpoint according to the Endpoints in different regions tables in the Alibaba Cloud Endpoints document. For access through a VPC, use the VPC endpoint column in Endpoints in different regions (VPC). For access through the public network, use either the Classic network endpoint column in Endpoints in different regions (internal network for connecting cloud products) or the Public endpoint column in Endpoints in different regions (Internet).
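
Putting the dedicated resource group case together, a catalog definition might look like the sketch below. The catalog name mc_dedicated and the quota name my_quota_name are hypothetical; the latter stands in for the QuotaName copied from the Quota management list. The endpoint is the VPC endpoint from the example earlier in this page and should be replaced with the one for your own region and network.

```sql
-- Sketch only: mc_dedicated, my_quota_name, and the xxx values are placeholders.
-- "mc.quota" is needed here because a dedicated resource group is used;
-- with pay-as-you-go it can be omitted and the default applies.
CREATE CATALOG mc_dedicated PROPERTIES (
    "type" = "max_compute",
    "mc.default.project" = "xxx",
    "mc.access_key" = "xxxx",
    "mc.secret_key" = "xxx",
    "mc.quota" = "my_quota_name",
    "mc.endpoint" = "http://service.cn-beijing-vpc.MaxCompute.aliyun-inc.com/api"
);
```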