Deploy Doris Cluster
Deploying a compute-storage decoupled Doris cluster on Kubernetes involves four main steps:
- Preparation, which primarily means installing a FoundationDB cluster.
- Deploying the Doris Operator.
- Deploying the compute-storage decoupled cluster.
- Creating the Storage Backend.
Step 1: Preparation
Before deploying a decoupled cluster on Kubernetes, it is essential to have FoundationDB deployed in advance.
- (Preferred) Direct Deployment on Machines: Ensure that the machine where FoundationDB is installed is accessible by services running within the Kubernetes cluster. For direct machine deployments, please refer to the Preparation Phase in the decoupled deployment documentation.
- Deployment on Kubernetes: For deploying FoundationDB on Kubernetes, please refer to Deploying FoundationDB on Kubernetes.
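For a direct machine deployment, it can help to list the endpoints that must be reachable from inside the Kubernetes cluster. The sketch below reads a FoundationDB cluster file (by default /etc/foundationdb/fdb.cluster, whose content has the form description:ID@host:port,host:port) and prints the connection string and its coordinators. The function names fdb_connection_string and fdb_coordinators are hypothetical helpers, not part of FoundationDB or Doris.

```shell
# Hypothetical helper: print the connection string from a cluster file,
# skipping blank lines and '#' comment lines.
fdb_connection_string() {
  local cluster_file="${1:-/etc/foundationdb/fdb.cluster}"
  grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$cluster_file" | head -n 1
}

# Hypothetical helper: print one coordinator (host:port) per line.
fdb_coordinators() {
  local conn
  conn="$(fdb_connection_string "$1")"
  # Everything after '@' is a comma-separated coordinator list.
  printf '%s\n' "${conn#*@}" | tr ',' '\n'
}
```

Running fdb_coordinators on the FoundationDB host prints the addresses that Pods in the Kubernetes cluster need to be able to reach.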
Step 2: Deploying the Doris Operator
- Create the resource definitions:
kubectl create -f https://raw.githubusercontent.com/apache/doris-operator/master/config/crd/bases/crds.yaml
If a non-decoupled cluster has already been deployed, use the following command to create the CRD definitions:
kubectl create -f https://raw.githubusercontent.com/apache/doris-operator/master/config/crd/bases/disaggregated.cluster.doris.com_dorisdisaggregatedclusters.yaml
- Deploy the Doris Operator and its associated RBAC rules:
kubectl apply -f https://raw.githubusercontent.com/apache/doris-operator/master/config/operator/disaggregated-operator.yaml
After deployment, verify the status of the Operator Pod using:
kubectl -n doris get pods
Example output:
NAME                              READY   STATUS    RESTARTS   AGE
doris-operator-6b97df65c4-xwvw8   1/1     Running   0          19s
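The status check above can be automated in a script. The following is a minimal sketch that polls kubectl -n doris get pods until a Pod whose name starts with doris-operator is Running with all containers ready. The function name wait_for_operator, the name prefix (inferred from the sample output), and the timeout values are assumptions for illustration.

```shell
# Hypothetical helper: wait until the operator Pod is Running and fully ready.
# Arguments (all optional): namespace, Pod name prefix, timeout in seconds.
wait_for_operator() {
  local namespace="${1:-doris}"
  local prefix="${2:-doris-operator}"
  local timeout="${3:-120}"
  local waited=0
  while [ "$waited" -lt "$timeout" ]; do
    # Columns without headers: NAME READY STATUS RESTARTS AGE
    if kubectl -n "$namespace" get pods --no-headers 2>/dev/null |
      awk -v p="$prefix" '
        index($1, p) == 1 && $3 == "Running" {
          split($2, r, "/")          # READY looks like "1/1"
          if (r[1] == r[2]) ok = 1
        }
        END { exit ok ? 0 : 1 }'; then
      echo "operator pod is ready"
      return 0
    fi
    sleep 5
    waited=$((waited + 5))
  done
  echo "timed out waiting for operator pod" >&2
  return 1
}
```

Calling wait_for_operator with no arguments uses the doris namespace shown in the command above.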
Step 3: Deploy the compute-storage decoupled cluster
- Download the deployment sample:
curl -O https://raw.githubusercontent.com/apache/doris-operator/master/doc/examples/disaggregated/cluster/ddc-sample.yaml
- Configure FoundationDB access information. The compute-storage decoupled version of Doris uses FoundationDB to store metadata. The access details for FoundationDB can be provided in the DorisDisaggregatedCluster under spec.metaService.fdb in one of two ways: by directly specifying the access address or by using a ConfigMap that includes the access information.
  - Direct access address configuration: if FoundationDB is deployed outside of Kubernetes, you can specify its access address directly:
spec:
  metaService:
    fdb:
      address: ${fdbAddress}
Here, ${fdbAddress} refers to the client access address for FoundationDB. On Linux VMs, this is typically stored in /etc/foundationdb/fdb.cluster. For more details, refer to the FoundationDB cluster file documentation.
  - Configuration via a ConfigMap containing access information: if FoundationDB is deployed using the fdb-kubernetes-operator, the operator will generate a specific ConfigMap containing the access information within the deployment namespace. The generated ConfigMap's name is the FoundationDB resource name with the suffix “-config”. After obtaining the ConfigMap's name and namespace, configure the DorisDisaggregatedCluster resource as follows:
spec:
  metaService:
    fdb:
      configMapNamespaceName:
        name: ${foundationdbConfigMapName}
        namespace: ${namespace}
Here, ${foundationdbConfigMapName} is the name of the ConfigMap generated by the fdb-kubernetes-operator, and ${namespace} is the namespace where the ConfigMap resides.
- Configure the DorisDisaggregatedCluster resource. Based on the decoupled deployment documentation, configure:
  - The metadata service (metaService configuration).
  - The FE cluster specifications (FE cluster configuration).
  - The compute groups (compute group configuration).
After completing the configuration, deploy the resources with the following command:
kubectl apply -f ddc-sample.yaml
Once the resources are applied, wait for the cluster to be fully established. The expected output of the following command is:
kubectl get ddc
NAME                         CLUSTERHEALTH   FEPHASE   CGCOUNT   CGAVAILABLECOUNT   CGFULLAVAILABLECOUNT
test-disaggregated-cluster   green           Ready     2         2                  2
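Waiting for the cluster can also be scripted. The sketch below polls kubectl get ddc until the CLUSTERHEALTH column reads green and the FEPHASE column reads Ready, matching the expected output above. The function name wait_for_ddc, the default resource name (taken from the sample output), and the timeout are assumptions for illustration.

```shell
# Hypothetical helper: wait until the DorisDisaggregatedCluster reports
# CLUSTERHEALTH=green and FEPHASE=Ready.
# Arguments (all optional): resource name, timeout in seconds.
wait_for_ddc() {
  local name="${1:-test-disaggregated-cluster}"
  local timeout="${2:-600}"
  local waited=0
  local row
  while [ "$waited" -lt "$timeout" ]; do
    # Columns without headers:
    # NAME CLUSTERHEALTH FEPHASE CGCOUNT CGAVAILABLECOUNT CGFULLAVAILABLECOUNT
    row="$(kubectl get ddc "$name" --no-headers 2>/dev/null)"
    if printf '%s\n' "$row" |
      awk '$2 == "green" && $3 == "Ready" { found = 1 } END { exit found ? 0 : 1 }'; then
      echo "cluster $name is ready"
      return 0
    fi
    sleep 10
    waited=$((waited + 10))
  done
  echo "timed out waiting for cluster $name" >&2
  return 1
}
```

If the resource was deployed into a non-default namespace, the kubectl call inside the function would need a matching -n option.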
Step 4: Creating the Remote Storage Backend
After the cluster has successfully started, configure an available object storage as the persistent storage backend (referred to as a Vault in Doris) using SQL.
- Obtain the FE Service access address. After the cluster is deployed, you can view the Services exposed by the Doris Operator with the following command:
kubectl get svc
Example output:
NAME                                     TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)                               AGE
test-disaggregated-cluster-fe            ClusterIP   10.96.147.97   <none>        8030/TCP,9020/TCP,9030/TCP,9010/TCP   15m
test-disaggregated-cluster-fe-internal   ClusterIP   None           <none>        9030/TCP                              15m
test-disaggregated-cluster-ms            ClusterIP   10.96.169.8    <none>        5000/TCP                              15m
test-disaggregated-cluster-cg1           ClusterIP   10.96.47.90    <none>        9060/TCP,8040/TCP,9050/TCP,8060/TCP   14m
test-disaggregated-cluster-cg2           ClusterIP   10.96.50.199   <none>        9060/TCP,8040/TCP,9050/TCP,8060/TCP   14m
The Service without the “-internal” suffix is intended for external access.
- Connect using the MySQL client. Within the Kubernetes cluster, create a Pod containing the MySQL client and enter the Pod:
kubectl run mysql-client --image=mysql:5.7 -it --rm --restart=Never -- /bin/bash
Within the Pod, connect to the Doris cluster directly using the Service name:
mysql -uroot -P9030 -h test-disaggregated-cluster-fe
- Create the storage backend (Vault). Create an object storage backend supporting the S3 protocol as the Vault using SQL. For example:
CREATE STORAGE VAULT IF NOT EXISTS s3_vault
PROPERTIES (
    "type" = "S3",
    "s3.endpoint" = "oss-cn-beijing.aliyuncs.com",
    "s3.region" = "bj",
    "s3.bucket" = "bucket",
    "s3.root.path" = "big/data/prefix",
    "s3.access_key" = "your-ak",
    "s3.secret_key" = "your-sk",
    "provider" = "OSS"
);
For instructions on creating other storage backends and detailed explanations of each field, please refer to the Managing Storage Vault section of the decoupled deployment documentation.
Set the default storage vault:
SET {vaultName} AS DEFAULT STORAGE VAULT;
Here, {vaultName} is the name of the Vault you wish to use, for example, s3_vault as created in the example above.
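For scripted setups, the interactive session can be replaced by a one-shot command. The sketch below wraps the kubectl run and mysql invocations shown above into a single function that runs one SQL statement from a throwaway client Pod and then removes it. The function name doris_sql and the generated Pod name are hypothetical; like the interactive example, it assumes the root user has no password and that the FE Service is named test-disaggregated-cluster-fe.

```shell
# Hypothetical helper: run one SQL statement against the FE Service from a
# temporary MySQL client Pod, which is deleted when the command exits.
# Arguments: SQL statement, FE Service name (optional).
doris_sql() {
  local sql="$1"
  local fe_host="${2:-test-disaggregated-cluster-fe}"
  # "$$" (the shell's PID) keeps the Pod name unique per invoking shell.
  kubectl run "mysql-client-$$" --image=mysql:5.7 \
    --rm -i --restart=Never --quiet -- \
    mysql -uroot -P9030 -h "$fe_host" -e "$sql"
}
```

For example, doris_sql 'SET s3_vault AS DEFAULT STORAGE VAULT;' would set the vault created above as the default, and doris_sql 'SHOW STORAGE VAULTS;' should list the configured vaults, assuming that statement is available in the deployed Doris version.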