
Deploy Doris Cluster

Deploying a working compute-storage decoupled Doris cluster on Kubernetes involves four main steps:

  1. Preparation: install a FoundationDB cluster.
  2. Deploy the Doris Operator.
  3. Deploy the compute-storage decoupled cluster.
  4. Create the remote storage backend.

Step 1: Preparation

Before deploying a decoupled cluster on Kubernetes, it is essential to have FoundationDB deployed in advance.

  • (Preferred) Direct Deployment on Machines: Ensure that the machine where FoundationDB is installed is accessible by services running within the Kubernetes cluster. For direct machine deployments, please refer to the Preparation Phase in the decoupled deployment documentation.
  • Deployment on Kubernetes: For deploying FoundationDB on Kubernetes, please refer to Deploying FoundationDB on Kubernetes.

Step 2: Deploying the Doris Operator

  1. Create the resource definitions:
    kubectl create -f https://raw.githubusercontent.com/apache/doris-operator/master/config/crd/bases/crds.yaml
    If a non-decoupled cluster has already been deployed, use the following command instead to create the CRD definitions:
    kubectl create -f https://raw.githubusercontent.com/apache/doris-operator/master/config/crd/bases/disaggregated.cluster.doris.com_dorisdisaggregatedclusters.yaml
  2. Deploy the Doris Operator and its associated RBAC rules:
    kubectl apply -f https://raw.githubusercontent.com/apache/doris-operator/master/config/operator/disaggregated-operator.yaml
    After deployment, verify the status of the Operator Pod using:
    kubectl -n doris get pods
    NAME                              READY   STATUS    RESTARTS   AGE
    doris-operator-6b97df65c4-xwvw8   1/1     Running   0          19s
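
The choice between the two CRD manifests in item 1 can be sketched as a small shell helper. This is only an illustration: the function name crd_manifest_url and its fresh/existing arguments are hypothetical, while both URLs are copied from the commands above.

```shell
#!/usr/bin/env bash
# Base URL copied from the kubectl commands in Step 2.
CRD_BASE="https://raw.githubusercontent.com/apache/doris-operator/master/config/crd/bases"

# Print the CRD manifest URL for the given situation (hypothetical helper).
#   fresh    - no Doris cluster has been deployed yet
#   existing - a non-decoupled cluster has already been deployed
crd_manifest_url() {
  case "$1" in
    fresh)    echo "${CRD_BASE}/crds.yaml" ;;
    existing) echo "${CRD_BASE}/disaggregated.cluster.doris.com_dorisdisaggregatedclusters.yaml" ;;
    *)        echo "usage: crd_manifest_url fresh|existing" >&2; return 1 ;;
  esac
}

# Usage (requires access to the Kubernetes cluster, so it is left commented out):
#   kubectl create -f "$(crd_manifest_url fresh)"
```

Keeping the decision in one place makes it harder to apply the wrong manifest when the same script is reused across environments.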

Step 3: Deploy the compute-storage decoupled cluster

  1. Download the Deployment Sample:

    curl -O https://raw.githubusercontent.com/apache/doris-operator/master/doc/examples/disaggregated/cluster/ddc-sample.yaml
  2. Configure FoundationDB access information. The compute-storage decoupled version of Doris uses FoundationDB to store metadata. The access details for FoundationDB can be provided in the DorisDisaggregatedCluster under spec.metaService.fdb in one of two ways: by directly specifying the access address or by using a ConfigMap that includes the access information.

    • Direct Access Address Configuration: If FoundationDB is deployed outside of Kubernetes, you can specify its access address directly:

      spec:
        metaService:
          fdb:
            address: ${fdbAddress}

      Here, ${fdbAddress} refers to the client access address for FoundationDB. On Linux VMs, this is typically stored in /etc/foundationdb/fdb.cluster. For more details, refer to the FoundationDB cluster file documentation.

    • Configuring via a ConfigMap Containing Access Information: If FoundationDB is deployed using the fdb-kubernetes-operator, the operator generates a ConfigMap containing the access information in the deployment namespace. The generated ConfigMap's name is the FoundationDB resource name with the suffix “-config”. After obtaining the ConfigMap's name and namespace, configure the DorisDisaggregatedCluster resource as follows:

      spec:
        metaService:
          fdb:
            configMapNamespaceName:
              name: ${foundationdbConfigMapName}
              namespace: ${namespace}

      Here, ${foundationdbConfigMapName} is the name of the ConfigMap generated by the fdb-kubernetes-operator, and ${namespace} is the namespace where the ConfigMap resides.

  3. Configure the DorisDisaggregatedCluster Resource: Configure the resource according to the decoupled deployment documentation.

    After completing the configuration, deploy the resources with the following command:

    kubectl apply -f ddc-sample.yaml

    Once the resources are applied, wait for the cluster to be fully established. The expected output of the following command is:

    kubectl get ddc
    NAME                         CLUSTERHEALTH   FEPHASE   CGCOUNT   CGAVAILABLECOUNT   CGFULLAVAILABLECOUNT
    test-disaggregated-cluster   green           Ready     2         2                  2
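
Two parts of Step 3 lend themselves to a short shell sketch: reading ${fdbAddress} from a FoundationDB cluster file (item 2) and waiting until the cluster is established (item 3). The helper names fdb_address and ddc_ready are hypothetical. The readiness rule used here (CLUSTERHEALTH is green and CGCOUNT equals CGFULLAVAILABLECOUNT) is an assumption based on the sample output above, not a documented contract, and the parser assumes every column is populated.

```shell
#!/usr/bin/env bash

# Print the connection string from a FoundationDB cluster file.
# The file holds one line of the form description:ID@host:port[,host:port...];
# blank lines and comment lines starting with '#' are skipped.
fdb_address() {
  awk '/^[ \t]*#/ || /^[ \t]*$/ { next } { print; exit }' "$1"
}

# Read `kubectl get ddc` output on stdin and succeed when the named cluster
# reports green health and all compute groups are fully available.
# Assumed column order: NAME CLUSTERHEALTH FEPHASE CGCOUNT CGAVAILABLECOUNT CGFULLAVAILABLECOUNT
ddc_ready() {
  awk -v name="$1" '
    NR > 1 && $1 == name { ok = ($2 == "green" && $4 == $6) }
    END { exit (ok ? 0 : 1) }
  '
}

# Usage (requires a FoundationDB host and cluster access, so left commented out):
#   fdb_address /etc/foundationdb/fdb.cluster
#   until kubectl get ddc | ddc_ready test-disaggregated-cluster; do sleep 10; done
```

Parsing the tabular output keeps the sketch dependency-free; a production script would more likely query the resource status in a structured format.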

Step 4: Creating the Remote Storage Backend

After the cluster has successfully started, configure an available object storage as the persistent storage backend (referred to as a Vault in Doris) using SQL.

  1. Obtain the FE Service Access Address: After the cluster is deployed, you can view the services exposed by the Doris Operator with the following command:

    kubectl get svc

    Example output:

    NAME                                     TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)                               AGE
    test-disaggregated-cluster-fe            ClusterIP   10.96.147.97   <none>        8030/TCP,9020/TCP,9030/TCP,9010/TCP   15m
    test-disaggregated-cluster-fe-internal   ClusterIP   None           <none>        9030/TCP                              15m
    test-disaggregated-cluster-ms            ClusterIP   10.96.169.8    <none>        5000/TCP                              15m
    test-disaggregated-cluster-cg1           ClusterIP   10.96.47.90    <none>        9060/TCP,8040/TCP,9050/TCP,8060/TCP   14m
    test-disaggregated-cluster-cg2           ClusterIP   10.96.50.199   <none>        9060/TCP,8040/TCP,9050/TCP,8060/TCP   14m

    The FE Service without the “-internal” suffix (test-disaggregated-cluster-fe in this example) is intended for external access.

  2. Connect Using the MySQL Client: Within the Kubernetes cluster, create a Pod containing the MySQL Client and enter the Pod:

    kubectl run mysql-client --image=mysql:5.7 -it --rm --restart=Never -- /bin/bash

    Within the Pod, connect to the Doris cluster directly using the Service name:

    mysql -uroot -P9030 -h test-disaggregated-cluster-fe
  3. Create the Storage Backend (Vault): Create an object storage backend supporting the S3 protocol as the Vault using SQL. For example:

    CREATE STORAGE VAULT IF NOT EXISTS s3_vault
    PROPERTIES (
        "type" = "S3",
        "s3.endpoint" = "oss-cn-beijing.aliyuncs.com",
        "s3.region" = "bj",
        "s3.bucket" = "bucket",
        "s3.root.path" = "big/data/prefix",
        "s3.access_key" = "your-ak",
        "s3.secret_key" = "your-sk",
        "provider" = "OSS"
    );

    For instructions on creating other storage backends and detailed explanations of each field, please refer to the Managing Storage Vault section of the decoupled deployment documentation.

  4. Set the Default Storage Vault:

    SET {vaultName} AS DEFAULT STORAGE VAULT;

    Here, {vaultName} is the name of the Vault you wish to use, for example, s3_vault as created in the example above.
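
As a minimal sketch, the placeholder substitution can be scripted so that the statement is generated from a vault name and piped to the MySQL client. The helper name default_vault_sql is hypothetical, and the restriction to letters, digits, and underscores is a conservative assumption made for this sketch rather than a rule taken from Doris.

```shell
#!/usr/bin/env bash

# Print the statement that makes the given vault the default (hypothetical helper).
# Names containing anything other than letters, digits, or underscores are
# rejected so the output is safe to pipe into a SQL client.
default_vault_sql() {
  case "$1" in
    ""|*[!A-Za-z0-9_]*) echo "invalid vault name: $1" >&2; return 1 ;;
  esac
  printf 'SET %s AS DEFAULT STORAGE VAULT;\n' "$1"
}

# Usage from inside the mysql-client Pod (left commented out):
#   default_vault_sql s3_vault | mysql -uroot -P9030 -h test-disaggregated-cluster-fe
```

For the vault created in the example above, the helper prints SET s3_vault AS DEFAULT STORAGE VAULT;.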