Access the Doris cluster
In Kubernetes, a Service is a resource that defines a set of pods and provides stable access to this set of pods, so it can be associated with a set of pods.
Deploying a cluster through Doris-Operator will automatically generate corresponding Service resources according to the spec.*Spec.service
configuration. Currently, ClusterIP, LoadBalancer and NodePort modes are supported. Support users’ access needs in different scenarios.
Access within the Kubernetes cluster
ClusterIP mode
Doris provides ClusterIP
access within the kubernetes cluster by default on kubernetes. For FE and BE components, we provide corresponding Service resources for users to use on demand on kubernetes. Use the following command to view the Service of the corresponding component. The Service naming rule provided by Doris-Operator is {clusterName}-{fe/be}-service
.
$ kubectl -n {namespace} get service
During use, please replace {namespace} with the namespace specified during deployment. Take our default sample deployed Doris cluster as an example:
apiVersion: doris.selectdb.com/v1
kind: DorisCluster
metadata:
labels:
app.kubernetes.io/name: doriscluster
app.kubernetes.io/instance: doriscluster-sample
app.kubernetes.io/part-of: doris-operator
name: doriscluster-sample
spec:
feSpec:
replicas: 3
limits:
cpu: 6
memory: 12Gi
requests:
cpu: 6
memory: 12Gi
image: selectdb/doris.fe-ubuntu:2.0.2
beSpec:
replicas: 3
limits:
cpu: 8
memory: 16Gi
requests:
cpu: 8
memory: 16Gi
image: selectdb/doris.be-ubuntu:2.0.2
We view kubectl get service
through the command as follows:
$ kubectl get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
doriscluster-sample-be-internal ClusterIP None <none> 9050/TCP 12h
doriscluster-sample-be-service ClusterIP 172.20.217.234 <none> 9060/TCP,8040/TCP,9050/TCP,8060/TCP 12h
doriscluster-sample-fe-internal ClusterIP None <none> 9030/TCP 12h
doriscluster-sample-fe-service ClusterIP 172.20.183.136 <none> 8030/TCP,9020/TCP,9030/TCP,9010/TCP 12h
There are two types of services, FE and BE, displayed through the command. The service with the suffix internal
is the service used by Doris for internal communication and is not available externally. The suffix -service
is a Service for users to use.
In this example, the CLUSTER-IP
corresponding to doriscluster-sample-fe-service
and the corresponding PORT
can be used on the K8s cluster to access different port services of FE. Using doriscluster-sample-be-service
Service
And the corresponding PORT
port to access BE's services.
Access outside the Kubernetes cluster
LoadBalancer Mode
If the cluster is created on a relevant cloud platform, it is recommended to use the LoadBalancer
mode to access the FE and BE services within the cluster. The ClusterIP
mode is used by default. If you want to use the LoadBalancer
mode, please configure the following configuration in the spec of each component:
service:
type: LoadBalancer
Taking the default configuration as a modification blueprint for example, we use LoadBalancer
as the access mode of FE and BE on the cloud platform. The deployment configuration is as follows:
apiVersion: doris.selectdb.com/v1
kind: DorisCluster
metadata:
labels:
app.kubernetes.io/name: doriscluster
app.kubernetes.io/instance: doriscluster-sample
app.kubernetes.io/part-of: doris-operator
name: doriscluster-sample
spec:
feSpec:
replicas: 3
service:
type: LoadBalancer
limits:
cpu: 6
memory: 12Gi
requests:
cpu: 6
memory: 12Gi
image: selectdb/doris.fe-ubuntu:2.0.2
beSpec:
replicas: 3
service:
type: LoadBalancer
limits:
cpu: 8
memory: 16Gi
requests:
cpu: 8
memory: 16Gi
image: selectdb/doris.be-ubuntu:2.0.2
By viewing the kubectl get service
command, view the corresponding Service display as follows:
$ kubectl get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
doriscluster-sample-be-internal ClusterIP None <none> 9050/TCP 14h
doriscluster-sample-be-service LoadBalancer 172.20.217.234 a46bbcd6998c7436bab8ee8fba9f5e81-808549982.us-east-1.elb.amazonaws.com 9060:32060/TCP,8040:30615/TCP,9050:31742/TCP,8060:31127/TCP 14h
doriscluster-sample-fe-internal ClusterIP None <none> 9030/TCP 14h
doriscluster-sample-fe-service LoadBalancer 172.20.183.136 ac48284932b044251bfac389b453118f-1412731848.us-east-1.elb.amazonaws.com 8030:32213/TCP,9020:31080/TCP,9030:31433/TCP,9010:30585/TCP 14h
External ports corresponding to EXTERNAL-IP
and PORT
can be used outside K8s to access the services of various components within K8s. For example, to access the mysql
client service corresponding to FE's 9030, you can use the following command to connect:
mysql -h ac48284932b044251bfac389b453118f-1412731848.us-east-1.elb.amazonaws.com -P 9030 -uroot
NodePort Mode
In a private network environment, to access internal services outside K8s, it is recommended to use the NodePort
mode of K8s. Use the default configuration as a blueprint to configure the NodePort
access mode in the private network as follows:
apiVersion: doris.selectdb.com/v1
kind: DorisCluster
metadata:
labels:
app.kubernetes.io/name: doriscluster
app.kubernetes.io/instance: doriscluster-sample
app.kubernetes.io/part-of: doris-operator
name: doriscluster-sample
spec:
feSpec:
replicas: 3
service:
type: NodePort
limits:
cpu: 6
memory: 12Gi
requests:
cpu: 6
memory: 12Gi
image: selectdb/doris.fe-ubuntu:2.0.2
beSpec:
replicas: 3
service:
type: NodePort
limits:
cpu: 8
memory: 16Gi
requests:
cpu: 8
memory: 16Gi
image: selectdb/doris.be-ubuntu:2.0.2
After deployment, view the corresponding Service by viewing the kubectl get service
command:
$ kubectl get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.152.183.1 <none> 443/TCP 169d
doriscluster-sample-fe-internal ClusterIP None <none> 9030/TCP 2d
doriscluster-sample-fe-service NodePort 10.152.183.58 <none> 8030:31041/TCP,9020:30783/TCP,9030:31545/TCP,9010:31610/TCP 2d
doriscluster-sample-be-internal ClusterIP None <none> 9050/TCP 2d
doriscluster-sample-be-service NodePort 10.152.183.244 <none> 9060:30940/TCP,8040:32713/TCP,9050:30621/TCP,8060:30926/TCP 2d
The above command obtains the port that can be used outside K8s, and obtains the host managed by K8s through the following command:
$ kubectl get nodes -owide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
vm-10-7-centos Ready <none> 88d v1.23.17-2+40cc20cc310518 10.16.10.7 <none> TencentOS Server 3.1 (Final) 5.4.119-19.0009.25 containerd://1.5.13
vm-10-8-centos Ready <none> 169d v1.23.17-2+40cc20cc310518 10.16.10.8 <none> TencentOS Server 3.1 (Final) 5.4.119-19-0009.3 containerd://1.5.13
In a private network environment, use the K8s host and mapped ports to access K8s internal services. For example, we use the host's IP and FE's 9030 mapped port (31545) to connect to mysql
:
$ mysql -h 10.16.10.8 -P 31545 -uroot
In addition, you can specify the nodePort you need according to your own platform needs. The Kubernetes master will allocate a port from the given configuration range (general default: 30000-32767) and each Node will proxy to the Service from that port (the same port on each Node). Like the example above, a random port will be automatically generated if not specified.
apiVersion: doris.selectdb.com/v1
kind: DorisCluster
metadata:
labels:
app.kubernetes.io/name: doriscluster
app.kubernetes.io/instance: doriscluster-sample
app.kubernetes.io/part-of: doris-operator
name: doriscluster-sample
spec:
feSpec:
replicas: 3
service:
type: NodePort
servicePorts:
- nodePort: 31001
targetPort: 8030
- nodePort: 31002
targetPort: 9020
- nodePort: 31003
targetPort: 9030
- nodePort: 31004
targetPort: 9010
limits:
cpu: 6
memory: 12Gi
requests:
cpu: 6
memory: 12Gi
image: selectdb/doris.fe-ubuntu:2.0.2
beSpec:
replicas: 3
service:
type: NodePort
servicePorts:
- nodePort: 31005
targetPort: 9060
- nodePort: 31006
targetPort: 8040
- nodePort: 31007
targetPort: 9050
- nodePort: 31008
targetPort: 8060
limits:
cpu: 8
memory: 16Gi
requests:
cpu: 8
memory: 16Gi
image: selectdb/doris.be-ubuntu:2.0.2
After deployment, check the corresponding Service by viewing the kubectl get service
command. For access methods, please refer to the above:
$ kubectl get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.152.183.1 <none> 443/TCP 169d
doriscluster-sample-fe-internal ClusterIP None <none> 9030/TCP 2d
doriscluster-sample-fe-service NodePort 10.152.183.67 <none> 8030:31001/TCP,9020:31002/TCP,9030:31003/TCP,9010:31004/TCP 2d
doriscluster-sample-be-internal ClusterIP None <none> 9050/TCP 2d
doriscluster-sample-be-service NodePort 10.152.183.24 <none> 9060:31005/TCP,8040:31006/TCP,9050:31007/TCP,8060:31008/TCP 2d
Doris data exchange
Stream load
Stream load is a synchronous import method. Users send requests to import local files or data streams into Doris by sending HTTP protocol. In a regular deployment, users submit import commands via the HTTP protocol. Generally, users will submit the request to FE, and FE will forward the request to a certain BE through the HTTP redirect command. However, in a Kubernetes-based deployment scenario, it is recommended that users directly submit the import command to BE's Srevice, and then the Service will be load balanced to a certain BE pod based on Kubernetes rules. The actual effects of these two operations are the same. When Flink or Spark uses the official connector to submit, the write request can also be submitted to the BE Service.
ErrorURL
When import methods such as Stream load and Routine load
These import methods will print errorURL
or tracking_url
in the return structure or log when encountering errors such as incorrect data format. You can locate the cause of the import error by visiting this link.
However, this URL is only accessible within the internal environment of a specific BE node container in a Kubernetes deployed cluster.
The following scenario takes the errorURL returned by Doris as an example:
http://doriscluster-sample-be-2.doriscluster-sample-be-internal.doris.svc.cluster.local:8040/api/_load_error_log?file=__shard_1/error_log_insert_stmt_af474190276a2e9c-49bb9d175b8e968e_af474190276a2e9c_49bb9d175b8e968e
1. Kubernetes cluster internal access
You need to obtain the access method of BE's Service or pod through the kubectl get service
or kubectl get pod -o wide
command, replace the domain name and port of the original URL, and then access again.
for example:
$ kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
doriscluster-sample-be-0 1/1 Running 0 9h 10.0.2.105 10-0-2-47.ec2.internal <none> <none>
doriscluster-sample-be-1 1/1 Running 0 9h 10.0.2.104 10-0-2-5.ec2.internal <none> <none>
doriscluster-sample-be-2 1/1 Running 0 9h 10.0.2.103 10-0-2-6.ec2.internal <none> <none>
doriscluster-sample-fe-0 1/1 Running 0 9h 10.0.2.102 10-0-2-47.ec2.internal <none> <none>
doriscluster-sample-fe-1 1/1 Running 0 9h 10.0.2.101 10-0-2-5.ec2.internal <none> <none>
doriscluster-sample-fe-2 1/1 Running 0 9h 10.0.2.100 10-0-2-6.ec2.internal <none> <none>
The above errorURL is changed to:
http://10.0.2.103:8040/api/_load_error_log?file=__shard_1/error_log_insert_stmt_af474190276a2e9c-49bb9d175b8e968e_af474190276a2e9c_49bb9d175b8e968e
2. NodePort mode for external access to Kubernetes cluster
Obtaining error report details from outside Kubernetes requires additional bridging means. The following are the processing steps for using NodePort mode Service when deploying Doris. Obtain error report details by creating a new Service: Handle Service template be_streamload_errror_service.yaml:
apiVersion: v1
kind: Service
metadata:
labels:
app.doris.service/role: debug
app.kubernetes.io/component: be
name: doriscluster-detail-error
spec:
externalTrafficPolicy: Cluster
internalTrafficPolicy: Cluster
ipFamilies:
- IPv4
ipFamilyPolicy: SingleStack
ports:
- name: webserver-port
port: 8040
protocol: TCP
targetPort: 8040
selector:
app.kubernetes.io/component: be
statefulset.kubernetes.io/pod-name: ${podName}
sessionAffinity: None
type: NodePort
Use the following command to view the mapping of Service 8040 port on the host machine
$ kubectl get service -n doris doriscluster-detail-error
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
doriscluster-detail-error NodePort 10.152.183.35 <none> 8040:31201/TCP 32s
Use the IP of any host and the NodePort (31201) port obtained above to access and replace errorURL to obtain a detailed error report.
The above errorURL is changed to:
http://10.152.183.35:31201/api/_load_error_log?file=__shard_1/error_log_insert_stmt_af474190276a2e9c-49bb9d175b8e968e_af474190276a2e9c_49bb9d175b8e968e
3. Access LoadBalancer mode from outside the Kubernetes cluster
Obtaining error report details from outside Kubernetes requires additional bridging means. The following are the processing steps for using the LoadBalancer mode Service when deploying Doris. Obtain error report details by creating a new Service: Handle Service template be_streamload_errror_service.yaml:
apiVersion: v1
kind: Service
metadata:
labels:
app.doris.service/role: debug
app.kubernetes.io/component: be
name: doriscluster-detail-error
spec:
externalTrafficPolicy: Cluster
internalTrafficPolicy: Cluster
ipFamilies:
- IPv4
ipFamilyPolicy: SingleStack
ports:
- name: webserver-port
port: 8040
protocol: TCP
targetPort: 8040
selector:
app.kubernetes.io/component: be
statefulset.kubernetes.io/pod-name: ${podName}
sessionAffinity: None
type: LoadBalancer
podName
is replaced with the highest-level domain name of errorURL: doriscluster-sample-be-2
.
In the namespace
deployed by Doris (generally the default is doris
, use doris
for the following operations), use the following command to deploy the service processed above:
$ kubectl apply -n doris -f be_streamload_errror_service.yaml
Use the following command to view the mapping of Service 8040 port on the host machine
$ kubectl get service -n doris doriscluster-detail-error
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
doriscluster-detail-error LoadBalancer 172.20.183.136 ac4828493dgrftb884g67wg4tb68gyut-1137856348.us-east-1.elb.amazonaws.com 8040:32003/TCP 14s
Use EXTERNAL-IP
and Port (8040) port access to replace errorURL to obtain a detailed error report.
The above errorURL is changed to:
http://ac4828493dgrftb884g67wg4tb68gyut-1137856348.us-east-1.elb.amazonaws.com:8040/api/_load_error_log?file=__shard_1/error_log_insert_stmt_af474190276a2e9c-49bb9d175b8e968 e_af474190276a2e9c_49bb9d175b8e968e
Note: For the above method of setting up access outside the cluster, it is recommended to clear the current Service after use. The reference command is as follows:
$ kubectl delete service -n doris doriscluster-detail-error