This article will help you use the Kubernetes Operator for RisingWave (hereinafter ‘the Operator’) to deploy a RisingWave cluster in Kubernetes.
The Operator is a deployment and management system for RisingWave. It runs on top of Kubernetes and provides functionalities like provisioning, upgrading, scaling, and destroying the RisingWave instances inside the cluster.
The steps in this section are intended for creating a Kubernetes cluster in your local environment.
If you are using a managed Kubernetes service such as AKS, GKE, or EKS, refer to the corresponding documentation for instructions.
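For a quick local sandbox, a tool such as kind can create a throwaway cluster; this is just one option (minikube or k3d would work equally well), and the cluster name here is arbitrary:

```shell
# Create a single-node local Kubernetes cluster named "risingwave".
kind create cluster --name risingwave
```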
Run the following command to install a specific version instead of the latest version.
```shell
# Replace ${VERSION} with the version you want to install, e.g., v1.3.0
kubectl apply --server-side -f https://github.com/risingwavelabs/risingwave-operator/releases/download/${VERSION}/risingwave-operator.yaml
```
Compatibility table

| Operator | RisingWave | Kubernetes |
| -------- | ---------- | ---------- |
| v0.4.0   | v0.18.0+   | v1.21+     |
| v0.3.6   | v0.18.0+   | v1.21+     |
You can find the release notes of each version here.
The following errors might occur if cert-manager is not fully initialized. Simply wait for another minute and rerun the command above.
```
Error from server (InternalError): Internal error occurred: failed calling webhook "webhook.cert-manager.io": failed to call webhook: Post "https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=10s": dial tcp 10.105.102.32:443: connect: connection refused
```
Optional: Check if the Pods are running.
```shell
kubectl -n cert-manager get pods
kubectl -n risingwave-operator-system get pods
```
The RisingWave Kubernetes Operator extends Kubernetes with CRDs (Custom Resource Definitions) to manage RisingWave. That means all you need to do is create a RisingWave resource in your Kubernetes cluster, and the Operator will take care of the rest.
The RisingWave resource is a custom resource that defines a RisingWave cluster. In this directory, you can find resource examples that deploy RisingWave with different configurations of metadata store and state backend. Based on your requirements, you can use these resource files directly or as a reference for your customization. The stable directory contains resource files that we have tested compatibility with the latest released version of the RisingWave Operator:
The resource files are named following the convention risingwave-<meta_store>-<state_backend>.yaml. For example, risingwave-postgresql-s3.yaml means that the manifest uses PostgreSQL as the meta store and AWS S3 as the state backend.
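For example, assuming you have downloaded risingwave-postgresql-s3.yaml locally, you can deploy it directly and watch the custom resource until it is ready:

```shell
# Deploy the RisingWave resource defined in the manifest.
kubectl apply -f risingwave-postgresql-s3.yaml

# Watch the custom resource until the Operator reports it as running.
kubectl get risingwave -w
```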
RisingWave supports using these systems or services as state backends.
To customize the state backend of your RisingWave cluster, edit the spec:stateStore section under the RisingWave resource (kind: RisingWave).
```yaml
spec:
  stateStore:
    # Prefix to objects in the object stores or directory in file system. Defaults to "hummock".
    dataDirectory: hummock
    # Declaration of the S3 state store backend.
    s3:
      # Region of the S3 bucket.
      region: us-east-1
      # Name of the S3 bucket.
      bucket: risingwave
      # Credentials to access the S3 bucket.
      credentials:
        # Name of the Kubernetes secret that stores the credentials.
        secretName: s3-credentials
        # Key of the access key ID in the secret.
        accessKeyRef: AWS_ACCESS_KEY_ID
        # Key of the secret access key in the secret.
        secretAccessKeyRef: AWS_SECRET_ACCESS_KEY
        # Optional, set it to true when the credentials can be retrieved
        # with the service account token, e.g., running inside the EKS.
        #
        # useServiceAccount: true
```
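The S3 manifest above references a Kubernetes secret named s3-credentials with keys AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY. A minimal sketch of creating it (the placeholder values are assumptions you must replace with your own credentials):

```shell
# Create the secret referenced by secretName in the state store config.
kubectl create secret generic s3-credentials \
  --from-literal=AWS_ACCESS_KEY_ID=your-access-key-id \
  --from-literal=AWS_SECRET_ACCESS_KEY=your-secret-access-key
```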
The performance of MinIO is closely tied to the disk performance of the node where it is hosted. We have observed that AWS EBS does not perform well in our tests. For optimal performance, we recommend using S3 or a compatible cloud service.
```yaml
spec:
  stateStore:
    # Prefix to objects in the object stores or directory in file system. Defaults to "hummock".
    dataDirectory: hummock
    # Declaration of the MinIO state store backend.
    minio:
      # Endpoint of the MinIO service.
      endpoint: risingwave-minio:9301
      # Name of the MinIO bucket.
      bucket: hummock001
      # Credentials to access the MinIO bucket.
      credentials:
        # Name of the Kubernetes secret that stores the credentials.
        secretName: minio-credentials
        # Key of the username in the secret.
        usernameKeyRef: username
        # Key of the password in the secret.
        passwordKeyRef: password
```
```yaml
spec:
  stateStore:
    # Prefix to objects in the object stores or directory in file system. Defaults to "hummock".
    dataDirectory: hummock
    # Declaration of the S3-compatible state store backend.
    s3:
      # Endpoint of the S3-compatible object storage.
      #
      # Here we use Tencent Cloud Object Storage (COS) in ap-guangzhou as an example.
      endpoint: cos.ap-guangzhou.myqcloud.com
      # Region of the S3-compatible bucket.
      region: ap-guangzhou
      # Name of the S3-compatible bucket.
      bucket: risingwave
      # Credentials to access the S3-compatible bucket.
      credentials:
        # Name of the Kubernetes secret that stores the credentials.
        secretName: cos-credentials
        # Key of the access key ID in the secret.
        accessKeyRef: ACCESS_KEY_ID
        # Key of the secret access key in the secret.
        secretAccessKeyRef: SECRET_ACCESS_KEY
```
```yaml
spec:
  stateStore:
    # Prefix to objects in the object stores or directory in file system. Defaults to "hummock".
    dataDirectory: hummock
    # Declaration of the Azure Blob Storage state store backend.
    azureBlob:
      # Endpoint of the Azure Blob service.
      endpoint: https://your-blob-service.blob.core.windows.net
      # Working directory root of the Azure Blob service.
      root: risingwave
      # Container name of the Azure Blob service.
      container: risingwave
      # Credentials to access the Azure Blob container.
      credentials:
        # Name of the Kubernetes secret that stores the credentials.
        secretName: azure-blob-credentials
        # Key of the account name in the secret.
        accountNameRef: AccountName
        # Key of the account key in the secret.
        accountKeyRef: AccountKey
```
```yaml
spec:
  stateStore:
    # Prefix to objects in the object stores or directory in file system. Defaults to "hummock".
    dataDirectory: hummock
    # Declaration of the Google Cloud Storage state store backend.
    gcs:
      # Name of the Google Cloud Storage bucket.
      bucket: risingwave
      # Root directory of the Google Cloud Storage bucket.
      root: risingwave
      # Credentials to access the Google Cloud Storage bucket.
      credentials:
        # Name of the Kubernetes secret that stores the credentials.
        secretName: gcs-credentials
        # Key of the service account credentials in the secret.
        serviceAccountCredentialsKeyRef: ServiceAccountCredentials
        # Optional, set it to true when the credentials can be retrieved
        # with workload identity.
        # useWorkloadIdentity: true
```
```yaml
spec:
  stateStore:
    # Prefix to objects in the object stores or directory in file system. Defaults to "hummock".
    dataDirectory: hummock
    # Declaration of the Alibaba Cloud OSS state store backend.
    aliyunOSS:
      # Region of the Alibaba Cloud OSS bucket.
      region: cn-hangzhou
      # Name of the Alibaba Cloud OSS bucket.
      bucket: risingwave
      # Use internal endpoint or not. Check the following document for details:
      # https://www.alibabacloud.com/help/en/oss/user-guide/regions-and-endpoints
      internalEndpoint: false
      # Credentials to access the Alibaba Cloud OSS bucket.
      credentials:
        # Name of the Kubernetes secret that stores the credentials.
        secretName: oss-credentials
        # Key of the access key ID in the secret.
        accessKeyRef: ACCESS_KEY_ID
        # Key of the secret access key in the secret.
        secretAccessKeyRef: SECRET_ACCESS_KEY_ID
```
You can also customize the HDFS client as needed. We have built an image based on Hadoop 2.7.3. If you want to create your own RisingWave image, please adjust the Hadoop configuration according to your specific cluster information and ensure that:
- The CLASSPATH is correctly set.
- HADOOP_CONF_DIR is placed at the beginning of the CLASSPATH.
You can customize the directory for storing state data via the spec: stateStore: dataDirectory parameter in the risingwave.yaml file that you use to deploy a RisingWave instance. If you run multiple RisingWave instances, ensure that the value of dataDirectory is unique for each instance (the default value is hummock); otherwise, the new RisingWave instance may crash. Save the changes to the risingwave.yaml file before running the kubectl apply -f <...risingwave.yaml> command. The directory cannot be an absolute path, such as /a/b, and must be no longer than 180 characters.
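For example, to give a second instance its own data directory (the name hummock002 here is illustrative):

```yaml
spec:
  stateStore:
    # Must be unique across RisingWave instances sharing the same bucket.
    dataDirectory: hummock002
```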
By default, the Operator creates a ClusterIP service for the frontend component, through which you can interact with RisingWave. A ClusterIP service is not accessible from outside Kubernetes, so you need to create a standalone Pod running the PostgreSQL client inside Kubernetes.
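A minimal sketch of creating such a Pod: the Pod name psql-console matches the command used below, while the postgres image is just one option that ships the psql client.

```shell
# Start a long-running Pod that has the psql client available.
kubectl run psql-console --image=postgres:15 -- sleep infinity
```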
Attach to the Pod so that you can execute commands inside the container.
```shell
kubectl exec -it psql-console -- bash
```
Connect to RisingWave via `psql`.
```shell
psql -h risingwave-frontend -p 4567 -d dev -U root
```
You can also connect to RisingWave from a Node in the Kubernetes cluster, such as an EC2 instance.
In the risingwave.yaml file that you use to deploy the RisingWave instance, add the frontendServiceType parameter to the RisingWave resource and set its value to NodePort.
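A minimal sketch of the change in risingwave.yaml:

```yaml
spec:
  # Expose the frontend service on a port of every Node.
  frontendServiceType: NodePort
```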
Connect to RisingWave by running the following commands on the Node.
```shell
export RISINGWAVE_NAME=risingwave-postgresql-hdfs
export RISINGWAVE_NAMESPACE=default
export RISINGWAVE_HOST=`kubectl -n ${RISINGWAVE_NAMESPACE} get node -o jsonpath='{.items[0].status.addresses[?(@.type=="InternalIP")].address}'`
export RISINGWAVE_PORT=`kubectl -n ${RISINGWAVE_NAMESPACE} get svc -l risingwave/name=${RISINGWAVE_NAME},risingwave/component=frontend -o jsonpath='{.items[0].spec.ports[0].nodePort}'`
psql -h ${RISINGWAVE_HOST} -p ${RISINGWAVE_PORT} -d dev -U root
```
If you are using EKS, GKE, or another managed Kubernetes service from a cloud vendor, you can expose the Service to the public network with a load balancer in the cloud.
In the risingwave.yaml file that you use to deploy the RisingWave instance, add the frontendServiceType parameter to the RisingWave resource and set its value to LoadBalancer.
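A minimal sketch of the change in risingwave.yaml:

```yaml
spec:
  # Ask the cloud provider to provision a load balancer for the frontend.
  frontendServiceType: LoadBalancer
```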