
Set up a RisingWave cluster in Kubernetes

This article will help you use the Kubernetes Operator for RisingWave (hereinafter ‘the Operator’) to deploy a RisingWave cluster in Kubernetes.

The Operator is a deployment and management system for RisingWave. It runs on top of Kubernetes and provides functionalities like provisioning, upgrading, scaling, and destroying the RisingWave instances inside the cluster.

Prerequisites

Create a Kubernetes cluster

info

The steps in this section are intended for creating a Kubernetes cluster in your local environment.
If you are using a managed Kubernetes service such as AKS, GKE, or EKS, refer to the corresponding documentation for instructions.

Steps:

  1. Install kind.

    kind is a tool for running local Kubernetes clusters using Docker containers as cluster nodes. You can see the available tags of kind on Docker Hub.

  2. Create a cluster.

    kind create cluster

  3. Optional: Check if the cluster is created properly.

    kubectl cluster-info

Deploy the Operator

Before the deployment, ensure that the following requirements are satisfied.

  • Docker version ≥ 18.09
  • kubectl version ≥ 1.18
  • For Linux, set the value of the sysctl parameter net.ipv4.ip_forward to 1.
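
The requirements above can be checked with a small preflight script before you continue. The following is a minimal sketch, not something shipped with the Operator: version_ge is a hypothetical helper that compares dotted, purely numeric version strings, and the commented usage lines assume that docker and sysctl are available on your PATH.

```shell
# Return success if version $1 is greater than or equal to version $2.
# Assumes both versions consist of numeric components separated by dots
# (for example 20.10.7 or 18.09); suffixes such as "-ce" are not handled.
version_ge() {
  _have=$1
  _want=$2
  while [ -n "$_have" ] || [ -n "$_want" ]; do
    # Take the first remaining component of each version; a missing one counts as 0.
    _h=${_have%%.*}
    _w=${_want%%.*}
    [ "${_h:-0}" -gt "${_w:-0}" ] && return 0
    [ "${_h:-0}" -lt "${_w:-0}" ] && return 1
    # Drop the component that was just compared.
    case $_have in *.*) _have=${_have#*.} ;; *) _have= ;; esac
    case $_want in *.*) _want=${_want#*.} ;; *) _want= ;; esac
  done
  return 0
}

# Example usage (assumes the Docker daemon is running):
#   version_ge "$(docker version --format '{{.Server.Version}}')" 18.09 \
#     || echo "Docker 18.09 or later is required"
#   [ "$(sysctl -n net.ipv4.ip_forward)" = 1 ] \
#     || echo "Run: sudo sysctl -w net.ipv4.ip_forward=1"
```

The comparison is done component by component rather than as a string, so that 18.10 is correctly treated as newer than 18.9.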

Steps:

  1. Install cert-manager and wait a minute to allow for initialization.

  2. Install the latest version of the Operator.

    kubectl apply --server-side -f https://github.com/risingwavelabs/risingwave-operator/releases/latest/download/risingwave-operator.yaml

    To install a specific version of the Operator instead of the latest version, run the following command.

    # Replace ${VERSION} with the version you want to install, e.g., v0.4.0
    kubectl apply --server-side -f https://github.com/risingwavelabs/risingwave-operator/releases/download/${VERSION}/risingwave-operator.yaml

    Compatibility table

    Operator   RisingWave   Kubernetes
    v0.4.0     v0.18.0+     v1.21+
    v0.3.6     v0.18.0+     v1.21+

    You can find the release notes of each version here.

    note

    The following error might occur if cert-manager is not fully initialized. If it does, wait another minute and rerun the command above.

    Error from server (InternalError): Internal error occurred: failed calling webhook "webhook.cert-manager.io": failed to call webhook: Post "https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=10s": dial tcp 10.105.102.32:443: connect: connection refused

  3. Optional: Check if the Pods are running.

    kubectl -n cert-manager get pods
    kubectl -n risingwave-operator-system get pods
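
Because the install command may need to be rerun while cert-manager is still initializing, the steps above can be wrapped in a small retry loop. The following is a minimal sketch under stated assumptions: operator_manifest_url and retry are hypothetical helper names, not part of kubectl or the Operator, and the URLs are the ones used in the commands above.

```shell
# Print the manifest URL for an Operator release.
# $1 is either "latest" (the default) or a release tag such as v0.4.0.
operator_manifest_url() {
  base=https://github.com/risingwavelabs/risingwave-operator/releases
  case ${1:-latest} in
    latest) printf '%s\n' "$base/latest/download/risingwave-operator.yaml" ;;
    *)      printf '%s\n' "$base/download/$1/risingwave-operator.yaml" ;;
  esac
}

# Run a command until it succeeds.
# $1 = maximum number of attempts, $2 = seconds to sleep between attempts,
# remaining arguments = the command to run.
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  until "$@"; do
    [ "$n" -ge "$attempts" ] && return 1
    n=$((n + 1))
    sleep "$delay"
  done
  return 0
}

# Example usage (assumes kubectl points at your cluster and cert-manager is installed):
#   kubectl wait deployment --all --for=condition=Available -n cert-manager --timeout=180s
#   retry 5 30 kubectl apply --server-side -f "$(operator_manifest_url latest)"
```

Waiting for the cert-manager deployments to become available first usually avoids the webhook error; the retry loop is a fallback for the case where the webhook is still not reachable.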

Deploy a RisingWave instance

The RisingWave Kubernetes Operator extends Kubernetes with CRDs (Custom Resource Definitions) to manage RisingWave. This means that all you need to do is create a RisingWave resource in your Kubernetes cluster, and the Operator will take care of the rest.

Use the example resource files

The RisingWave resource is a custom resource that defines a RisingWave cluster. Under docs/manifests in the risingwave-operator repository, you can find resource examples that deploy RisingWave with different configurations of metadata store and state backend. Based on your requirements, you can use these resource files directly or as a reference for your customization. The stable directory contains resource files that we have tested for compatibility with the latest released version of the RisingWave Operator.

The resource files are named using the convention risingwave-<meta_store>-<state_backend>.yaml. For example, risingwave-etcd-s3.yaml means that this manifest file uses etcd as the metadata store and AWS S3 as the state backend. Resource files whose names do not contain etcd use memory as the metadata store, which does not persist meta node data and therefore carries a risk of data loss.

For production deployments, you should use etcd as the metadata store. Use a resource file that contains etcd in its name, or choose a file from the /stable/ directory.
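
To make the naming convention concrete, the following sketch derives the metadata store and state backend from a resource file name. describe_manifest is a hypothetical helper written for this illustration, and risingwave-minio.yaml is used only as an example of a name without etcd; check the repository for the files that actually exist.

```shell
# Describe a resource file that follows risingwave-<meta_store>-<state_backend>.yaml.
# Names without an etcd component are reported as using the in-memory metadata store.
describe_manifest() {
  name=$(basename "$1" .yaml)   # risingwave-etcd-s3.yaml -> risingwave-etcd-s3
  name=${name#risingwave-}      # risingwave-etcd-s3      -> etcd-s3
  case $name in
    etcd-*) printf 'meta_store=etcd state_backend=%s\n' "${name#etcd-}" ;;
    *)      printf 'meta_store=memory state_backend=%s\n' "$name" ;;
  esac
}

describe_manifest risingwave-etcd-s3.yaml   # prints: meta_store=etcd state_backend=s3
describe_manifest risingwave-minio.yaml     # prints: meta_store=memory state_backend=minio
```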

RisingWave supports the following systems and services as state backends:

  • MinIO
  • AWS S3
  • S3-compatible object storage
  • Google Cloud Storage
  • Azure Blob Storage
  • Alibaba Cloud OSS

You can customize etcd as a separate cluster, customize the state backend, or customize the state store directory.

Optional: Customize the etcd deployment

RisingWave uses etcd for persisting data for meta nodes. It's important to note that etcd is highly sensitive to disk write latency. Slow disk performance can lead to increased etcd request latency and potentially impact the stability of the cluster. When planning your RisingWave deployment, follow the etcd disk recommendations.

We recommend using the bitnami/etcd Helm chart to deploy etcd. Save the following configuration as etcd-values.yaml.

service:
  ports:
    client: 2379
    peer: 2380

replicaCount: 3

resources:
  limits:
    cpu: 1
    memory: 2Gi
  requests:
    cpu: 1
    memory: 2Gi

persistence:
  # storageClass: default
  size: 10Gi
  accessModes: [ "ReadWriteOnce" ]

auth:
  rbac:
    create: false
    allowNoneAuthentication: true

extraEnvVars:
  - name: "ETCD_MAX_REQUEST_BYTES"
    value: "104857600"
  - name: "MAX_QUOTA_BACKEND_BYTES"
    value: "8589934592"
  - name: "ETCD_AUTO_COMPACTION_MODE"
    value: "periodic"
  - name: "ETCD_AUTO_COMPACTION_RETENTION"
    value: "1m"
  - name: "ETCD_SNAPSHOT_COUNT"
    value: "10000"
  - name: "ETCD_MAX_TXN_OPS"
    value: "999999"

If you would like to specify the storage class of the persistent volume, uncomment the # storageClass: default line and set it to the storage class you want to use.

Then, add the Bitnami chart repository if you have not already done so, and deploy the etcd cluster:

helm repo add bitnami https://charts.bitnami.com/bitnami
helm install -f etcd-values.yaml etcd bitnami/etcd

Optional: Customize the state backend

If you intend to customize a resource file, download the file to a local path and edit it:

curl https://raw.githubusercontent.com/risingwavelabs/risingwave-operator/main/docs/manifests/<sub-directory> -o risingwave.yaml

You can also create your own resource file from scratch if you are familiar with Kubernetes resource files.

Then, apply the resource file by using the following command:

kubectl apply -f risingwave.yaml      # relative path
kubectl apply -f /tmp/risingwave.yaml # absolute path

To customize the state backend of your RisingWave cluster, edit the spec.stateStore section of the RisingWave resource (kind: RisingWave).

spec:
  stateStore:
    # Prefix to objects in the object stores or directory in file system. Default to "hummock".
    dataDirectory: hummock

    # Declaration of the S3 state store backend.
    s3:
      # Region of the S3 bucket.
      region: us-east-1

      # Name of the S3 bucket.
      bucket: risingwave

      # Credentials to access the S3 bucket.
      credentials:
        # Name of the Kubernetes secret that stores the credentials.
        secretName: s3-credentials

        # Key of the access key ID in the secret.
        accessKeyRef: AWS_ACCESS_KEY_ID

        # Key of the secret access key in the secret.
        secretAccessKeyRef: AWS_SECRET_ACCESS_KEY

        # Optional, set it to true when the credentials can be retrieved
        # with the service account token, e.g., running inside the EKS.
        #
        # useServiceAccount: true
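
The configuration above refers to a Kubernetes secret named s3-credentials with the keys AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, so that secret needs to exist before the instance can read from the bucket. The following is a minimal sketch of one way to create it. s3_credentials_secret is a hypothetical helper, and the sketch assumes that the secret lives in the same namespace as the RisingWave resource and that the credential values contain no double quotes or backslashes.

```shell
# Print a Secret manifest whose name and keys match the stateStore example.
# $1 = access key ID, $2 = secret access key.
s3_credentials_secret() {
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: s3-credentials
type: Opaque
stringData:
  AWS_ACCESS_KEY_ID: "$1"
  AWS_SECRET_ACCESS_KEY: "$2"
EOF
}

# Example usage (assumes both variables are set in your shell and
# kubectl points at the namespace where RisingWave will be deployed):
#   s3_credentials_secret "$AWS_ACCESS_KEY_ID" "$AWS_SECRET_ACCESS_KEY" | kubectl apply -f -
```

Using stringData lets Kubernetes handle the base64 encoding, which keeps the sketch short. If you change secretName, accessKeyRef, or secretAccessKeyRef in the resource file, the secret's name and keys must change to match.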

Optional: Customize the state store directory

You can customize the directory for storing state data via the spec.stateStore.dataDirectory parameter in the risingwave.yaml file that you use to deploy a RisingWave instance. The default value is hummock. If you have multiple RisingWave instances, ensure that the value of dataDirectory for the new instance is unique. Otherwise, the new RisingWave instance may crash.

The directory path cannot be an absolute path, such as /a/b, and must be no longer than 180 characters. Save the changes to the risingwave.yaml file before running the kubectl apply -f <...risingwave.yaml> command.
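
The two constraints on the directory path can be checked before you apply the resource file. The following is a minimal sketch: valid_data_directory is a hypothetical helper that encodes only the two rules stated above (not absolute, at most 180 characters) and does not check uniqueness across instances.

```shell
# Return success if $1 is acceptable as a dataDirectory value:
# it must not be an absolute path and must be at most 180 characters long.
valid_data_directory() {
  case $1 in
    /*) return 1 ;;   # absolute paths such as /a/b are rejected
  esac
  [ "${#1}" -le 180 ]
}

# Example usage:
#   valid_data_directory hummock   && echo "ok"
#   valid_data_directory /a/b      || echo "rejected: absolute path"
```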

Validate the status of the instance

You can check the status of the RisingWave instance by running the following command.

kubectl get risingwave

If the instance is running properly, the output should look like this:

NAME         RUNNING   STORAGE(META)   STORAGE(OBJECT)   AGE
risingwave   True      etcd            S3                30s
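
If you want to wait for the instance in a script rather than checking by eye, the RUNNING column of this output can be parsed. The following is a minimal sketch: risingwave_is_running is a hypothetical helper that reads the table from standard input, and it assumes the column layout shown above (name in the first column, RUNNING in the second).

```shell
# Read `kubectl get risingwave` output on stdin and succeed only if the
# instance named $1 is listed with RUNNING set to True.
risingwave_is_running() {
  awk -v name="$1" '
    NR > 1 && $1 == name { found = 1; ok = ($2 == "True") }
    END { exit !(found && ok) }
  '
}

# Example usage (assumes kubectl points at the namespace of the instance):
#   until kubectl get risingwave | risingwave_is_running risingwave; do
#     sleep 5
#   done
```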

Connect to RisingWave

By default, the Operator creates a service of type ClusterIP for the frontend component, through which you can interact with RisingWave. This service is not accessible from outside the Kubernetes cluster. Therefore, you need to create a standalone Pod that runs the PostgreSQL client (psql) inside Kubernetes.

Steps:

  1. Create a Pod.

    kubectl apply -f https://raw.githubusercontent.com/risingwavelabs/risingwave-operator/main/docs/manifests/psql/psql-console.yaml

  2. Attach to the Pod so that you can execute commands inside the container.

    kubectl exec -it psql-console -- bash

  3. Connect to RisingWave via psql.

    psql -h risingwave-frontend -p 4567 -d dev -U root

Now you can ingest and transform streaming data. See Quick start for details.
