With the Kubegres operator, we are going to create a cluster of 3 PostgreSql instances that replicate their data in real time.
1) Install Kubegres operator
The installation needs to be done once. Run the following command in a Kubernetes cluster:
kubectl apply -f https://raw.githubusercontent.com/reactive-tech/kubegres/v1.7/kubegres.yaml
During the installation, "Kubegres" creates the namespace "kubegres-system" where it installs its components. Check their status as follows:
kubectl get all -n kubegres-system
We should see the controller "kubegres-controller-manager" in a running state:
NAME                                              STATUS
pod/kubegres-controller-manager-999786dd6-74tmb   Running

NAME                                                  TYPE
service/kubegres-controller-manager-metrics-service   ClusterIP

NAME                                          READY
deployment.apps/kubegres-controller-manager   1/1
Once it is running, we can check the controller's logs, as follows:
kubectl logs pod/kubegres-controller-manager-999786dd6-74tmb -c manager -n kubegres-system -f
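The pod name in the command above ends with a generated suffix, so it will differ on every installation. As a minimal sketch, the helper below waits for the controller deployment to finish rolling out and then follows its logs through the deployment name instead of the pod name. The function name and the 120-second timeout are illustrative assumptions, not part of Kubegres.

```shell
# Hypothetical helper: wait for the Kubegres controller, then follow its logs.
# The deployment and namespace names come from the output above;
# the 120s timeout is an assumed value.
kubegres_operator_logs() {
  ns="kubegres-system"
  deploy="deployment/kubegres-controller-manager"
  # Block until the deployment reports a completed rollout, or the timeout expires.
  kubectl rollout status "$deploy" -n "$ns" --timeout=120s || return 1
  # Addressing the deployment lets kubectl pick one of its pods,
  # so the generated pod suffix is not needed.
  kubectl logs "$deploy" -c manager -n "$ns" -f
}
```

Calling `kubegres_operator_logs` after the installation command should behave like the two manual steps above, without copying the pod name by hand.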
2) Create a Secret resource
Before creating a cluster of PostgreSql, we need to create a Secret resource to store the passwords of the PostgreSql super user and the replication user.
Create a file named my-postgres-secret.yaml:
Add the following contents:
apiVersion: v1
kind: Secret
metadata:
  name: mypostgres-secret
  namespace: default
type: Opaque
stringData:
  superUserPassword: postgresSuperUserPsw
  replicationUserPassword: postgresReplicaPsw
Apply the changes:
kubectl apply -f my-postgres-secret.yaml
Note that you can choose any names for the keys "superUserPassword" and "replicationUserPassword" in your Secret YAML, as long as the Kubegres YAML in the next step references the same key names. The password values in the YAML are examples only.
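Since the passwords in the YAML are only examples, we may prefer to generate them rather than type them into a file. The sketch below prints the same Secret manifest for two passwords passed as arguments. The function name is hypothetical, and the values are quoted so that an all-digit password is still read as a string.

```shell
# Hypothetical helper: print the Secret manifest for two passwords.
# $1 = super user password, $2 = replication user password.
# The Secret name and key names mirror the YAML above.
render_secret_yaml() {
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: mypostgres-secret
  namespace: default
type: Opaque
stringData:
  superUserPassword: "$1"
  replicationUserPassword: "$2"
EOF
}
```

One possible usage, assuming openssl is installed, is `render_secret_yaml "$(openssl rand -hex 16)" "$(openssl rand -hex 16)" | kubectl apply -f -`. Hex passwords avoid characters that would need escaping inside the quoted YAML values.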
3) Create a cluster of PostgreSql instances
To create a cluster of PostgreSql, we need to create a Kubegres resource. The Kubegres YAML below contains the minimum configuration required to set up a cluster:
- the PostgreSql pods run the official PostgreSql Docker image, version 13.2
- the property "replicas: 3" means that Kubegres will create 1 Primary PostgreSql pod and 2 Replica PostgreSql pods
- the data will be replicated in real time from the Primary PostgreSql pod to the 2 Replica PostgreSql pods
Create a file named my-postgres.yaml:
Add the following contents:
apiVersion: kubegres.reactive-tech.io/v1
kind: Kubegres
metadata:
  name: mypostgres
  namespace: default
spec:
  replicas: 3
  image: postgres:13.2
  database:
    size: 200Mi
  env:
    - name: POSTGRES_PASSWORD
      valueFrom:
        secretKeyRef:
          name: mypostgres-secret
          key: superUserPassword
    - name: POSTGRES_REPLICATION_PASSWORD
      valueFrom:
        secretKeyRef:
          name: mypostgres-secret
          key: replicationUserPassword
Note: the YAML above contains the minimum configuration required to deploy a cluster of PostgreSql. More options are available, such as adding annotations, setting the name of a storageClass, retrieving the container image from a private repo, and enabling backups. Please see the full list of options here.
Apply the changes:
kubectl apply -f my-postgres.yaml
A cluster of 3 PostgreSql instances is now being deployed in Kubernetes. Each PostgreSql instance is a pod. Check their status until they are "Running":
kubectl get pods -o wide -w
Kubegres logs events about the state of each PostgreSql cluster. Those events provide valuable information and are very useful when debugging an issue. We can access them as follows:
kubectl get events
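Watching the pod list works well interactively, but a script needs a command that returns once the pods are ready. The sketch below is one way to do that. It assumes the pods are named "<resource>-<index>-0" with indexes starting at 1, which matches a freshly created "mypostgres" cluster, and it uses an assumed timeout of 180 seconds per pod. The function name is hypothetical.

```shell
# Hypothetical helper: wait until every PostgreSql pod of a Kubegres resource is Ready.
# $1 = Kubegres resource name, $2 = number of replicas.
# Assumes pods are named <resource>-<index>-0 with indexes 1..replicas.
wait_for_postgres_pods() {
  name="$1"
  replicas="$2"
  i=1
  while [ "$i" -le "$replicas" ]; do
    # kubectl wait exits non-zero on timeout; it may also fail
    # if the pod has not been created yet.
    kubectl wait --for=condition=Ready "pod/${name}-${i}-0" -n default --timeout=180s || return 1
    i=$((i + 1))
  done
  # List events oldest first so the creation sequence is easy to follow.
  kubectl get events -n default --sort-by=.metadata.creationTimestamp
}
```

For this example the call would be `wait_for_postgres_pods mypostgres 3`. Because the operator creates the pods one after another, running it too early may fail on a pod that does not exist yet; in that case it can simply be run again.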
4) The created resources
Kubegres operator will then create a cluster of PostgreSql pods as follows:
Let's check the created resources:
kubectl get pod,statefulset,svc,configmap -o wide
Example of output:
NAME                 READY   STATUS    NODE
pod/mypostgres-1-0   1/1     Running   worker1
pod/mypostgres-2-0   1/1     Running   worker2
pod/mypostgres-3-0   1/1     Running   worker4

NAME                            READY
statefulset.apps/mypostgres-1   1/1
statefulset.apps/mypostgres-2   1/1
statefulset.apps/mypostgres-3   1/1

NAME                         TYPE
service/mypostgres           ClusterIP
service/mypostgres-replica   ClusterIP

NAME
configmap/base-kubegres-config
Kubegres has created 3 pods: "mypostgres-1-0", "mypostgres-2-0" and "mypostgres-3-0". Each of those pods runs a PostgreSql DB and is associated with a StatefulSet resource of the matching name (for example, the pod "mypostgres-1-0" belongs to the StatefulSet "mypostgres-1"). Each pod is deployed on a distinct node to ensure strong resiliency in the case of a node failure.
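To confirm that the pods really landed on different nodes, we can count the distinct node names instead of reading the table by eye. The helper below is a small sketch with a hypothetical function name; it takes the pod names as arguments.

```shell
# Hypothetical helper: print how many distinct nodes host the given pods.
# Arguments are pod names in the "default" namespace.
count_distinct_nodes() {
  kubectl get pod "$@" -n default --no-headers \
    -o custom-columns=NODE:.spec.nodeName | sort -u | wc -l | tr -d ' '
}
```

With the pods from the example output, `count_distinct_nodes mypostgres-1-0 mypostgres-2-0 mypostgres-3-0` should print the same number as there are pods when each one runs on its own node.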
It also created 2 Kubernetes ClusterIP services: "mypostgres" and "mypostgres-replica". These services give client apps access to the Primary and Replica instances respectively. They are described in more detail in section 5 below.
Kubegres created a base ConfigMap named "base-kubegres-config". It does so in each namespace where Kubegres resources are running. For more details about the base ConfigMap, please see the page Override the default configurations.
Kubegres uses predefined templates to create Kubernetes resources. Those templates are available on GitHub.
At this point, we have a resilient cluster of 3 PostgreSql databases. It is time to connect a client app to that PostgreSql cluster.
5) Connect client apps to PostgreSql
Based on the Kubegres YAML that we have created, a client app located inside the same Kubernetes cluster would use the following settings to connect to a PostgreSql database:
- host: mypostgres
- port: 5432
- username: postgres
- password: [value of mypostgres-secret/superUserPassword]
In this example, Kubegres created 2 Kubernetes ClusterIP services using the name defined in the YAML (e.g. "mypostgres"):
- a Kubernetes service "mypostgres", which gives access to the Primary PostgreSql instance
- a Kubernetes service "mypostgres-replica", which gives access to the Replica PostgreSql instances
Consequently, a client app running inside the Kubernetes cluster would use the hostname "mypostgres" to connect to the Primary PostgreSql for read and write requests. Optionally, it can also use the hostname "mypostgres-replica" to connect to any of the available Replica PostgreSql instances for read requests.
This approach allows client apps to access the PostgreSql instances without any knowledge of IP addresses.
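A quick way to try both hostnames without deploying an application is to run psql from a throwaway pod inside the cluster. The sketch below reads the super user password back from the Secret and passes it to a temporary client pod. The function names and the pod name "pg-client" are hypothetical, and passing the password with --env is a convenience for testing only, since the value is visible in the pod specification.

```shell
# Hypothetical helper: read the super user password back from the Secret of step 2.
postgres_password() {
  kubectl get secret mypostgres-secret -n default \
    -o jsonpath='{.data.superUserPassword}' | base64 -d
}

# Hypothetical helper: run one SQL statement from a temporary client pod.
# $1 = service hostname ("mypostgres" or "mypostgres-replica"), $2 = SQL text.
run_sql() {
  kubectl run pg-client --rm -i --restart=Never -n default \
    --image=postgres:13.2 --env="PGPASSWORD=$(postgres_password)" \
    -- psql -h "$1" -p 5432 -U postgres -c "$2"
}
```

For example, `run_sql mypostgres 'SELECT pg_is_in_recovery();'` queries the Primary and the same call with "mypostgres-replica" queries a Replica. In PostgreSql, pg_is_in_recovery() returns false on a primary and true on a streaming replica, which makes it a convenient check that each service points at the expected role.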
Please see the following diagram showing the 2 services that are created by Kubegres based on the YAML above:
If the Primary PostgreSql crashes, the automatic failover process will promote a Replica PostgreSql to Primary. This process will be transparent for client apps as long as they are using the service name (e.g. "mypostgres") as the hostname. More details about this topic are available in the doc Replication and failover.
By default, PostgreSql is accessible via port 5432. This can be changed by adding the property "port" to the Kubegres YAML above. All possible YAML properties are defined in the doc all properties explained.
Username and Password
By default, the only available user is "postgres", which is a super user. Consequently, for your custom databases it is recommended to create an additional user with limited permissions, for example with access restricted to a specific set of databases. It is possible to do so by creating:
- an environment variable which contains the password of your custom PostgreSql user, as explained here.
- a ConfigMap where we can override the bash script "primary_init_script.sh" as shown here. In that bash script, it is possible to execute any SQL queries to create custom database(s), user(s) and anything else required to initialise PostgreSql.
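As a rough illustration of what such an init script could run, the helper below prints the SQL for one custom database and one user whose privileges are limited to that database. It is a sketch under stated assumptions: the function name, the database name "my_app_db", the user name "my_username" and the environment variable POSTGRES_MY_DB_PASSWORD are all hypothetical placeholders, not names defined by Kubegres.

```shell
# Hypothetical helper: print the SQL that creates one database and one user
# with privileges on that database only.
# $1 = database name, $2 = user name, $3 = password (must not contain a single quote).
render_init_sql() {
  db="$1"
  user="$2"
  password="$3"
  cat <<EOF
CREATE DATABASE $db;
CREATE USER $user WITH PASSWORD '$password';
GRANT ALL PRIVILEGES ON DATABASE $db TO $user;
EOF
}
```

Inside the overridden script, the output could be piped to psql, for instance `render_init_sql my_app_db my_username "$POSTGRES_MY_DB_PASSWORD" | psql -v ON_ERROR_STOP=1 --username "$POSTGRES_USER" --dbname "$POSTGRES_DB"`, assuming those environment variables are set in the PostgreSql container.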
You can read about how Kubegres manages replication and failover.