With the Kubegres operator, we are going to create a cluster of 3 PostgreSql instances that replicate their data in real time.
1) Install Kubegres operator
The installation needs to be done once. Run the following command in a Kubernetes cluster:
kubectl apply -f https://raw.githubusercontent.com/reactive-tech/kubegres/v1.12/kubegres.yaml
During the installation, "Kubegres" creates the namespace "kubegres-system" where it installs its components. Check their status as follows:
kubectl get all -n kubegres-system
We should see the controller "kubegres-controller-manager" in a running state:
NAME                                              STATUS
pod/kubegres-controller-manager-999786dd6-74tmb   Running

NAME                                                  TYPE
service/kubegres-controller-manager-metrics-service   ClusterIP

NAME                                          READY
deployment.apps/kubegres-controller-manager   1/1
Once it is running, we can check the controller's logs, as follows:
kubectl logs pod/kubegres-controller-manager-999786dd6-74tmb -c manager -n kubegres-system -f
2) Check storage class
Kubegres requires a storage class in order to create a PV (Persistent Volume) and a PVC (Persistent Volume Claim) for each instance of PostgreSql. Please run the following command to check that a storage class exists in the Kubernetes cluster:
kubectl get sc
For example, using Kind as a local Kubernetes cluster, the above command outputs:
NAME                 PROVISIONER             RECLAIMPOLICY
standard (default)   rancher.io/local-path   Delete
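The check above can also be scripted. The following is a minimal sketch, assuming the "(default)" marker appears next to the default class name as in the Kind output shown above; the function name has_default_storage_class is hypothetical and is not part of Kubegres or kubectl.

```shell
# Reads the output of `kubectl get sc` on stdin.
# Succeeds (exit code 0) if one of the storage classes is marked "(default)".
has_default_storage_class() {
  grep -q '(default)'
}

# Usage against a live cluster (requires kubectl access):
#   kubectl get sc | has_default_storage_class && echo "default storage class found"
```

If the function fails, either no storage class exists or none is marked as the default one. In the second case the field "storageClassName" would have to be set explicitly in the Kubegres YAML.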
3) Create a Secret resource
Before creating a cluster of PostgreSql, we need to create a Secret resource in order to store the passwords of the PostgreSql super user and of the replication user.
Create a file named my-postgres-secret.yaml.
Add the following contents:
apiVersion: v1
kind: Secret
metadata:
  name: mypostgres-secret
  namespace: default
type: Opaque
stringData:
  superUserPassword: postgresSuperUserPsw
  replicationUserPassword: postgresReplicaPsw
Apply the changes:
kubectl apply -f my-postgres-secret.yaml
Note that in your Secret YAML, you can use any name you like for the keys "superUserPassword" and "replicationUserPassword". The password values in the YAML are examples only.
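To illustrate the note about key names, here is a sketch of the same Secret written with different keys. The key names "adminPsw" and "replicaPsw" are hypothetical and chosen only for this example.

```shell
# Write a variant of the Secret that uses custom key names.
# "adminPsw" and "replicaPsw" are illustrative names, not Kubegres defaults.
cat > my-postgres-secret.yaml <<'EOF'
apiVersion: v1
kind: Secret
metadata:
  name: mypostgres-secret
  namespace: default
type: Opaque
stringData:
  adminPsw: postgresSuperUserPsw
  replicaPsw: postgresReplicaPsw
EOF

# Any resource reading this Secret must then reference the same key names, e.g.:
#   secretKeyRef:
#     name: mypostgres-secret
#     key: adminPsw
```

The file would then be applied with "kubectl apply -f my-postgres-secret.yaml" as before.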
4) Create a cluster of PostgreSql instances
To create a cluster of PostgreSql, we need to create a Kubegres resource. The Kubegres YAML below contains the minimum configuration required to set up:
- a cluster of PostgreSql pods with an official PostgreSql Docker container (version 12.4 or higher)
- the property "replicas: 3" means that Kubegres will create 1 Primary PostgreSql pod and 2 Replica PostgreSql pods
- the data will be replicated in real time from the Primary PostgreSql pod to the 2 Replica PostgreSql pods
Create a file named my-postgres.yaml.
Add the following contents:
apiVersion: kubegres.reactive-tech.io/v1
kind: Kubegres
metadata:
  name: mypostgres
  namespace: default
spec:
  replicas: 3
  image: postgres:13.2
  database:
    size: 200Mi
  env:
    - name: POSTGRES_PASSWORD
      valueFrom:
        secretKeyRef:
          name: mypostgres-secret
          key: superUserPassword
    - name: POSTGRES_REPLICATION_PASSWORD
      valueFrom:
        secretKeyRef:
          name: mypostgres-secret
          key: replicationUserPassword
In the YAML above, under the field "spec.database" we only specified the field "size". But you can also specify the fields "storageClassName" and "volumeMount". There are more details about those fields in the page All properties explained.
If you do not specify the field "storageClassName", Kubegres finds the default storage class of the Kubernetes cluster and assigns it. But you can also assign a value for that field manually. Once the storageClass is assigned, Kubernetes automatically provisions a PV and a PVC for each Postgres Pod.
The YAML above contains the minimum configuration required to deploy a cluster of PostgreSql. More configuration options are available, such as adding annotations, setting the name of a storageClass, specifying a private repo from which to retrieve the container's image, and enabling backup. Please see the full list of options in the page All properties explained.
Apply the changes:
kubectl apply -f my-postgres.yaml
A cluster of 3 PostgreSql instances is now being deployed in Kubernetes. Each PostgreSql instance is a pod. Check their status until they are "Running":
kubectl get pods -o wide -w
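Instead of watching the output manually, the check can be scripted. The following is a rough sketch, assuming that STATUS is the third column of "kubectl get pods --no-headers" (the default column layout); the function name all_pods_running is hypothetical.

```shell
# Reads the output of `kubectl get pods --no-headers` on stdin.
# Succeeds only if at least one pod is listed and every pod has STATUS "Running".
all_pods_running() {
  awk '$3 != "Running" { bad = 1 } END { exit (NR == 0 || bad) ? 1 : 0 }'
}

# Usage against a live cluster (requires kubectl access):
#   until kubectl get pods --no-headers | all_pods_running; do sleep 5; done
```

Note that this considers every pod in the current namespace, not only those created by Kubegres. The built-in "kubectl wait --for=condition=Ready pods --all" is an alternative that may fit better in scripts.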
Kubegres logs events about the state of each PostgreSql cluster. Those events provide valuable information and are very useful for debugging issues. We can access them as follows:
kubectl get events
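As mentioned earlier in this section, the field "storageClassName" can be set manually under "spec.database". Below is a sketch of the same Kubegres resource with that field added. The value "standard" is an assumption taken from the Kind example in step 2; use a class name reported by "kubectl get sc" in your own cluster.

```shell
# Write a variant of my-postgres.yaml that pins the storage class explicitly.
cat > my-postgres.yaml <<'EOF'
apiVersion: kubegres.reactive-tech.io/v1
kind: Kubegres
metadata:
  name: mypostgres
  namespace: default
spec:
  replicas: 3
  image: postgres:13.2
  database:
    size: 200Mi
    # Assumed value: the class listed by the Kind example; yours may differ.
    storageClassName: standard
  env:
    - name: POSTGRES_PASSWORD
      valueFrom:
        secretKeyRef:
          name: mypostgres-secret
          key: superUserPassword
    - name: POSTGRES_REPLICATION_PASSWORD
      valueFrom:
        secretKeyRef:
          name: mypostgres-secret
          key: replicationUserPassword
EOF
```

The file would be applied with "kubectl apply -f my-postgres.yaml", exactly like the minimal version.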
5) The created resources
The Kubegres operator will then create a cluster of PostgreSql pods as follows:
Let's check the created resources:
kubectl get pod,statefulset,svc,configmap,pv,pvc -o wide
Example of output:
NAME                 READY   STATUS    NODE
pod/mypostgres-1-0   1/1     Running   worker1
pod/mypostgres-2-0   1/1     Running   worker2
pod/mypostgres-3-0   1/1     Running   worker4

NAME                            READY
statefulset.apps/mypostgres-1   1/1
statefulset.apps/mypostgres-2   1/1
statefulset.apps/mypostgres-3   1/1

NAME                         TYPE
service/mypostgres           ClusterIP
service/mypostgres-replica   ClusterIP

NAME
configmap/base-kubegres-config

NAME                          CAPACITY
persistentvolume/pvc-838...   200Mi
persistentvolume/pvc-da6...   200Mi
persistentvolume/pvc-e25...   200Mi

NAME                                               CAPACITY
persistentvolumeclaim/postgres-db-mypostgres-1-0   200Mi
persistentvolumeclaim/postgres-db-mypostgres-2-0   200Mi
persistentvolumeclaim/postgres-db-mypostgres-3-0   200Mi
Kubegres has created 3 pods: "mypostgres-1-0", "mypostgres-2-0" and "mypostgres-3-0". Each of those pods runs a PostgreSql database and is associated with a StatefulSet resource of a matching name (for example, the pod "mypostgres-1-0" belongs to the StatefulSet "mypostgres-1"). Each pod is deployed on a distinct node to ensure strong resiliency in the case of a node failure.
Moreover, it created 2 Kubernetes ClusterIP services: "mypostgres" and "mypostgres-replica". Those services give client apps access to the Primary and Replica instances respectively. There are more details about them in the next section below.
Kubegres created a base ConfigMap named "base-kubegres-config". It does that in each namespace where Kubegres resources are running. For more details about the base ConfigMap, please see the page Override the default configurations.
And finally, Kubegres provisioned one PV and one PVC for each Postgres Pod using the StorageClass.
Kubegres uses predefined templates to create Kubernetes resources. Those templates are available on GitHub.
From that point, we have a resilient cluster of 3 PostgreSql databases. It is time to connect a client app to that PostgreSql cluster.
6) Connect client apps to PostgreSql
Based on the Kubegres YAML that we have created, a client app located inside the same Kubernetes cluster would use the following configurations to connect to a PostgreSql database:
- host: mypostgres
- port: 5432
- username: postgres
- password: [value of mypostgres-secret/superUserPassword]
In this example, Kubegres created 2 Kubernetes Headless services (of default type ClusterIP) using the name defined in the YAML (e.g. "mypostgres"):
- a Kubernetes service "mypostgres" giving access to the Primary PostgreSql instance
- a Kubernetes service "mypostgres-replica" giving access to the Replica PostgreSql instances
Consequently, a client app running inside a Kubernetes cluster would use the hostname "mypostgres" to connect to the Primary PostgreSql for read and write requests. Optionally, it can also use the hostname "mypostgres-replica" to connect to any of the available Replica PostgreSql instances for read requests.
This approach allows the client apps to access the PostgreSql instances without any knowledge of IP addresses.
Note that because Kubegres creates Headless services and configures them with selectors, no cluster IP addresses are allocated and the DNS resolves directly to the IPs of the Postgres Pods.
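The connection settings above can be combined into a connection URI. The helper below is only a sketch: the function name kubegres_uri is hypothetical, and the database name "postgres" is an assumption based on the default database of the official PostgreSql image when no other database is configured.

```shell
# Print a PostgreSql connection URI for a Kubegres cluster.
# Arguments: <cluster name> <primary|replica> [port, defaults to 5432]
# The password is deliberately left out of the URI; clients such as psql
# can read it from the PGPASSWORD environment variable instead.
kubegres_uri() {
  case "$2" in
    primary) echo "postgresql://postgres@$1:${3:-5432}/postgres" ;;
    replica) echo "postgresql://postgres@$1-replica:${3:-5432}/postgres" ;;
    *) echo "role must be primary or replica" >&2; return 1 ;;
  esac
}

# Usage from a pod inside the same cluster and namespace (requires psql):
#   PGPASSWORD="<value of mypostgres-secret/superUserPassword>" \
#     psql "$(kubegres_uri mypostgres primary)"
```

Read-only workloads could pass "replica" instead of "primary" to target the "mypostgres-replica" service.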
Please see the following diagram showing the 2 services that are created by Kubegres based on the YAML above:
If the Primary PostgreSql crashes, the automatic failover process will promote a Replica PostgreSql to Primary. This process will be transparent for client apps as long as they are using the service name (e.g. "mypostgres") as the hostname. More details about this topic are in the doc Replication and failover.
By default, PostgreSql is accessible via the port 5432. It is possible to modify this by adding the property "port" in the YAML above. All possible YAML properties are defined in the doc All properties explained.
Username and Password
By default, the only available user is "postgres", which is a super user. Consequently, for your custom databases it is recommended to create an additional user with limited permissions, for example with access restricted to a specific set of database(s). It is possible to do so by creating:
- an environment variable which contains the password of your custom PostgreSql user, as explained in the Kubegres documentation.
- a ConfigMap where we can override the bash script "primary_init_script.sh", as shown in the Kubegres documentation. In that bash script, it is possible to execute any SQL queries to create custom database(s), user(s) and anything else required to initialise PostgreSql.
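As an illustration of what such an init script might contain, here is a rough sketch. It is not the script shipped with Kubegres: the function name build_init_sql, the database name "my_app_db", the user name "my_username" and the environment variable POSTGRES_MY_DB_PASSWORD are all hypothetical. POSTGRES_USER and POSTGRES_DB are variables provided by the official PostgreSql image.

```shell
# Print the SQL that creates a database and a user with access to it.
# Arguments: <database name> <user name> <password>
# The values are inserted as-is, so they must not contain quotes.
build_init_sql() {
  cat <<EOSQL
CREATE DATABASE $1;
CREATE USER $2 WITH PASSWORD '$3';
GRANT ALL PRIVILEGES ON DATABASE $1 TO $2;
EOSQL
}

# Inside the Primary container, the script would pipe that SQL into psql:
#   build_init_sql my_app_db my_username "$POSTGRES_MY_DB_PASSWORD" \
#     | psql -v ON_ERROR_STOP=1 --username "$POSTGRES_USER" --dbname "$POSTGRES_DB"
```

Keeping the SQL generation in its own function makes it possible to review the statements before they are run against the database.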
7) Delete a cluster of Postgres
A cluster of Postgres can be deleted with the command:
kubectl delete kubegres [unique name]
For example:
kubectl delete kubegres mypostgres
The above will delete all resources created for the cluster of Postgres identified by the name 'mypostgres', such as Pods, StatefulSets and Services. The only resources that you have to remove manually are the PV and PVC: there is one PV and one PVC created for each Pod. Kubegres does not remove those resources because they contain the database data. In a future version of Kubegres it will be possible to set an option in the YAML so that they are automatically removed too.
Note that the command above will not delete other clusters of Postgres created with Kubegres. You have to run the command above with the name of the Kubegres resource for each cluster of Postgres to delete.
The above command does not delete the Kubegres controller/operator from your cluster. If you would like to delete it, you can run:
kubectl delete -f https://raw.githubusercontent.com/reactive-tech/kubegres/v1.12/kubegres.yaml
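As noted earlier in this section, the PV and PVC of a deleted cluster have to be removed manually. The sketch below selects the PVCs of one cluster, assuming they follow the naming seen in step 5 ("postgres-db-[cluster name]-[number]-0"). The function name kubegres_pvcs is hypothetical.

```shell
# Reads the output of `kubectl get pvc -o name` on stdin and keeps only
# the PVCs named postgres-db-<cluster name>-<number>-0.
# Argument: <cluster name>
kubegres_pvcs() {
  grep -E "^persistentvolumeclaim/postgres-db-$1-[0-9]+-0\$"
}

# Usage against a live cluster (requires kubectl access).
# Review the list first, because deleting a PVC can destroy the database data:
#   kubectl get pvc -o name | kubegres_pvcs mypostgres
#   kubectl get pvc -o name | kubegres_pvcs mypostgres | xargs kubectl delete
```

Whether the matching PV is removed together with its PVC depends on the reclaim policy of the storage class ("Delete" in the Kind example of step 2).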
You can read about all properties that you can use in a YAML of "kind: Kubegres" in the page: All properties explained.
You can read about how Kubegres manages replication and failover.