Backup and Restore Using Helm Charts
This document describes how to back up and restore the data of a TiDB cluster in Kubernetes using Helm charts.
For TiDB Operator 1.1 or later versions, it is recommended that you use the backup and restoration methods based on CustomResourceDefinition (CRD).
- If the TiDB cluster version < v3.1, refer to the following documents:
- If the TiDB cluster version >= v3.1, refer to the following documents:
TiDB in Kubernetes supports two backup strategies using Helm charts:
- Full backup (scheduled or ad-hoc): use mydumper to take a logical backup of the TiDB cluster.
- Incremental backup: use TiDB Binlog to replicate data from the TiDB cluster to another database or execute a real-time backup of the data.
Currently, TiDB in Kubernetes only supports automatic restoration for full backups taken by mydumper. Restoring the incremental backup data by TiDB Binlog requires manual operations.
Full backup
Full backup uses mydumper to take a logical backup of a TiDB cluster. The backup task creates a PVC (PersistentVolumeClaim) to store data.
In the default configuration, the backup uses a PV (Persistent Volume) to store backup data. You can also store the data in Google Cloud Storage buckets, Ceph Object Storage, or Amazon S3 by changing the configuration. In this case, the backup data is temporarily stored in the PV before it is uploaded to object storage. For all available configuration options, refer to TiDB cluster backup configuration.
You can either set up a scheduled full backup job or take a full backup in an ad-hoc manner.
Scheduled full backup
Scheduled full backup is a task created alongside the TiDB cluster, and it runs periodically like crontab.
To configure a scheduled full backup, modify the scheduledBackup section in the values.yaml file of the TiDB cluster:

1. Set scheduledBackup.create to true.

2. Set scheduledBackup.storageClassName to the storageClass of the PV that stores the backup data.

3. Configure scheduledBackup.schedule in the Cron format to define the scheduling.

4. Create a Kubernetes Secret containing the username and password (the user must have the privileges to back up the data). Meanwhile, set scheduledBackup.secretName to the name of the created Secret (defaults to backup-secret):

    kubectl create secret generic backup-secret -n ${namespace} --from-literal=user=${user} --from-literal=password=${password}

5. Create a new TiDB cluster with the scheduled full backup task by running helm install, or enable the scheduled full backup for the existing cluster by running helm upgrade:

    helm upgrade ${release_name} pingcap/tidb-cluster -f values.yaml --version=${version}
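The scheduledBackup settings described above can be sketched as a small values fragment. The following is a minimal sketch, not the chart's shipped defaults: the file name scheduled-backup-values.yaml, the storage class local-storage, and the schedule are placeholder assumptions, and the nested layout is inferred from the dotted key names used in the steps.

```shell
#!/bin/sh
# Sketch: write a values fragment that enables the scheduled full backup.
# File name, storage class, and schedule are placeholders (assumptions).
set -eu

cat > ./scheduled-backup-values.yaml <<'EOF'
scheduledBackup:
  create: true
  # storageClass of the PV that stores the backup data (placeholder)
  storageClassName: local-storage
  # Cron format: every day at 02:00 (placeholder)
  schedule: "0 2 * * *"
  # Name of the Secret that holds the backup user and password
  secretName: backup-secret
EOF

echo "wrote ./scheduled-backup-values.yaml"
```

The fragment can be merged into the cluster's values.yaml by hand before running helm install or helm upgrade as in the last step.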
Ad-hoc full backup
Ad-hoc full backup is encapsulated in a Helm chart, pingcap/tidb-backup. Depending on the mode setting in the values.yaml file, this chart performs either a full backup or a data restoration. The restore section covers how to restore the backup data.
Follow the steps below to perform an ad-hoc full backup task:
1. Modify the values.yaml file:

    - Set clusterName to the target TiDB cluster name.
    - Set mode to backup.
    - Set storage.className to the storageClass of the PV that stores the backup data.
    - Adjust storage.size according to your database size.

2. Create a Kubernetes Secret containing the username and password (the user must have the privileges to back up the data). Meanwhile, set secretName in the values.yaml file to the name of the created Secret (defaults to backup-secret):

    kubectl create secret generic backup-secret -n ${namespace} --from-literal=user=${user} --from-literal=password=${password}

3. Run the following command to perform an ad-hoc backup task:

    helm install ${backup_name} pingcap/tidb-backup --namespace=${namespace} -f values.yaml --version=${version}
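As an illustration of the values.yaml changes for an ad-hoc backup, the sketch below generates a file from shell variables. The file name backup-values.yaml and the default values (demo, local-storage, 100Gi) are assumptions for illustration only; the key layout is inferred from the key names in the steps above.

```shell
#!/bin/sh
# Sketch: generate values for an ad-hoc full backup with pingcap/tidb-backup.
# All defaults below are placeholders (assumptions), not chart defaults.
set -eu

cluster_name="${cluster_name:-demo}"
storage_class="${storage_class:-local-storage}"
storage_size="${storage_size:-100Gi}"

cat > ./backup-values.yaml <<EOF
clusterName: ${cluster_name}
mode: backup
storage:
  className: ${storage_class}
  size: ${storage_size}
secretName: backup-secret
EOF

# Show the result so it can be reviewed before it is passed to helm with -f.
cat ./backup-values.yaml
```

Keeping the cluster name and storage settings in variables makes it easier to reuse the same script for several clusters.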
View backups
For backups stored in PV, you can view them by using the following command:
kubectl get pvc -n ${namespace} -l app.kubernetes.io/component=backup,pingcap.com/backup-cluster-name=${cluster_name}
If you store your backup data in Google Cloud Storage, Ceph Object Storage or Amazon S3, you can view the backups by using the GUI or CLI tools provided by these storage providers.
Restore
The pingcap/tidb-backup Helm chart helps restore a TiDB cluster using backup data. Follow the steps below to restore:
1. Modify the values.yaml file:

    - Set clusterName to the target TiDB cluster name.
    - Set mode to restore.
    - Set name to the name of the backup you want to restore (refer to view backups to view available backups). If the backup is stored in Google Cloud Storage, Ceph Object Storage, or Amazon S3, you must configure the corresponding sections and make sure that the configuration matches the one used when you performed the full backup.

2. Create a Kubernetes Secret containing the username and password (the user must have the privileges to restore the data). Meanwhile, set secretName in the values.yaml file to the name of the created Secret (defaults to backup-secret; skip this step if you already created the Secret when you performed the full backup):

    kubectl create secret generic backup-secret -n ${namespace} --from-literal=user=${user} --from-literal=password=${password}

3. Restore the backup:

    helm install ${restore_name} pingcap/tidb-backup --namespace=${namespace} -f values.yaml --version=${version}
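Because the restore must use the same storage configuration as the backup, one possible approach is to derive the restore values from the file used for the backup instead of writing them from scratch. The sketch below assumes a hypothetical helper named to_restore_values, sample file names, and a made-up backup name (fullbackup-example); none of these come from the chart.

```shell
#!/bin/sh
# Sketch: derive restore values from the values used for the backup.
set -eu

# to_restore_values BACKUP_VALUES_FILE BACKUP_NAME (hypothetical helper)
# Prints the backup values with mode switched to restore and the top-level
# name set to the backup to restore; all other settings are kept unchanged.
to_restore_values() {
  # Drop any existing top-level name, flip the mode, then append the name.
  sed -e '/^name:/d' -e 's/^mode:.*/mode: restore/' "$1"
  printf 'name: %s\n' "$2"
}

# Example input (placeholder values) standing in for the backup values file.
cat > ./sample-backup-values.yaml <<'EOF'
clusterName: demo
mode: backup
storage:
  className: local-storage
  size: 100Gi
secretName: backup-secret
EOF

to_restore_values ./sample-backup-values.yaml fullbackup-example > ./restore-values.yaml
cat ./restore-values.yaml
```

Only the mode and name lines differ between the two files, so any object storage sections present in the backup values carry over unchanged.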
Incremental backup
Incremental backup uses TiDB Binlog to collect binlog data from TiDB and provide near real-time backup and replication to downstream platforms.
For a detailed guide to maintaining TiDB Binlog in Kubernetes, refer to TiDB Binlog.
Scale in Pump
To scale in Pump, take each Pump node offline and then run the helm upgrade command to delete the corresponding Pump Pod.
1. Make a Pump node offline from the TiDB cluster.

    Suppose there are 3 Pump nodes and you want to take the third node offline. Set ${ordinal_id} to 2 and run the following command (${version} is the current version of TiDB):

        kubectl run offline-pump-${ordinal_id} --image=pingcap/tidb-binlog:${version} --namespace=${namespace} --restart=OnFailure -- /binlogctl -pd-urls=http://${release_name}-pd:2379 -cmd offline-pump -node-id ${release_name}-pump-${ordinal_id}:8250

    Then, check the log output of Pump. If Pump outputs "pump offline, please delete my pod", the state of the Pump node has been successfully switched to offline.

        kubectl logs -f -n ${namespace} ${release_name}-pump-${ordinal_id}

2. Delete the corresponding Pump Pod.

    Modify binlog.pump.replicas in the values.yaml file to 2 and then run the following command to delete the Pump Pod:

        helm upgrade ${release_name} pingcap/tidb-cluster -f values.yaml --version=${chart_version}
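Checking the Pump log for the offline message can be scripted instead of watched by eye. The following is a minimal sketch: pump_is_offline is a hypothetical helper name, and the canned log lines in the self-check are made up for illustration. It relies only on grep matching the message quoted above as a fixed string.

```shell
#!/bin/sh
# Sketch: detect the Pump offline message in a stream of log lines.
set -eu

# pump_is_offline (hypothetical helper) reads log lines on stdin and succeeds
# once the offline message appears; it fails if the input ends without it.
pump_is_offline() {
  grep -q -F 'pump offline, please delete my pod'
}

# Intended use against a live cluster (not executed here):
#   kubectl logs -f -n "${namespace}" "${release_name}-pump-${ordinal_id}" | pump_is_offline

# Self-check with canned log lines (made up for illustration).
if printf '%s\n' 'starting' 'pump offline, please delete my pod' | pump_is_offline; then
  echo "offline marker found"
fi
```

Once the helper succeeds, it should be safe to proceed to the helm upgrade step that lowers binlog.pump.replicas.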