Restore a TiDB Cluster across Multiple Kubernetes from EBS Volume Snapshots

This document describes how to restore backup data in AWS EBS snapshots to a TiDB cluster across multiple Kubernetes clusters.

The restore method described in this document is implemented based on CustomResourceDefinition (CRD) in BR Federation and TiDB Operator. BR (Backup & Restore) is a command-line tool for distributed backup and recovery of TiDB cluster data. For the underlying implementation, BR is used to restore the data.

Limitations

  • Snapshot restore is applicable to TiDB Operator v1.5.2 or later versions and TiDB v6.5.8 or later versions.
  • You can use snapshot restore only to restore data to a cluster with the same number of TiKV nodes and the same TiKV volume configuration as the backup cluster. That is, the number of TiKV nodes and the volume configuration of each TiKV node must be identical between the restore cluster and the backup cluster.
  • Snapshot restore is currently not supported for TiFlash, TiCDC, DM, and TiDB Binlog nodes.

Prerequisites

Before restoring a TiDB cluster across multiple Kubernetes clusters from EBS volume snapshots, you need to complete the following preparations.

If you choose fsr as the warmup strategy, you need to grant the ec2:EnableFastSnapshotRestores, ec2:DisableFastSnapshotRestores, ec2:DescribeFastSnapshotRestores, and cloudwatch:GetMetricStatistics permissions to the IAM role. You also need to increase the EBS service quota for fast snapshot restore to at least the number of TiKV nodes.
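
For reference, the following is a minimal IAM policy statement that grants these permissions. This is a sketch: the wildcard Resource is an assumption made for brevity; scope it down according to your security requirements.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:EnableFastSnapshotRestores",
        "ec2:DisableFastSnapshotRestores",
        "ec2:DescribeFastSnapshotRestores",
        "cloudwatch:GetMetricStatistics"
      ],
      "Resource": "*"
    }
  ]
}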

Restore process

Step 1. Set up the environment for EBS volume snapshot restore in every data plane

You must execute the following steps in every data plane.

  1. Download the backup-rbac.yaml file to the restore server.
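
    For example, you can fetch the file from the TiDB Operator repository. This is a sketch: the URL assumes the file is published under manifests/backup in the pingcap/tidb-operator repository; replace the tag with the TiDB Operator version you use.

    wget https://raw.githubusercontent.com/pingcap/tidb-operator/v1.5.2/manifests/backup/backup-rbac.yaml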

  2. Create the RBAC-related resources required for the restore by running the following command. Note that these resources must be created in the same ${namespace} as the TiDB cluster.

    kubectl apply -f backup-rbac.yaml -n ${namespace}
  3. Grant permissions to access remote storage.

    To restore data from EBS snapshots, you need to grant permissions to access the remote storage. Refer to AWS account authorization for the three available methods.
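
    For example, if you use the AK/SK method, you can create the s3-secret that the VolumeRestore CR references in the next step. This is a sketch; replace the placeholders with your own credentials:

    kubectl create secret generic s3-secret --from-literal=access_key=${access_key} --from-literal=secret_key=${secret_key} -n ${namespace}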

Step 2. Restore data to the TiDB cluster

You must execute the following steps in the control plane.

Depending on the authorization method you chose in the previous step for granting remote storage access, restore data to TiDB using the corresponding method below:

  • AK/SK
  • IAM role with Pod
  • IAM role with ServiceAccount

If you grant permissions by using accessKey and secretKey, you can create the VolumeRestore CR as follows:

kubectl apply -f restore-fed.yaml

The restore-fed.yaml file has the following content:

---
apiVersion: federation.pingcap.com/v1alpha1
kind: VolumeRestore
metadata:
  name: ${restore-name}
spec:
  clusters:
  - k8sClusterName: ${k8s-name1}
    tcName: ${tc-name1}
    tcNamespace: ${tc-namespace1}
    backup:
      s3:
        provider: aws
        secretName: s3-secret
        region: ${region-name}
        bucket: ${bucket-name}
        prefix: ${backup-path1}
  - k8sClusterName: ${k8s-name2}
    tcName: ${tc-name2}
    tcNamespace: ${tc-namespace2}
    backup:
      s3:
        provider: aws
        secretName: s3-secret
        region: ${region-name}
        bucket: ${bucket-name}
        prefix: ${backup-path2}
  - ... # other clusters
  template:
    br:
      sendCredToTikv: true
      options:
      - --volume-type=gp3
      - --volume-iops=7000
      - --volume-throughput=400
    toolImage: ${br-image}
    warmup: sync
    warmupImage: ${warmup-image}
    warmupStrategy: fio

If you grant permissions by associating an IAM role with the Pod, you can create the VolumeRestore CR as follows:

kubectl apply -f restore-fed.yaml

The restore-fed.yaml file has the following content:

---
apiVersion: federation.pingcap.com/v1alpha1
kind: VolumeRestore
metadata:
  name: ${restore-name}
  annotations:
    iam.amazonaws.com/role: arn:aws:iam::123456789012:role/role-name
spec:
  clusters:
  - k8sClusterName: ${k8s-name1}
    tcName: ${tc-name1}
    tcNamespace: ${tc-namespace1}
    backup:
      s3:
        provider: aws
        region: ${region-name}
        bucket: ${bucket-name}
        prefix: ${backup-path1}
  - k8sClusterName: ${k8s-name2}
    tcName: ${tc-name2}
    tcNamespace: ${tc-namespace2}
    backup:
      s3:
        provider: aws
        region: ${region-name}
        bucket: ${bucket-name}
        prefix: ${backup-path2}
  - ... # other clusters
  template:
    br:
      sendCredToTikv: false
      options:
      - --volume-type=gp3
      - --volume-iops=7000
      - --volume-throughput=400
    toolImage: ${br-image}
    warmup: sync
    warmupImage: ${warmup-image}
    warmupStrategy: fsr

If you grant permissions by associating an IAM role with the ServiceAccount, you can create the VolumeRestore CR as follows:

kubectl apply -f restore-fed.yaml

The restore-fed.yaml file has the following content:

---
apiVersion: federation.pingcap.com/v1alpha1
kind: VolumeRestore
metadata:
  name: ${restore-name}
spec:
  clusters:
  - k8sClusterName: ${k8s-name1}
    tcName: ${tc-name1}
    tcNamespace: ${tc-namespace1}
    backup:
      s3:
        provider: aws
        region: ${region-name}
        bucket: ${bucket-name}
        prefix: ${backup-path1}
  - k8sClusterName: ${k8s-name2}
    tcName: ${tc-name2}
    tcNamespace: ${tc-namespace2}
    backup:
      s3:
        provider: aws
        region: ${region-name}
        bucket: ${bucket-name}
        prefix: ${backup-path2}
  - ... # other clusters
  template:
    br:
      sendCredToTikv: false
      options:
      - --volume-type=gp3
      - --volume-iops=7000
      - --volume-throughput=400
    toolImage: ${br-image}
    serviceAccount: tidb-backup-manager
    warmup: sync
    warmupImage: ${warmup-image}
    warmupStrategy: hybrid
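
Note that the three examples above use different warmupStrategy values (fio, fsr, and hybrid) only for illustration; the warmup strategy is independent of the authorization method. If you choose fsr, make sure you have completed the FSR-related prerequisites described earlier in this document.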

Step 3. View the restore status

After you create the VolumeRestore CR, the restore process starts automatically.

To check the restore status, use the following command:

kubectl get vrt -n ${namespace} -o wide
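
To view the detailed progress and events of a specific restore, you can also describe the CR. This relies only on standard kubectl behavior; ${restore-name} is the name you set in the VolumeRestore CR:

kubectl describe vrt ${restore-name} -n ${namespace}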
