Restore Data from S3-Compatible Storage Using BR

This document describes how to restore the TiDB cluster data backed up using TiDB Operator in Kubernetes. BR is used to perform the restoration.

The restoration method described in this document is implemented based on Custom Resource Definition (CRD) in TiDB Operator v1.1 or later versions.

This document shows an example in which the backup data stored in a specified path on Amazon S3 is restored to the TiDB cluster.

Three methods to grant AWS account permissions

Refer to Back up Data to Amazon S3 using BR.

Prerequisites

Before you restore data from Amazon S3 storage, you need to grant AWS account permissions. This section describes three methods to grant AWS account permissions.

Grant permissions by importing AccessKey and SecretKey

  1. Download backup-rbac.yaml, and execute the following command to create the role-based access control (RBAC) resources in the test2 namespace:

    kubectl apply -f backup-rbac.yaml -n test2
  2. Create the s3-secret secret which stores the credential used to access the S3-compatible storage:

    kubectl create secret generic s3-secret --from-literal=access_key=xxx --from-literal=secret_key=yyy --namespace=test2
  3. Create the restore-demo2-tidb-secret secret which stores the account and password needed to access the TiDB cluster:

    kubectl create secret generic restore-demo2-tidb-secret --from-literal=password=${password} --namespace=test2

Grant permissions by associating IAM with Pod

  1. Download backup-rbac.yaml, and execute the following command to create the role-based access control (RBAC) resources in the test2 namespace:

    kubectl apply -f backup-rbac.yaml -n test2
  2. Create the restore-demo2-tidb-secret secret which stores the account and password needed to access the TiDB cluster:

    kubectl create secret generic restore-demo2-tidb-secret --from-literal=password=${password} --namespace=test2
  3. Create the IAM role:

    • To create an IAM role for the account, refer to Create an IAM User.
    • Give the required permission to the IAM role you have created (refer to Access Management for details). Because the restoration needs to access the Amazon S3 storage, the IAM role here is given the AmazonS3FullAccess permission.
  4. Associate IAM with TiKV Pod:

    • In the restoration process using BR, both the TiKV Pod and the BR Pod need to perform read and write operations on the S3 storage. Therefore, you need to add the annotation to the TiKV Pod to associate the Pod with the IAM role:

      kubectl edit tc demo2 -n test2
    • Find spec.tikv.annotations, append the iam.amazonaws.com/role: arn:aws:iam::123456789012:role/user annotation, and then exit the editor. After the TiKV Pod is restarted, check whether the annotation is added to the TiKV Pod.

    Note:

    arn:aws:iam::123456789012:role/user is the IAM role created in Step 3.
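
After the edit, the annotation sits under spec.tikv.annotations of the TidbCluster. The following is a minimal sketch of the relevant fragment, using the example role ARN from this document:

```yaml
# Fragment of the TidbCluster (tc) spec edited via `kubectl edit tc demo2 -n test2`.
# The role ARN below is the example value used in this document; replace it
# with the ARN of the IAM role you created.
spec:
  tikv:
    annotations:
      iam.amazonaws.com/role: arn:aws:iam::123456789012:role/user
```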

Grant permissions by associating IAM with ServiceAccount

  1. Download backup-rbac.yaml, and execute the following command to create the role-based access control (RBAC) resources in the test2 namespace:

    kubectl apply -f backup-rbac.yaml -n test2
  2. Create the restore-demo2-tidb-secret secret which stores the account and password needed to access the TiDB cluster:

    kubectl create secret generic restore-demo2-tidb-secret --from-literal=password=${password} --namespace=test2
  3. Enable the IAM role for the service account on the cluster. For details, refer to the AWS documentation on enabling IAM roles for service accounts.

  4. Create the IAM role:

    • Create an IAM role, give the AmazonS3FullAccess permission to the role, and modify the Trust relationships of the role. For details, refer to Creating an IAM Role and Policy.
  5. Associate IAM with the ServiceAccount resources:

    kubectl annotate sa tidb-backup-manager eks.amazonaws.com/role-arn=arn:aws:iam::123456789012:role/user --namespace=test2
  6. Bind ServiceAccount to TiKV Pod:

    kubectl edit tc demo2 -n test2

    Modify the value of spec.tikv.serviceAccount to tidb-backup-manager. After the TiKV Pod is restarted, check whether the serviceAccountName of the TiKV Pod has changed.

    Note:

    arn:aws:iam::123456789012:role/user is the IAM role created in Step 4.
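
After the edit in Step 6, the relevant fragment of the TidbCluster spec looks roughly like this sketch:

```yaml
# Fragment of the TidbCluster (tc) spec edited via `kubectl edit tc demo2 -n test2`.
# TiKV Pods run under this ServiceAccount, which carries the IAM role annotation
# added in Step 5.
spec:
  tikv:
    serviceAccount: tidb-backup-manager
```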

Required database account privileges

  • The SELECT and UPDATE privileges of the mysql.tidb table: Before and after the restoration, the Restore CR needs a database account with these privileges to adjust the GC time.

Restoration process

  • If you grant permissions by importing AccessKey and SecretKey, create the Restore CR, and restore cluster data as described below:

    kubectl apply -f restore-aws-s3.yaml

    The content of restore-aws-s3.yaml is as follows:

    ---
    apiVersion: pingcap.com/v1alpha1
    kind: Restore
    metadata:
      name: demo2-restore-s3
      namespace: test2
    spec:
      br:
        cluster: demo2
        clusterNamespace: test2
        # logLevel: info
        # statusAddr: ${status_addr}
        # concurrency: 4
        # rateLimit: 0
        # timeAgo: ${time}
        # checksum: true
        # sendCredToTikv: true
      to:
        host: ${tidb_host}
        port: ${tidb_port}
        user: ${tidb_user}
        secretName: restore-demo2-tidb-secret
      s3:
        provider: aws
        secretName: s3-secret
        region: us-west-1
        bucket: my-bucket
        prefix: my-folder
  • If you grant permissions by associating IAM with Pod, create the Restore CR, and restore cluster data as described below:

    kubectl apply -f restore-aws-s3.yaml

    The content of restore-aws-s3.yaml is as follows:

    ---
    apiVersion: pingcap.com/v1alpha1
    kind: Restore
    metadata:
      name: demo2-restore-s3
      namespace: test2
      annotations:
        iam.amazonaws.com/role: arn:aws:iam::123456789012:role/user
    spec:
      br:
        cluster: demo2
        sendCredToTikv: false
        clusterNamespace: test2
        # logLevel: info
        # statusAddr: ${status_addr}
        # concurrency: 4
        # rateLimit: 0
        # timeAgo: ${time}
        # checksum: true
      to:
        host: ${tidb_host}
        port: ${tidb_port}
        user: ${tidb_user}
        secretName: restore-demo2-tidb-secret
      s3:
        provider: aws
        region: us-west-1
        bucket: my-bucket
        prefix: my-folder
  • If you grant permissions by associating IAM with ServiceAccount, create the Restore CR, and restore cluster data as described below:

    kubectl apply -f restore-aws-s3.yaml

    The content of restore-aws-s3.yaml is as follows:

    ---
    apiVersion: pingcap.com/v1alpha1
    kind: Restore
    metadata:
      name: demo2-restore-s3
      namespace: test2
    spec:
      serviceAccount: tidb-backup-manager
      br:
        cluster: demo2
        sendCredToTikv: false
        clusterNamespace: test2
        # logLevel: info
        # statusAddr: ${status_addr}
        # concurrency: 4
        # rateLimit: 0
        # timeAgo: ${time}
        # checksum: true
      to:
        host: ${tidb_host}
        port: ${tidb_port}
        user: ${tidb_user}
        secretName: restore-demo2-tidb-secret
      s3:
        provider: aws
        region: us-west-1
        bucket: my-bucket
        prefix: my-folder

After creating the Restore CR, execute the following command to check the restoration status:

kubectl get rt -n test2 -o wide
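
If you want to block until the restoration finishes instead of polling manually, kubectl wait can watch the Restore CR's conditions. The following sketch assumes the Restore CR reports a Complete condition; the names match the examples above:

```shell
# Wait until the Restore CR reports the Complete condition, or give up
# after 30 minutes. Adjust the timeout to the size of your backup data.
kubectl wait --for=condition=Complete restore/demo2-restore-s3 \
  -n test2 --timeout=30m
```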

More Restore CR fields are described as follows:

  • .metadata.namespace: the namespace where the Restore CR is located.

  • .spec.to.host: the address of the TiDB cluster to be restored.

  • .spec.to.port: the port of the TiDB cluster to be restored.

  • .spec.to.user: the accessing user of the TiDB cluster to be restored.

  • .spec.to.secretName: the secret that stores the password of the user accessing the TiDB cluster to be restored.

  • .spec.to.tlsClientSecretName: the secret of the certificate used during the restoration.

    If TLS is enabled for the TiDB cluster, but you do not want to restore data using the ${cluster_name}-cluster-client-secret as described in Enable TLS between TiDB Components, you can use the .spec.to.tlsClientSecretName parameter to specify a secret for the restoration. To generate the secret, run the following command:

    kubectl create secret generic ${secret_name} --namespace=${namespace} --from-file=tls.crt=${cert_path} --from-file=tls.key=${key_path} --from-file=ca.crt=${ca_path}
  • .spec.tableFilter: BR only restores tables that match the table filter rules. This field can be ignored by default. If the field is not configured, BR restores all schemas except the system schemas.

    Note:

    To use the table filter to exclude db.table, you need to add the *.* rule to include all tables first. For example:

    tableFilter:
    - "*.*"
    - "!db.table"
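
Putting the note above into the Restore CR, the filter rules sit directly under spec. The following is a sketch based on the examples in this document:

```yaml
# Hypothetical fragment: the same Restore CR as above, restoring everything
# except db.table. The br/to/s3 sections are unchanged and omitted here.
apiVersion: pingcap.com/v1alpha1
kind: Restore
metadata:
  name: demo2-restore-s3
  namespace: test2
spec:
  tableFilter:
  - "*.*"
  - "!db.table"
```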

In the examples above, some parameters in .spec.br can be ignored, such as logLevel, statusAddr, concurrency, rateLimit, checksum, timeAgo, and sendCredToTikv.

  • .spec.br.cluster: The name of the cluster to be restored.
  • .spec.br.clusterNamespace: The namespace of the cluster to be restored.
  • .spec.br.logLevel: The log level (info by default).
  • .spec.br.statusAddr: The listening address through which BR provides statistics. If not specified, BR does not listen on any status address by default.
  • .spec.br.concurrency: The number of threads used by each TiKV process during the backup or restoration. Defaults to 4 for backup and 128 for restoration.
  • .spec.br.rateLimit: The speed limit, in MB/s. If set to 4, the speed limit is 4 MB/s. The speed limit is not set by default.
  • .spec.br.checksum: Whether to verify the files after the restoration is completed. Defaults to true.
  • .spec.br.timeAgo: Backs up the data before timeAgo. If the parameter value is not specified (empty by default), it means backing up the current data. It supports data formats such as "1.5h" and "2h45m". See ParseDuration for more information.
  • .spec.br.sendCredToTikv: Whether the BR process passes its AWS privileges to the TiKV process. Defaults to true.

Troubleshooting

If you encounter any problem during the restore process, refer to Common Deployment Failures.