Restore Backup Data from Google Cloud Storage (GCS) Using TiDB Lightning
This document describes how to use TiDB Lightning to restore backup data from Google Cloud Storage (GCS) to a TiDB cluster. TiDB Lightning is a tool for fast, full data import into a TiDB cluster. The example in this document uses the physical import mode (the local backend). For detailed usage and configuration items, refer to the TiDB Lightning documentation.
The following example shows how to restore backup data from GCS to a TiDB cluster.
Prepare a node pool for TiDB Lightning
You can run TiDB Lightning in an existing node pool or create a dedicated node pool. The following example shows how to create a new node pool. Replace the variables as needed:
${clusterName}: GKE cluster name
gcloud container node-pools create lightning \
    --cluster ${clusterName} \
    --machine-type n2-standard-4 \
    --num-nodes=1 \
    --node-labels=dedicated=lightning
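Because the job defined later in this document selects nodes by the dedicated=lightning label, it can help to confirm that the label is present before deploying. The following is a minimal sketch, not part of the official procedure: the helper name count_lightning_nodes is hypothetical, and it assumes kubectl is already configured for the GKE cluster.

```shell
# Hypothetical helper: count the nodes that carry the label set by --node-labels.
# Assumes kubectl is already pointed at the GKE cluster.
count_lightning_nodes() {
    kubectl get nodes -l dedicated=lightning -o name | grep -c .
}
```

Running count_lightning_nodes prints the number of matching nodes. A result of 0 means that a Pod with the nodeSelector dedicated: lightning would stay in the Pending state.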
Deploy the TiDB Lightning job
Create a credential ConfigMap
Save the service account key file downloaded from the Google Cloud Console as google-credentials.json, and then create a ConfigMap with the following command:
kubectl -n ${namespace} create configmap google-credentials --from-file=google-credentials.json
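A quick sanity check on the key file can catch a wrong download before the ConfigMap is created. The sketch below assumes only that a Google service account key file is JSON containing the field "type": "service_account"; the helper name is_service_account_key is hypothetical, and the check does not validate the key itself.

```shell
# Hypothetical helper: succeed only if the file is non-empty and declares
# "type": "service_account". This is a text match, not a full JSON or key validation.
is_service_account_key() {
    [ -s "$1" ] && grep -q '"type"[[:space:]]*:[[:space:]]*"service_account"' "$1"
}
```

For example, is_service_account_key google-credentials.json returns a non-zero exit code for a missing or empty file, or for a different credential type.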
Configure the TiDB Lightning job
The following is a sample configuration file (lightning_job.yaml) for the TiDB Lightning job. This file defines the necessary resources and configurations for the job. Replace the variables with your specific values as needed:
${name}: Job name
${namespace}: Kubernetes namespace
${version}: TiDB Lightning image version
${storageClassName}: Storage class name
${storage}: Storage size
For TiDB Lightning parameters, refer to TiDB Lightning Configuration.
# lightning_job.yaml
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ${name}-sorted-kv
  namespace: ${namespace}
spec:
  storageClassName: ${storageClassName}
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: ${storage}
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: ${name}
  namespace: ${namespace}
data:
  config-file: |
    [lightning]
    level = "info"
    
    [checkpoint]
    enable = true
  
    [tidb]
    host = "basic-tidb"
    port = 4000
    user = "root"
    password = ""
    status-port = 10080
    pd-addr = "basic-pd:2379"
---
apiVersion: batch/v1
kind: Job
metadata:
  name: ${name}
  namespace: ${namespace}
  labels:
    app.kubernetes.io/component: lightning
spec:
  template:
    spec:
      nodeSelector:
        dedicated: lightning
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app.kubernetes.io/component
                operator: In
                values:
                - lightning
            topologyKey: kubernetes.io/hostname
      containers:
        - name: tidb-lightning
          image: pingcap/tidb-lightning:${version}
          command:
            - /bin/sh
            - -c
            - |
              /tidb-lightning \
                  --status-addr=0.0.0.0:8289 \
                  --backend=local \
                  --sorted-kv-dir=/var/lib/sorted-kv \
                  --d="gcs://external/testfolder?credentials-file=/etc/config/google-credentials.json" \
                  --config=/etc/tidb-lightning/tidb-lightning.toml \
                  --log-file="-"
          volumeMounts:
            - name: config
              mountPath: /etc/tidb-lightning
            - name: sorted-kv
              mountPath: /var/lib/sorted-kv
            - name: google-credentials
              mountPath: /etc/config
      volumes:
        - name: config
          configMap:
            name: ${name}
            items:
            - key: config-file
              path: tidb-lightning.toml
        - name: sorted-kv
          persistentVolumeClaim:
            claimName: ${name}-sorted-kv
        - name: google-credentials
          configMap:
            name: google-credentials
      restartPolicy: Never
  backoffLimit: 0
Create the TiDB Lightning job
Run the following commands to create the TiDB Lightning job:
export name=lightning
export version=v8.5.1
export namespace=tidb-cluster
export storageClassName=<your-storage-class>
export storage=250G
envsubst < lightning_job.yaml | kubectl apply -f -
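envsubst replaces unset variables with empty strings, which would produce an invalid manifest without an obvious error. The following sketch fails early in that case and restricts substitution to the five variables used in lightning_job.yaml. The function names require_vars and render_manifest are hypothetical; the variable list argument to envsubst assumes the GNU gettext implementation.

```shell
# Hypothetical helper: report every named variable that is unset or empty.
require_vars() {
    missing=0
    for var in "$@"; do
        eval "value=\${$var:-}"
        if [ -z "$value" ]; then
            echo "missing variable: $var" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Hypothetical wrapper: render the manifest only when all variables are set.
# The quoted list limits envsubst to these variables (GNU gettext envsubst).
# The variables must be exported, as in the commands above.
render_manifest() {
    require_vars name version namespace storageClassName storage || return 1
    envsubst '${name} ${version} ${namespace} ${storageClassName} ${storage}' < "$1"
}
```

With these helpers, the apply step could be written as render_manifest lightning_job.yaml | kubectl apply -f -.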
Check the TiDB Lightning job status
Run the following command to check the Pod status of the TiDB Lightning job:
kubectl -n ${namespace} get pods -l job-name=${name}
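To block until the import finishes instead of polling the Pod status, kubectl wait can watch the Job's Complete condition. The following is a sketch under stated assumptions: the helper name wait_for_lightning and the default timeout of 2h are hypothetical and should be adjusted to the expected import duration.

```shell
# Hypothetical helper: wait until the Job reports the Complete condition
# or the timeout expires. Arguments: namespace, job name, optional timeout.
wait_for_lightning() {
    job_namespace=$1
    job_name=$2
    wait_timeout=${3:-2h}
    kubectl -n "$job_namespace" wait --for=condition=complete "job/$job_name" --timeout="$wait_timeout"
}
```

For example, wait_for_lightning ${namespace} ${name} 6h. Note that a failed job never satisfies condition=complete, so in that case the command runs until the timeout and then exits with an error.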
View TiDB Lightning job logs
Run the following command to view the logs of the TiDB Lightning job:
kubectl -n ${namespace} logs job/${name}