Restore Backup Data from Google Cloud Storage (GCS) Using TiDB Lightning
This document describes how to use TiDB Lightning to restore backup data from Google Cloud Storage (GCS) to a TiDB cluster. TiDB Lightning is a tool for fast full data import into a TiDB cluster. This document uses the physical import mode. For detailed usage and configuration items of TiDB Lightning, refer to the official documentation.
The following example shows how to restore backup data from GCS to a TiDB cluster.
Prepare a node pool for TiDB Lightning
You can run TiDB Lightning in an existing node pool or create a dedicated node pool. The following example shows how to create a new node pool. Replace the variables as needed:
- ${clusterName}: GKE cluster name
gcloud container node-pools create lightning \
    --cluster ${clusterName} \
    --machine-type n2-standard-4 \
    --num-nodes=1 \
    --node-labels=dedicated=lightning
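Before deploying the Job, it may help to confirm that at least one node with the dedicated=lightning label has registered, because the Job's nodeSelector would otherwise leave the Pod in Pending. The following is a minimal sketch, assuming kubectl already points at the GKE cluster. The function name lightning_node_count is hypothetical and not part of any tool.

```shell
# lightning_node_count: hypothetical helper, not part of kubectl or gcloud.
# Prints how many nodes carry the label dedicated=lightning.
lightning_node_count() {
  # "-o name" prints one "node/<name>" line per matching node and nothing otherwise.
  # grep -c exits non-zero when the count is 0, so "|| true" keeps the status clean.
  kubectl get nodes -l dedicated=lightning -o name | grep -c '^node/' || true
}

# Usage (assumes a working kubeconfig):
# if [ "$(lightning_node_count)" -lt 1 ]; then
#   echo "no node labeled dedicated=lightning yet" >&2
# fi
```

A count of 0 right after creating the node pool usually just means the node is still starting, so rerunning the check after a short wait is reasonable.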
Deploy the TiDB Lightning job
Create a credential ConfigMap
Save the service account key file downloaded from the Google Cloud Console as google-credentials.json, and then create a ConfigMap with the following command:
kubectl -n ${namespace} create configmap google-credentials --from-file=google-credentials.json
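A wrong file (for example, an OAuth client file instead of a service account key) only surfaces later as an authentication error inside the Job. As a rough guard, the sketch below checks that the file declares "type": "service_account" before the ConfigMap is created. The helper name is_service_account_key is hypothetical, and the check is a text match, not full JSON validation.

```shell
# is_service_account_key: hypothetical helper name.
# Succeeds only if the file exists and contains "type": "service_account",
# the type field found in Google service account key files.
is_service_account_key() {
  [ -f "$1" ] && grep -Eq '"type"[[:space:]]*:[[:space:]]*"service_account"' "$1"
}

# Usage: create the ConfigMap only when the check passes.
# is_service_account_key google-credentials.json \
#   && kubectl -n "${namespace}" create configmap google-credentials \
#        --from-file=google-credentials.json
```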
Configure the TiDB Lightning job
The following is a sample configuration file (lightning_job.yaml) for the TiDB Lightning job. This file defines the necessary resources and configurations for the job. Replace the variables with your specific values as needed:
- ${name}: Job name
- ${namespace}: Kubernetes namespace
- ${version}: TiDB Lightning image version
- ${storageClassName}: Storage class name
- ${storage}: Storage size
- For TiDB Lightning parameters, refer to TiDB Lightning Configuration.
# lightning_job.yaml
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ${name}-sorted-kv
  namespace: ${namespace}
spec:
  storageClassName: ${storageClassName}
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: ${storage}
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: ${name}
  namespace: ${namespace}
data:
  config-file: |
    [lightning]
    level = "info"
    [checkpoint]
    enable = true
    [tidb]
    host = "basic-tidb"
    port = 4000
    user = "root"
    password = ""
    status-port = 10080
    pd-addr = "basic-pd:2379"
---
apiVersion: batch/v1
kind: Job
metadata:
  name: ${name}
  namespace: ${namespace}
  labels:
    app.kubernetes.io/component: lightning
spec:
  template:
    spec:
      nodeSelector:
        dedicated: lightning
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app.kubernetes.io/component
                    operator: In
                    values:
                      - lightning
              topologyKey: kubernetes.io/hostname
      containers:
        - name: tidb-lightning
          image: pingcap/tidb-lightning:${version}
          command:
            - /bin/sh
            - -c
            - |
              /tidb-lightning \
                  --status-addr=0.0.0.0:8289 \
                  --backend=local \
                  --sorted-kv-dir=/var/lib/sorted-kv \
                  --d=gcs://external/testfolder?credentials-file=/etc/config/google-credentials.json \
                  --config=/etc/tidb-lightning/tidb-lightning.toml \
                  --log-file="-"
          volumeMounts:
            - name: config
              mountPath: /etc/tidb-lightning
            - name: sorted-kv
              mountPath: /var/lib/sorted-kv
            - name: google-credentials
              mountPath: /etc/config
      volumes:
        - name: config
          configMap:
            name: ${name}
            items:
              - key: config-file
                path: tidb-lightning.toml
        - name: sorted-kv
          persistentVolumeClaim:
            claimName: ${name}-sorted-kv
        - name: google-credentials
          configMap:
            name: google-credentials
      restartPolicy: Never
  backoffLimit: 0
Create the TiDB Lightning job
Run the following commands to create the TiDB Lightning job:
export name=lightning
export version=v8.5.1
export namespace=tidb-cluster
export storageClassName=<your-storage-class>
export storage=250G
envsubst < lightning_job.yaml | kubectl apply -f -
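envsubst replaces any unset variable with an empty string, so a forgotten export produces a manifest with blank fields rather than an error. The following sketch checks the variables before rendering. The function name require_vars is hypothetical and not part of envsubst or kubectl.

```shell
# require_vars: hypothetical helper, not part of envsubst or kubectl.
# Takes variable names as arguments; reports each one that is unset or empty
# and returns non-zero if any is missing.
require_vars() {
  missing=0
  for v in "$@"; do
    # Look up the value of the variable whose name is stored in $v.
    eval "val=\${$v:-}"
    if [ -z "$val" ]; then
      echo "missing variable: $v" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage: render and apply only when every placeholder has a value.
# require_vars name version namespace storageClassName storage \
#   && envsubst < lightning_job.yaml | kubectl apply -f -
```

The helper uses eval for the indirect lookup so that it runs under plain sh as well as bash. Pass it only literal variable names, never untrusted input.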
Check the TiDB Lightning job status
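Run the following command to check the Pod status of the TiDB Lightning job. The Pod name carries a generated suffix, so the command selects the Pod by the job-name label that Kubernetes adds to Pods created by a Job:
kubectl -n ${namespace} get pod -l job-name=${name}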
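The Pod phase alone does not say whether the import finished. Because the sample Job sets restartPolicy: Never and backoffLimit: 0, a single failed run marks the whole Job as failed, and the Job's .status.succeeded and .status.failed counters show the outcome. The following sketch reads both fields. The function name job_result is hypothetical.

```shell
# job_result: hypothetical helper, not part of kubectl.
# Arguments: <namespace> <job-name>. Prints succeeded, failed, or running.
job_result() {
  ns=$1
  job=$2
  # jsonpath prints an empty string when the field has not been set yet.
  succeeded=$(kubectl -n "$ns" get job "$job" -o jsonpath='{.status.succeeded}')
  failed=$(kubectl -n "$ns" get job "$job" -o jsonpath='{.status.failed}')
  if [ "${succeeded:-0}" -ge 1 ]; then
    echo succeeded
  elif [ "${failed:-0}" -ge 1 ]; then
    echo failed
  else
    echo running
  fi
}

# Usage:
# job_result "${namespace}" "${name}"
```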
View TiDB Lightning job logs
Run the following command to view the logs of the TiDB Lightning job:
kubectl -n ${namespace} logs job/${name}