Restore Backup Data from Amazon S3-Compatible Storage Using TiDB Lightning
This document describes how to use TiDB Lightning to restore backup data from Amazon S3-compatible storage to a TiDB cluster. TiDB Lightning is a tool for fast full data import into a TiDB cluster. This document uses the physical import mode. For detailed usage and configuration items of TiDB Lightning, refer to the official documentation.
The following example shows how to restore backup data from Amazon S3-compatible storage to a TiDB cluster.
Prepare a node pool for TiDB Lightning
You can run TiDB Lightning in an existing node pool or create a dedicated node pool. The following is a sample configuration for creating a new node pool. Replace the variables with your specific values as needed:
${clusterName}
: EKS cluster name
# eks_lightning.yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
name: ${clusterName}
region: us-west-2
availabilityZones: ['us-west-2a', 'us-west-2b', 'us-west-2c']
nodeGroups:
- name: lightning
instanceType: c5.xlarge
desiredCapacity: 1
privateNetworking: true
availabilityZones: ["us-west-2a"]
labels:
dedicated: lightning
Run the following command to create the node pool:
eksctl create nodegroup -f eks_lightning.yaml
Deploy the TiDB Lightning job
This section describes how to configure, deploy, and monitor the TiDB Lightning job.
Configure the TiDB Lightning job
The following is a sample configuration file (lightning_job.yaml
) for the TiDB Lightning job. Replace the variables with your specific values as needed:
${name}
: Job name${namespace}
: Kubernetes namespace${version}
: TiDB Lightning image version${storageClassName}
: Storage class name${storage}
: Storage size- For TiDB Lightning parameters, refer to TiDB Lightning Configuration.
# lightning_job.yaml
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: ${name}-sorted-kv
namespace: ${namespace}
spec:
storageClassName: ${storageClassName}
accessModes:
- ReadWriteOnce
resources:
requests:
storage: ${storage}
---
apiVersion: v1
kind: ConfigMap
metadata:
name: ${name}
namespace: ${namespace}
data:
config-file: |
[lightning]
level = "info"
[checkpoint]
enable = true
[tidb]
host = "basic-tidb"
port = 4000
user = "root"
password = ""
status-port = 10080
pd-addr = "basic-pd:2379"
---
apiVersion: batch/v1
kind: Job
metadata:
name: ${name}
namespace: ${namespace}
labels:
app.kubernetes.io/component: lightning
spec:
template:
spec:
nodeSelector:
dedicated: lightning
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: app.kubernetes.io/component
operator: In
values:
- lightning
topologyKey: kubernetes.io/hostname
containers:
- name: tidb-lightning
image: pingcap/tidb-lightning:${version}
command:
- /bin/sh
- -c
- |
/tidb-lightning \
--status-addr=0.0.0.0:8289 \
--backend=local \
--sorted-kv-dir=/var/lib/sorted-kv \
--d=s3://external/testfolder \
--config=/etc/tidb-lightning/tidb-lightning.toml \
--log-file="-"
env:
- name: AWS_REGION
value: ${AWS_REGION}
- name: AWS_ACCESS_KEY_ID
value: ${AWS_ACCESS_KEY_ID}
- name: AWS_SECRET_ACCESS_KEY
value: ${AWS_SECRET_ACCESS_KEY}
- name: AWS_SESSION_TOKEN
value: ${AWS_SESSION_TOKEN}
volumeMounts:
- name: config
mountPath: /etc/tidb-lightning
- name: sorted-kv
mountPath: /var/lib/sorted-kv
volumes:
- name: config
configMap:
name: ${name}
items:
- key: config-file
path: tidb-lightning.toml
- name: sorted-kv
persistentVolumeClaim:
claimName: ${name}-sorted-kv
restartPolicy: Never
backoffLimit: 0
Create the TiDB Lightning job
Run the following commands to create the TiDB Lightning job:
export name=lightning
export version=v8.5.1
export namespace=tidb-cluster
export storageClassName=<your-storage-class>
export storage=250G
export AWS_REGION=us-west-2
export AWS_ACCESS_KEY_ID=<your-access-key-id>
export AWS_SECRET_ACCESS_KEY=<your-secret-access-key>
export AWS_SESSION_TOKEN=<your-session-token> # Optional
envsubst < lightning_job.yaml | kubectl apply -f -
Check the TiDB Lightning job status
Run the following command to check the status of the Pod associated with the TiDB Lightning job:
kubectl -n ${namespace} get pod ${name}
View TiDB Lightning job logs
Run the following command to retrieve and view the logs of the TiDB Lightning job:
kubectl -n ${namespace} logs pod ${name}