This document describes the hardware requirements of TiDB Lightning using the Local-backend, and how to deploy it manually.
If Local-backend is used for data import, the cluster cannot provide services during the import process. If you do not want the TiDB services to be impacted, use the TiDB-backend instead, as described in TiDB Lightning TiDB-backend.
Before starting TiDB Lightning, note that:
If tidb-lightning crashes, the cluster is left in "import mode". Forgetting to switch back to "normal mode" can lead to a large amount of uncompacted data on the TiKV cluster, and cause abnormally high CPU usage and stalls. You can manually switch the cluster back to "normal mode" via the
TiDB Lightning is required to have the following privileges in the downstream TiDB:
| Privilege | Scope |
| --- | --- |
| SELECT | Tables |
| INSERT | Tables |
| UPDATE | Tables |
| DELETE | Tables |
| CREATE | Databases, tables |
| DROP | Databases, tables |
| ALTER | Tables |
If the checksum configuration item of TiDB Lightning is set to true, the admin user privileges in the downstream TiDB must be granted to TiDB Lightning.
tidb-lightning is a resource-intensive program. It is recommended to deploy it as follows.
- 32+ logical cores CPU
- 20 GB+ memory
- An SSD large enough to store the entire data source, preferably with high read speed
- 10 Gigabit network card (capable of transferring at ≥1 GB/s)
tidb-lightning fully consumes all CPU cores when running, so deploying it on a dedicated machine is highly recommended. If that is not possible, tidb-lightning can be deployed together with other components like tidb-server, and the CPU usage can be limited via the
tidb-lightning is a CPU-intensive program. In an environment with mixed components, the resources allocated to tidb-lightning must be limited. Otherwise, other components might not be able to run. It is recommended to set the region-concurrency to 75% of the CPU logical cores. For instance, if the CPU has 32 logical cores, you can set the region-concurrency to 24.
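The 75% guideline can be worked out with a short shell calculation. The sketch below is illustrative only: the function name suggest_region_concurrency is hypothetical and not part of TiDB Lightning, and rounding down with a minimum of 1 is an assumption rather than documented behavior.

```shell
# Hypothetical helper (not part of TiDB Lightning): suggests a
# region-concurrency value as 75% of the logical cores, rounded down.
suggest_region_concurrency() {
  cores="$1"
  value=$(( cores * 75 / 100 ))
  # Assumption: never suggest fewer than one worker.
  if [ "$value" -lt 1 ]; then
    value=1
  fi
  echo "$value"
}

suggest_region_concurrency 32   # prints 24
```

On Linux, the logical core count can be taken from nproc, for example suggest_region_concurrency "$(nproc)". The result is the value to put in the region-concurrency setting of the [lightning] section in the configuration file.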
Additionally, the target TiKV cluster should have enough space to absorb the new data. Besides the standard requirements, the total free space of the target TiKV cluster should be larger than Size of data source × Number of replicas × 2.
With the default replica count of 3, this means the total free space should be at least 6 times the size of data source.
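The free-space rule above is simple arithmetic, so it can be checked before an import. The following is a minimal sketch: required_tikv_free_space is a hypothetical helper name, and the 500 GB data source is an example value, not a figure from this document. The result is a lower bound in the same unit as the input.

```shell
# Hypothetical helper: minimum total free space on the target TiKV cluster,
# computed as size of data source x number of replicas x 2.
required_tikv_free_space() {
  data_size="$1"
  replicas="${2:-3}"   # default replica count of 3, as stated above
  echo $(( data_size * replicas * 2 ))
}

required_tikv_free_space 500     # 500 GB source, 3 replicas: prints 3000
required_tikv_free_space 500 5   # 500 GB source, 5 replicas: prints 5000
```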
Use the dumpling tool to export data from MySQL by using the following command:
./bin/dumpling -h 127.0.0.1 -P 3306 -u root -t 16 -F 256MB -B test -f 'test.t' -o /data/my_database/
In this command,
- -B test: means the data is exported from the test database.
- -f test.t: means only the test.t2 tables are exported.
- -t 16: means 16 threads are used to export the data.
- -F 256MB: means a table is partitioned into chunks and one chunk is 256 MB.
If the data source consists of CSV files, see CSV support for configuration.
This section describes how to deploy TiDB Lightning manually.
Before importing data, you need to have a deployed TiDB cluster. It is highly recommended to use the latest stable version.
You can find deployment instructions in TiDB Quick Start Guide.
Refer to the TiDB enterprise tools download page to download the TiDB Lightning package.
TiDB Lightning is compatible with TiDB clusters of earlier versions. It is recommended that you download the latest stable version of the TiDB Lightning installation package.
bin/tidb-lightning-ctl from the tool set.
Mount the data source onto the same machine.
tidb-lightning.toml. For configurations that do not appear in the template below, TiDB Lightning writes a configuration error to the log file and exits.
sorted-kv-dir must be an empty directory, and the disk where the directory is located must have a lot of free space.
    [lightning]
    # The concurrency number of data. It is set to the number of logical CPU
    # cores by default. When deploying together with other components, you can
    # set it to 75% of the size of logical CPU cores to limit the CPU usage.
    # region-concurrency =

    # Logging
    level = "info"
    file = "tidb-lightning.log"

    [tikv-importer]
    # Sets the backend to the "local" mode.
    backend = "local"
    # Sets the directory of temporary local storage.
    sorted-kv-dir = "/mnt/ssd/sorted-kv-dir"

    [mydumper]
    # Local source data directory
    data-source-dir = "/data/my_database"

    [tidb]
    # Configuration of any TiDB server from the cluster
    host = "172.16.31.1"
    port = 4000
    user = "root"
    password = ""
    # Table schema information is fetched from TiDB via this status-port.
    status-port = 10080
    # An address of pd-server.
    pd-addr = "172.16.31.4:2379"
The above only shows the essential settings. See the Configuration section for the full list of settings.
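Because sorted-kv-dir must be an empty directory, it can be worth verifying this before starting an import. The following is a minimal sketch under stated assumptions: is_empty_dir is a hypothetical helper, not a TiDB Lightning command, and the path is the example value from the configuration template.

```shell
# Hypothetical helper: succeeds only if the directory exists and is empty.
is_empty_dir() {
  dir="$1"
  [ -d "$dir" ] || return 1
  # ls -A lists every entry except . and .. ; no output means the
  # directory has no contents.
  [ -z "$(ls -A "$dir")" ]
}

# Example path taken from the configuration template.
if is_empty_dir "/mnt/ssd/sorted-kv-dir"; then
  echo "sorted-kv-dir is empty"
else
  echo "sorted-kv-dir is missing or not empty" >&2
fi
```

The check treats a missing directory as a failure so that a mistyped path is noticed early; whether TiDB Lightning itself creates a missing directory is not covered here.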
nohup ./tidb-lightning -config tidb-lightning.toml > nohup.out &
You can upgrade TiDB Lightning by replacing the binaries alone. No further configuration is needed. See FAQ for detailed instructions on restarting TiDB Lightning.
If an import task is running, we recommend that you wait until it finishes before upgrading TiDB Lightning. Otherwise, you might need to reimport from scratch, because there is no guarantee that checkpoints work across versions.