TiDB Lightning Deployment
This document describes the hardware requirements of TiDB Lightning for both separate and mixed deployment, and how to deploy it using TiDB Ansible or manually.
Notes
Before starting TiDB Lightning, note that:
- During the import process, the cluster cannot provide normal services.
- If tidb-lightning crashes, the cluster is left in "import mode". Forgetting to switch back to "normal mode" can lead to a large amount of uncompacted data on the TiKV cluster, which causes abnormally high CPU usage and stalls. You can manually switch the cluster back to "normal mode" with the tidb-lightning-ctl tool:

  bin/tidb-lightning-ctl -switch-mode=normal
TiDB Lightning is required to have the following privileges in the downstream TiDB:
| Privilege | Scope |
|-----------|-------|
| SELECT | Tables |
| INSERT | Tables |
| UPDATE | Tables |
| DELETE | Tables |
| CREATE | Databases, tables |
| DROP | Databases, tables |
| ALTER | Tables |

If the checksum configuration item of TiDB Lightning is set to true, then the admin user privileges in the downstream TiDB also need to be granted to TiDB Lightning.
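As a minimal, hypothetical sketch of granting these privileges (the user name lightning, its password, and the target schema test below are placeholders, not names from this guide), you could run something like the following from any MySQL-compatible client connected to the downstream TiDB; adjust the host and port to your cluster:

# Hypothetical example: create a user for TiDB Lightning and grant the
# privileges listed above on the target schema "test".
mysql -h 172.16.31.1 -P 4000 -u root -e "
CREATE USER 'lightning'@'%' IDENTIFIED BY 'lightning_password';
GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, DROP, ALTER ON test.* TO 'lightning'@'%';
"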
Hardware requirements
tidb-lightning and tikv-importer are both resource-intensive programs. It is recommended to deploy them onto two separate machines.
To achieve the best performance, it is recommended to use the following hardware configuration:
tidb-lightning:

- 32+ logical CPU cores
- An SSD large enough to store the entire data source, preferring higher read speed
- 10 Gigabit network card (capable of transferring at ≥ 300 MB/s)

tidb-lightning fully consumes all CPU cores when running, so deploying it on a dedicated machine is highly recommended. If that is not possible, tidb-lightning can be deployed together with other components such as tidb-server, and its CPU usage can be limited via the region-concurrency setting.
tikv-importer:

- 32+ logical CPU cores
- 40 GB+ memory
- 1 TB+ SSD, preferring higher IOPS (≥ 8000 is recommended)
- The disk should be larger than the total size of the top N tables, where N = max(index-concurrency, table-concurrency).
- 10 Gigabit network card (capable of transferring at ≥ 300 MB/s)

tikv-importer fully consumes all CPU, disk I/O and network bandwidth when running, so deploying it on a dedicated machine is strongly recommended.
If you have sufficient machines, you can deploy multiple Lightning/Importer servers, with each working on a distinct set of tables, to import the data in parallel.
tidb-lightning is a CPU-intensive program. In an environment with mixed components, the resources allocated to tidb-lightning must be limited; otherwise, other components might not be able to run. It is recommended to set region-concurrency to 75% of the number of logical CPU cores. For instance, if the CPU has 32 logical cores, you can set region-concurrency to 24.

tikv-importer stores intermediate data in RAM to speed up the import process. The typical memory usage can be estimated as (max-open-engines × write-buffer-size × 2) + (num-import-jobs × region-split-size × 2). If the speed of writing to disk is slow, the memory usage could be even higher due to buffering.
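For a rough, illustrative calculation only (the four values below are assumed for the example and are not documented defaults), plugging sample settings into the formula gives:

# Assumed example values: max-open-engines = 8, write-buffer-size = 128 (MB),
# num-import-jobs = 24, region-split-size = 512 (MB); result is in MB.
echo $(( (8 * 128 * 2) + (24 * 512 * 2) ))   # 26624 MB, roughly 26 GB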
Additionally, the target TiKV cluster should have enough space to absorb the new data. Besides the standard requirements, the total free space of the target TiKV cluster should be larger than (size of the data source) × (number of replicas) × 2.
With the default replica count of 3, this means the total free space should be at least 6 times the size of the data source.
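As a worked example with a hypothetical 500 GB data source and the default 3 replicas:

# Measure the exported data source, then estimate the free space needed on TiKV
du -sh /data/my_database/          # e.g. 500G (hypothetical)
echo $(( 500 * 3 * 2 ))            # 3000, i.e. at least 3000 GB (about 3 TB) of free space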
Export data
Use the mydumper tool to export data from MySQL by running the following command:
./bin/mydumper -h 127.0.0.1 -P 3306 -u root -t 16 -F 256 -B test -T t1,t2 --skip-tz-utc -o /data/my_database/
In this command:

- -B test: exports the data from the test database.
- -T t1,t2: exports only the t1 and t2 tables.
- -t 16: uses 16 threads to export the data.
- -F 256: partitions a table into chunks of 256 MB each.
- --skip-tz-utc: ignores any inconsistency of time zone settings between MySQL and the data exporting machine, and disables automatic time zone conversion.
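Before moving on, it can help to sanity-check the export directory. The listing below is only illustrative of Mydumper's usual file layout (schema files plus chunked data files); the exact names depend on your databases and tables:

ls -lh /data/my_database/
#   metadata                  (binlog position recorded at export time)
#   test-schema-create.sql    (CREATE DATABASE statement)
#   test.t1-schema.sql        (CREATE TABLE statement for t1)
#   test.t1.<n>.sql           (data chunks of t1, about 256 MB each with -F 256)
#   ...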
If the data source consists of CSV files, see CSV support for configuration.
Deploy TiDB Lightning
This section describes two deployment methods of TiDB Lightning: deployment using TiDB Ansible, and manual deployment.
Deploy TiDB Lightning using TiDB Ansible
You can deploy TiDB Lightning using TiDB Ansible, together with the deployment of the TiDB cluster itself.
Edit inventory.ini to add the addresses of the tidb-lightning and tikv-importer servers.

...
[importer_server]
192.168.20.9

[lightning_server]
192.168.20.10
...
Configure these tools by editing the settings under group_vars/*.yml.

group_vars/all.yml:

...
# The listening port of tikv-importer. Should be open to the tidb-lightning server.
tikv_importer_port: 8287
...

group_vars/lightning_server.yml:

---
dummy:

# The listening port for metrics gathering. Should be open to the monitoring servers.
tidb_lightning_pprof_port: 8289

# The file path that tidb-lightning reads the data source (Mydumper SQL dump or CSV) from.
data_source_dir: "{{ deploy_dir }}/mydumper"

group_vars/importer_server.yml:

---
dummy:

# The file path to store engine files. Should reside on a partition with a large capacity.
import_dir: "{{ deploy_dir }}/data.import"
Deploy the cluster.

ansible-playbook bootstrap.yml
ansible-playbook deploy.yml
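If the rest of the TiDB cluster is already deployed, one possible shortcut (a suggestion, not a step from this guide) is to limit the playbook run to the two new host groups defined in inventory.ini:

# Deploy only to the newly added importer and lightning hosts
ansible-playbook deploy.yml -l importer_server,lightning_server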
Mount the data source to the path specified in the data_source_dir setting.

Log in to the tikv-importer server, and manually run the following command to start Importer:

scripts/start_importer.sh
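To confirm that Importer started and is reachable, one simple check (a sketch using standard Linux tools, not a script shipped with TiDB Ansible) is to verify that it is listening on the tikv_importer_port configured above:

# On the tikv-importer server: confirm the process is listening on port 8287
ss -ltn | grep 8287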
Log in to the tidb-lightning server, and manually run the following command to start Lightning and import the data into the TiDB cluster:

scripts/start_lightning.sh
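While the import is running, you can get a rough sense of progress from the port configured as tidb_lightning_pprof_port above. The sketch below assumes that port serves Prometheus metrics at the conventional /metrics path:

# On the tidb-lightning server: sample Lightning's metrics (path and metric names assumed)
curl -s http://127.0.0.1:8289/metrics | grep -i lightning | head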
After completion, run scripts/stop_importer.sh on the tikv-importer server to stop Importer.
Deploy TiDB Lightning manually
Step 1: Deploy a TiDB cluster
Before importing data, you need to have a deployed TiDB cluster, with the cluster version 2.0.9 or above. It is highly recommended to use the latest version.
You can find deployment instructions in TiDB Quick Start Guide.
Step 2: Download the TiDB Lightning installation package
Refer to the TiDB enterprise tools download page to download the TiDB Lightning package (choose the same version as that of the TiDB cluster).
Step 3: Start tikv-importer
Upload bin/tikv-importer from the installation package.

Configure tikv-importer.toml:

# TiKV Importer configuration file template

# Log file
log-file = "tikv-importer.log"
# Log level: trace, debug, info, warn, error, off.
log-level = "info"

[server]
# The listening address of tikv-importer. tidb-lightning needs to connect to
# this address to write data.
addr = "192.168.20.10:8287"

[metric]
# The Prometheus client push job name.
job = "tikv-importer"
# The Prometheus client push interval.
interval = "15s"
# The Prometheus Pushgateway address.
address = ""

[import]
# The directory to store engine files.
import-dir = "/mnt/ssd/data.import/"
The above only shows the essential settings. See the Configuration section for the full list of settings.
Run tikv-importer:

nohup ./tikv-importer -C tikv-importer.toml > nohup.out &
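To confirm that the process came up cleanly, you can check for it and read its log; this is only a suggested spot check, with the log file name taken from the configuration above:

# Verify tikv-importer is running and look at recent log output
pgrep -a tikv-importer
tail -n 20 tikv-importer.log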
Step 4: Start tidb-lightning
Upload bin/tidb-lightning and bin/tidb-lightning-ctl from the tool set.

Mount the data source onto the same machine.

Configure tidb-lightning.toml. For configurations that do not appear in the template below, TiDB Lightning writes a configuration error to the log file and exits.

[lightning]
# The concurrency of data import. It is set to the number of logical CPU
# cores by default. When deploying together with other components, you can
# set it to 75% of the number of logical CPU cores to limit the CPU usage.
# region-concurrency =

# Logging
level = "info"
file = "tidb-lightning.log"

[tikv-importer]
# The listening address of tikv-importer. Change it to the actual address.
addr = "172.16.31.10:8287"

[mydumper]
# The local source data directory (Mydumper output).
data-source-dir = "/data/my_database"

[tidb]
# Configuration of any TiDB server from the cluster.
host = "172.16.31.1"
port = 4000
user = "root"
password = ""
# Table schema information is fetched from TiDB via this status-port.
status-port = 10080
The above only shows the essential settings. See the Configuration section for the full list of settings.
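Before starting the import, it is worth confirming from the tidb-lightning machine that both the SQL port and the status port of the target TiDB server are reachable. The following is a suggested check using the host and ports from the template above:

# Check the SQL port (4000) and the status port (10080) of the target TiDB server
mysql -h 172.16.31.1 -P 4000 -u root -e "SELECT tidb_version();"
curl -s http://172.16.31.1:10080/status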
Run tidb-lightning:

nohup ./tidb-lightning -config tidb-lightning.toml > nohup.out &
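After the import finishes, you can spot-check the imported data from any MySQL client. The statements below are illustrative (the table test.t1 is just an example from the export step earlier); ADMIN CHECKSUM TABLE recomputes a checksum of the stored data:

# Illustrative post-import spot checks against the target TiDB cluster
mysql -h 172.16.31.1 -P 4000 -u root -e "ADMIN CHECKSUM TABLE test.t1;"
mysql -h 172.16.31.1 -P 4000 -u root -e "SELECT COUNT(*) FROM test.t1;"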