TiDB Lightning is a tool for fast full import of large amounts of data into a TiDB cluster. Currently, TiDB Lightning supports reading SQL dumps exported via Mydumper, or CSV data sources. You can use it in the following two scenarios:
- Importing large amounts of new data quickly
- Restoring all backup data
The TiDB Lightning tool set consists of two components:
- `tidb-lightning` (the "front end") reads the data source, imports the database structure into the TiDB cluster, transforms the data into Key-Value (KV) pairs, and sends them to `tikv-importer`.
- `tikv-importer` (the "back end") combines and sorts the KV pairs, then imports the sorted pairs as a whole into the TiKV cluster.
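The division of labour between the two components can be illustrated with a toy model. This is only a sketch: the key layout, the value format, and the function names `encode_row` and `sort_and_combine` are hypothetical and are not the encoding or API that TiDB Lightning actually uses.

```python
from typing import Dict, List, Tuple

KV = Tuple[bytes, bytes]


def encode_row(table_id: int, row_id: int, row: Dict[str, str]) -> KV:
    """Front-end role: turn one row into a KV pair (hypothetical encoding)."""
    # Zero-padded IDs make byte-wise key order match numeric row order.
    key = b"t%08d_r%08d" % (table_id, row_id)
    value = "|".join(f"{col}={val}" for col, val in sorted(row.items())).encode()
    return key, value


def sort_and_combine(pairs: List[KV]) -> List[KV]:
    """Back-end role: sort the KV pairs by key so they can be imported as a whole."""
    return sorted(pairs, key=lambda kv: kv[0])


# Rows arrive in arbitrary order from the data source.
rows = [(3, {"name": "c"}), (1, {"name": "a"}), (2, {"name": "b"})]
pairs = [encode_row(42, row_id, row) for row_id, row in rows]
sorted_pairs = sort_and_combine(pairs)
```

In this model the front end only encodes, and all ordering work is left to the back end, which mirrors the split described above.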
The complete import process is as follows:
`tidb-lightning` switches the TiKV cluster to "import mode", which optimizes the cluster for writing and disables automatic compaction.
`tidb-lightning` creates the skeleton of all tables from the data source.
Each table is split into multiple continuous batches, so that data from a huge table (200 GB+) can be delivered incrementally.
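One way to picture the split into continuous batches is to group consecutive data files until a size limit is reached. The sketch below assumes a hypothetical `max_batch_bytes` limit and a hypothetical helper `split_into_batches`; the real batching rules and thresholds of TiDB Lightning may differ.

```python
from typing import List


def split_into_batches(file_sizes: List[int], max_batch_bytes: int) -> List[List[int]]:
    """Group consecutive data files (by index) into contiguous batches.

    A batch is closed as soon as adding the next file would exceed the limit.
    A single file larger than the limit still forms its own batch.
    """
    batches: List[List[int]] = []
    current: List[int] = []
    current_bytes = 0
    for index, size in enumerate(file_sizes):
        if current and current_bytes + size > max_batch_bytes:
            batches.append(current)
            current, current_bytes = [], 0
        current.append(index)
        current_bytes += size
    if current:
        batches.append(current)
    return batches


# Five data files of one table, sizes in arbitrary units.
batches = split_into_batches([40, 40, 40, 90, 10], max_batch_bytes=100)
```

Because each batch covers a contiguous range of files, the batches can be encoded and delivered one after another without waiting for the whole table.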
For each batch, `tidb-lightning` informs `tikv-importer` via gRPC to create engine files to store KV pairs. `tidb-lightning` then reads the data source in parallel, transforms each row into KV pairs according to the TiDB rules, and sends them to `tikv-importer`'s engine files.
Once a complete engine file is written, `tikv-importer` divides and schedules the data and imports it into the target TiKV cluster.
There are two kinds of engine files, data engines and index engines, corresponding to the two kinds of KV pairs: row data and secondary indices. Normally, the row data are entirely sorted in the data source, while the secondary indices are out of order. Because of this, each data engine is uploaded as soon as its batch is completed, while the index engines are imported only after all batches of the entire table are encoded.
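The different timing of data engines and index engines can be sketched as follows. This is a simplified model with a hypothetical function `plan_engine_imports`; each row is reduced to a `(row_id, indexed_value)` pair, and a single index engine per table is assumed for brevity.

```python
from typing import List, Optional, Tuple

Row = Tuple[int, str]  # (row_id, indexed column value)
Event = Tuple[str, Optional[int], list]


def plan_engine_imports(batches: List[List[Row]]) -> List[Event]:
    """Return the order in which engines become ready for import."""
    events: List[Event] = []
    index_engine: List[str] = []
    for batch_no, rows in enumerate(batches):
        # Row data are already sorted in the source, so the data engine
        # of this batch can be imported as soon as the batch is encoded.
        data_engine = [row_id for row_id, _ in rows]
        events.append(("data", batch_no, data_engine))
        # Index entries arrive out of order and interleave across batches,
        # so they are only collected here.
        index_engine.extend(value for _, value in rows)
    # The index engine is sorted and imported after the last batch.
    events.append(("index", None, sorted(index_engine)))
    return events


events = plan_engine_imports([[(1, "m"), (2, "a")], [(3, "z"), (4, "b")]])
```

Note how the index values `"m", "a", "z", "b"` can only be put in order once both batches have been seen, whereas the row IDs of each batch are usable immediately.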
After all engines associated with a table are imported, `tidb-lightning` compares the checksum of the local data source with the one calculated from the cluster, to ensure no data was corrupted in the process; tells TiDB to `ANALYZE` all imported tables, to prepare for optimal query planning; and adjusts the `AUTO_INCREMENT` value so that future insertions do not cause conflicts.
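A checksum comparison of this kind needs to be independent of the order in which KV pairs are visited, because the local source and the cluster will not enumerate them identically. The sketch below shows one generic way to achieve that by XOR-ing per-pair checksums. It uses `zlib.crc32` as a stand-in and the hypothetical helper name `kv_checksum`; it is an assumption for illustration, not a description of the exact algorithm TiDB Lightning runs.

```python
import zlib
from typing import Iterable, Tuple


def kv_checksum(pairs: Iterable[Tuple[bytes, bytes]]) -> Tuple[int, int, int]:
    """Order-independent summary: (XOR of per-pair CRCs, pair count, total bytes)."""
    checksum = 0
    count = 0
    total_bytes = 0
    for key, value in pairs:
        # XOR is commutative and associative, so visiting order does not matter.
        checksum ^= zlib.crc32(key + value)
        count += 1
        total_bytes += len(key) + len(value)
    return checksum, count, total_bytes


local = [(b"k1", b"v1"), (b"k2", b"v2")]
remote = [(b"k2", b"v2"), (b"k1", b"v1")]      # same data, different order
corrupted = [(b"k1", b"v1"), (b"k2", b"vX")]   # one value damaged

matches = kv_checksum(local) == kv_checksum(remote)
detects_corruption = kv_checksum(local) != kv_checksum(corrupted)
```

Keeping the pair count and total byte size alongside the XOR value helps catch cases that XOR alone would miss, such as the same pair being written twice.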
The auto-increment ID of a table is computed from the estimated upper bound of the number of rows, which is proportional to the total size of the table's data files. Therefore, the final auto-increment ID is often much larger than the actual number of rows. This is expected, since auto-increment IDs in TiDB are not necessarily allocated sequentially.
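A small worked example shows why the estimate overshoots. The sketch assumes the upper bound is obtained by dividing the total file size by a hypothetical smallest plausible row size (`assumed_min_row_bytes`); the actual proportionality constant used by TiDB Lightning is not stated here.

```python
def estimate_row_upper_bound(total_data_bytes: int, assumed_min_row_bytes: int) -> int:
    """Upper bound on the row count, proportional to the total data file size."""
    # Ceiling division: no row is assumed smaller than assumed_min_row_bytes.
    return -(-total_data_bytes // assumed_min_row_bytes)


actual_rows = 20
bytes_per_row = 50                              # real rows are fairly wide
total_bytes = actual_rows * bytes_per_row       # 1000 bytes of data files

estimate = estimate_row_upper_bound(total_bytes, assumed_min_row_bytes=10)
```

With these numbers the estimate is 100 while the table holds only 20 rows, so the next auto-increment ID would start well above the highest imported row.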
`tidb-lightning` switches the TiKV cluster back to "normal mode", so the cluster resumes normal services.