TiDB Lightning Requirements for the Target Database
Before using TiDB Lightning, you need to check whether the environment meets the requirements. This helps reduce errors during import and ensures import success.
Privileges of the target database
Based on the import mode and features enabled, the target database users should be granted with different privileges. The following table provides a reference.
Feature | Scope | Required privilege | Remarks | |
Mandatory | Basic functions | Target table | CREATE, SELECT, INSERT, UPDATE, DELETE, DROP, ALTER | DROP is required only when tidb-lightning-ctl runs the checkpoint-destroy-all command |
Target database | CREATE | |||
Mandatory | Logical Import Mode | information_schema.columns | SELECT | |
Physical Import Mode | mysql.tidb | SELECT | ||
- | SUPER | |||
- | RESTRICTED_VARIABLES_ADMIN,RESTRICTED_TABLES_ADMIN | Required when the target TiDB enables SEM | ||
Recommended | Conflict detection, max-error | Schema configured for lightning.task-info-schema-name | SELECT, INSERT, UPDATE, DELETE, CREATE, DROP | If not required, the value must be set to "" |
Optional | Parallel import | Schema configured for lightning.meta-schema-name | SELECT, INSERT, UPDATE, DELETE, CREATE, DROP | If not required, the value must be set to "" |
Optional | checkpoint.driver = "mysql" | checkpoint.schema setting | SELECT,INSERT,UPDATE,DELETE,CREATE,DROP | Required when checkpoint information is stored in databases, instead of files |
Storage space of the target database
The target TiKV cluster must have enough disk space to store the imported data. In addition to the standard hardware requirements, the storage space of the target TiKV cluster must be larger than the size of the data source x the number of replicas x 2. For example, if the cluster uses 3 replicas by default, the target TiKV cluster must have a storage space larger than 6 times the size of the data source. The formula has x 2 because:
- Indexes might take extra space.
- RocksDB has a space amplification effect.
It is difficult to calculate the exact data volume exported by Dumpling from MySQL. However, you can estimate the data volume by using the following SQL statement to summarize the DATA_LENGTH
field in the information_schema.tables table:
-- Calculate the size of all schemas
SELECT
TABLE_SCHEMA,
FORMAT_BYTES(SUM(DATA_LENGTH)) AS 'Data Size',
FORMAT_BYTES(SUM(INDEX_LENGTH)) 'Index Size'
FROM
information_schema.tables
GROUP BY
TABLE_SCHEMA;
-- Calculate the 5 largest tables
SELECT
TABLE_NAME,
TABLE_SCHEMA,
FORMAT_BYTES(SUM(data_length)) AS 'Data Size',
FORMAT_BYTES(SUM(index_length)) AS 'Index Size',
FORMAT_BYTES(SUM(data_length+index_length)) AS 'Total Size'
FROM
information_schema.tables
GROUP BY
TABLE_NAME,
TABLE_SCHEMA
ORDER BY
SUM(DATA_LENGTH+INDEX_LENGTH) DESC
LIMIT
5;