- Introduction
- Concepts
- Architecture
- Key Features
- Horizontal Scalability
- MySQL Compatible Syntax
- Replicate from and to MySQL
- Distributed Transactions with Strong Consistency
- Cloud Native Architecture
- Minimize ETL with HTAP
- Fault Tolerance & Recovery with Raft
- Automatic Rebalancing
- Deployment and Orchestration with Ansible, Kubernetes, Docker
- JSON Support
- Spark Integration
- Read Historical Data Without Restoring from Backup
- Fast Import and Restore of Data
- Hybrid of Column and Row Storage
- SQL Plan Management
- Open Source
- Online Schema Changes
- How-to
- Get Started
- Deploy
- Hardware Recommendations
- From Binary Tarball
- Orchestrated Deployment
- Geographic Redundancy
- Data Migration with Ansible
- Configure
- Secure
- Transport Layer Security (TLS)
- Generate Self-signed Certificates
- Monitor
- Migrate
- Maintain
- Common Ansible Operations
- Backup and Restore
- Identify Abnormal Queries
- Scale
- Upgrade
- Troubleshoot
- Reference
- SQL
- MySQL Compatibility
- SQL Language Structure
- Data Types
- Functions and Operators
- Function and Operator Reference
- Type Conversion in Expression Evaluation
- Operators
- Control Flow Functions
- String Functions
- Numeric Functions and Operators
- Date and Time Functions
- Bit Functions and Operators
- Cast Functions and Operators
- Encryption and Compression Functions
- Information Functions
- JSON Functions
- Aggregate (GROUP BY) Functions
- Window Functions
- Miscellaneous Functions
- Precision Math
- List of Expressions for Pushdown
- SQL Statements
ADD COLUMN
ADD INDEX
ADMIN
ADMIN CANCEL DDL
ADMIN CHECKSUM TABLE
ADMIN CHECK [TABLE|INDEX]
ADMIN SHOW DDL [JOBS|QUERIES]
ALTER DATABASE
ALTER TABLE
ALTER USER
ANALYZE TABLE
BEGIN
CHANGE COLUMN
COMMIT
CREATE DATABASE
CREATE INDEX
CREATE ROLE
CREATE TABLE LIKE
CREATE TABLE
CREATE USER
CREATE VIEW
DEALLOCATE
DELETE
DESC
DESCRIBE
DO
DROP COLUMN
DROP DATABASE
DROP INDEX
DROP ROLE
DROP TABLE
DROP USER
DROP VIEW
EXECUTE
EXPLAIN ANALYZE
EXPLAIN
FLUSH PRIVILEGES
FLUSH STATUS
FLUSH TABLES
GRANT <privileges>
GRANT <role>
INSERT
KILL [TIDB]
LOAD DATA
LOAD STATS
MODIFY COLUMN
PREPARE
RECOVER TABLE
RENAME INDEX
RENAME TABLE
REPLACE
REVOKE <privileges>
REVOKE <role>
ROLLBACK
SELECT
SET DEFAULT ROLE
SET [NAMES|CHARACTER SET]
SET PASSWORD
SET ROLE
SET TRANSACTION
SET [GLOBAL|SESSION] <variable>
SHOW ANALYZE STATUS
SHOW CHARACTER SET
SHOW COLLATION
SHOW [FULL] COLUMNS FROM
SHOW CREATE TABLE
SHOW CREATE USER
SHOW DATABASES
SHOW ENGINES
SHOW ERRORS
SHOW [FULL] FIELDS FROM
SHOW GRANTS
SHOW INDEXES [FROM|IN]
SHOW INDEX [FROM|IN]
SHOW KEYS [FROM|IN]
SHOW PRIVILEGES
SHOW [FULL] PROCESSSLIST
SHOW SCHEMAS
SHOW STATUS
SHOW [FULL] TABLES
SHOW TABLE REGIONS
SHOW TABLE STATUS
SHOW [GLOBAL|SESSION] VARIABLES
SHOW WARNINGS
SPLIT REGION
START TRANSACTION
TRACE
TRUNCATE
UPDATE
USE
- Constraints
- Generated Columns
- Partitioning
- Character Set
- SQL Mode
- Views
- Configuration
- Security
- Transactions
- System Databases
- Errors Codes
- Supported Client Drivers
- Garbage Collection (GC)
- Performance
- Overview
- Understanding the Query Execution Plan
- The Blocklist of Optimization Rules and Expression Pushdown
- Introduction to Statistics
- TopN and Limit Push Down
- Optimizer Hints
- Check the TiDB Cluster Status Using SQL Statements
- Execution Plan Binding
- Statement Summary Table
- Tune TiKV
- Operating System Tuning
- Column Pruning
- Key Monitoring Metrics
- Alert Rules
- Best Practices
- TiSpark
- TiKV
- TiDB Binlog
- Tools
- TiDB in Kubernetes
- FAQs
- Support
- Contribute
- Releases
- All Releases
- v3.0
- v2.1
- v2.0
- v1.0
- Glossary
You are viewing the documentation of an older version of the TiDB database (TiDB v3.0).
Mydumper Instructions
What is Mydumper?
Mydumper is a fork project optimized for TiDB. You can use this tool for logical backups of MySQL or TiDB.
It can be downloaded as part of the Enterprise Tools package.
What enhancements does it contain over regular Mydumper?
To ensure backup consistency for TiDB, this optimized Mydumper tool sets the value of tidb_snapshot to specify the point in time when the data is backed up instead of using
FLUSH TABLES WITH READ LOCK
.This tool uses the hidden
_tidb_rowid
column of TiDB to optimize the performance of concurrently exporting data from a single table.
Usage
New parameter description
-z
or --tidb-snapshot
: sets the tidb_snapshot
to be used for the backup. The default value is the current TSO (the Position
field output from SHOW MASTER STATUS
). Set this parameter to the TSO or a valid datetime
such as -z "2016-10-08 16:45:26"
.
Required privileges
- SELECT
- RELOAD
- LOCK TABLES
- REPLICATION CLIENT
Usage example
Execute the following command to back up data from TiDB. You can add command line parameters to the command as needed:
./bin/mydumper -h 127.0.0.1 -u root -P 4000
Dump table data concurrently
This section introduces the working principle and parameters of Mydumper. This section also gives an example of Mydumper command, and explains the performance evaluation and the TiDB versions that support the _tidb_rowid
index.
Working principle
Mydumper first calculates min(_tidb_rowid)
and max(_tidb_rowid)
, and segments the table into chunks according to the value specified by -r
. Then, Mydumper assigns these chunks to different threads and dumps these chunks concurrently.
Parameters
-t
or--threads
: specifies the number of concurrent threads (4
by default).-r
or--rows
: specifies the maximum number of rows in a chunk. If this parameter is specified, Mydumper ignores the value of--chunk-filesize
.
Example
The following is a complete Mydumper command:
./bin/mydumper -h 127.0.0.1 -u root -P 4000 -r 10000 -t 4
Performance evaluation
Do a performance evaluation before you perform the dump operation. Because the concurrent scanning brings pressure on the TiDB and TiKV clusters, you need to evaluate and test the impact that the dump operation might have on the database clusters and applications.
TiDB versions that support the _tidb_rowid
index
Because concurrent table data dump uses the implicit _tidb_rowid
row of TiDB, TiDB versions that support the _tidb_rowid
index can fully take advantage of the concurrent dump.
The following TiDB versions supports the _tidb_rowid
index:
- v2.1.3 and later v2.1 versions
- v3.0 (by default)
FAQ
How to resolve the error that occurs when the --tidb-snapshot
option is used to export data?
In this situation, you need to add a --skip-tz-utc
option. Otherwise, Mydumper will pre-configure the UTC time zone and convert the time zone when tidb-snapshot
is configured, which causes this error.
How to determine if the Mydumper I am using is the PingCAP optimized version?
Execute the following command:
./bin/mydumper -V
If the output contains git_hash
(d3e6fec8b069daee772d0dbaa47579f67a5947e7
in the following example), you are using the PingCAP optimized version of Mydumper:
mydumper 0.9.5 (d3e6fec8b069daee772d0dbaa47579f67a5947e7), built against MySQL 5.7.24
How to resolve the "invalid mydumper files for there are no -schema-create.sql
files found" error when using Loader to restore the data backed up by Mydumper?
Check whether the -T
or --tables-list
option is used when using Mydumper to back up data. If these options are used, Mydumper does not generate a file that includes a CREATE DATABASE
SQL statement.
Solution: Create the {schema-name}-schema-create.sql
file in the directory for data backup of Mydumper. Write "CREATE DATABASE {schema-name}
" to the file, and then run Loader.
Why is the TIMESTAMP type of data exported using Mydumper inconsistent with that in the database?
Check whether the time zone of the server that is running Mydumper is consistent with that of the database. Mydumper converts the TIMESTAMP type of data according to the time zone of its server. You can add the --skip-tz-utc
option to disable the conversion of dates and times.
How to configure the -F,--chunk-filesize
option of Mydumper?
Mydumper splits the data of each table into multiple chunks according to the value of this option during backup. Each chunk is saved in a file with a size of about chunk-filesize
. In this way, data is split into multiple files and you can use the parallel processing of Loader/TiDB lightning to improve the import speed. If you later use Loader to restore the backup files, it is recommended to set the value of this option to 64
(in MB); If you use TiDB Lightning to restore files, 256
(in MB) is recommended.
How to configure the -s --statement-size
option of Mydumper?
Mydumper uses this option to control the size of Insert Statement
which defaults to 10000000
(about 1 MB). Use this option to avoid the following errors when restoring data:
packet for query is too large. Try adjusting the 'max_allowed_packet' variable
The default value meets the requirements in most cases, but if it is a wide table, the size of a single row of data might exceed the limit of statement-size
, and Mydumper reports the following warning:
Row bigger than statement_size for xxx
If you restore the data in this situation, Mydumper still reports the packet for query is too large
error. To solve this problem, modify the following two configurations (take 128 MB
as an example):
- Execute
set @@global.max_allowed_packet=134217728
(134217728
=128 MB
) in TiDB server. - Add the
max-allowed-packet=128M
line to the DB configuration of Loader or DM task's configuration file according to your situation. Then, restart the process or task.
How to set the -l, --long-query-guard
option of Mydumper?
Set the value of this option to the estimated time required for a backup. If Mydumper runs longer than this value, it reports an error and exits. It is recommended to set the value to 7200
(in seconds) for the first time of your backup and then modify it according to your actual backup time.
How to set the --tidb-force-priority
option of Mydumper?
This option can only be set when backing up TiDB’s data. It can be set to LOW_PRIORITY
, DELAYED
, or HIGH_PRIORITY
. If you do not want data backup to affect online services, it is recommended to set this option to LOW_PRIORITY
; if the backup has a higher priority, HIGH_PRIORITY
is recommended.
How to resolve the "GC life time is short than transaction duration" error when using Mydumper to back up TiDB's data?
Mydumper uses the tidb_snapshot
system variable to ensure data consistency when backing up TiDB's data. This error is reported if the historical data of a snapshot is cleared by TiDB's Garbage Collection (GC) during backup. To solve this problem, perform the following steps:
Before using Mydumper to back up data, use MySQL client to check the value of
tikv_gc_life_time
in the TiDB cluster and set it to an appropriate value:SELECT * FROM mysql.tidb WHERE VARIABLE_NAME = 'tikv_gc_life_time';
+-----------------------+------------------------------------------------------------------------------------------------+ | VARIABLE_NAME | VARIABLE_VALUE | +-----------------------+------------------------------------------------------------------------------------------------+ | tikv_gc_life_time | 10m0s | +-----------------------+------------------------------------------------------------------------------------------------+ 1 rows in set (0.02 sec)
update mysql.tidb set VARIABLE_VALUE = '720h' where VARIABLE_NAME = 'tikv_gc_life_time';
Set the value of
tikv_gc_life_time
to the initial one after the backup is complete:update mysql.tidb set VARIABLE_VALUE = '10m0s' where VARIABLE_NAME = 'tikv_gc_life_time';
Do I need to configure the --tidb-rowid
option of Mydumper?
If this option is set to true, the exported data contains the data of TiDB's hidden columns. Using hidden columns when restoring data to TiDB might cause data inconsistency. Currently, it is not recommended to use this option.
How to resolve the "Segmentation Fault" error?
This bug has been fixed. If the error persists, you can upgrade to the latest version of Mydumper.
How to resolve the "Error dumping table ({schema}.{table}) data: line ...... (total length ...)" error?
This error occurs when Mydumper parses SQL statements. In this situation, use the latest version of Mydumper. If this error persists, you can file an issue to mydumper/issues.
How to resolve the "Failed to set tidb_snapshot: parsing time \"20190901-10:15:00 +0800\" as \"20190901-10:15:00 +0700 MST\": cannot parse \"\" as \"MST\"" error?
Check whether the version of TiDB is lower than 2.1.11. If so, upgrade to TiDB 2.1.11 or later versions.
Do you plan to make these changes available to upstream Mydumper?
Yes, we intend to make our changes available to upstream Mydumper. See PR #155.
- What is Mydumper?
- Usage
- Dump table data concurrently
- FAQ
- How to resolve the error that occurs when the --tidb-snapshot option is used to export data?
- How to determine if the Mydumper I am using is the PingCAP optimized version?
- How to resolve the "invalid mydumper files for there are no -schema-create.sql files found" error when using Loader to restore the data backed up by Mydumper?
- Why is the TIMESTAMP type of data exported using Mydumper inconsistent with that in the database?
- How to configure the -F,--chunk-filesize option of Mydumper?
- How to configure the -s --statement-size option of Mydumper?
- How to set the -l, --long-query-guard option of Mydumper?
- How to set the --tidb-force-priority option of Mydumper?
- How to resolve the "GC life time is short than transaction duration" error when using Mydumper to back up TiDB's data?
- Do I need to configure the --tidb-rowid option of Mydumper?
- How to resolve the "Segmentation Fault" error?
- How to resolve the "Error dumping table ({schema}.{table}) data: line ...... (total length ...)" error?
- How to resolve the "Failed to set tidb_snapshot: parsing time \"20190901-10:15:00 +0800\" as \"20190901-10:15:00 +0700 MST\": cannot parse \"\" as \"MST\"" error?
- Do you plan to make these changes available to upstream Mydumper?