You are viewing the documentation of an older version of the TiDB database (TiDB v3.0).
Export or Backup Data Using Dumpling
This document introduces how to use the Dumpling tool to export or back up data in TiDB. Dumpling exports data stored in TiDB as SQL or CSV data files and can be used to make a logical full backup or export.
For detailed usage of Dumpling, use the --help option or refer to the Dumpling User Guide.
When using Dumpling, you need to execute the export command on a running cluster. This document assumes that there is a TiDB instance on the 127.0.0.1:4000 host and that this TiDB instance has a root user without a password.
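Before running the export, you can optionally confirm that this instance is reachable. The following is a quick sanity check, not something Dumpling requires; it assumes the MySQL client is installed locally:
# Optional connectivity check (assumes the mysql client is available)
mysql -h 127.0.0.1 -P 4000 -u root -e "SELECT tidb_version();"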
Download Dumpling
To download the latest version of Dumpling, click the download link.
Export data from TiDB
Export to SQL files
Dumpling exports data to SQL files by default. You can also export data to SQL files explicitly by adding the --filetype sql flag:
dumpling \
-u root \
-P 4000 \
-h 127.0.0.1 \
--filetype sql \
--threads 32 \
-o /tmp/test \
-F 256
In the above command, -h, -P, and -u specify the host address, the port, and the user, respectively. If password authentication is required, you can pass the password to Dumpling with -p $YOUR_SECRET_PASSWORD.
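For example, a sketch of the same export with a password supplied; the password value is only a placeholder, and quoting it keeps shell special characters intact:
# Same export as above, with a placeholder password passed via -p
dumpling \
-u root \
-p "$YOUR_SECRET_PASSWORD" \
-P 4000 \
-h 127.0.0.1 \
--filetype sql \
--threads 32 \
-o /tmp/test \
-F 256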
Export to CSV files
If Dumpling exports data to CSV files (use --filetype csv to export to CSV files), you can also use --sql <SQL> to export the records selected by the specified SQL statement. For example, you can export all records that match id < 100 in test.sbtest1 using the following command:
./dumpling \
-u root \
-P 4000 \
-h 127.0.0.1 \
-o /tmp/test \
--filetype csv \
--sql 'select * from `test`.`sbtest1` where id < 100'
Note: Currently, the --sql option can be used only for exporting to CSV files. The select * from <table-name> where id < 100 statement is executed on all tables to be exported, so if some tables do not have the specified fields, the export fails.
Filter the exported data
Use the --where option to filter data
By default, Dumpling exports the tables of the entire database, except the tables in the system databases. You can use --where <SQL where expression> to select the records to be exported.
./dumpling \
-u root \
-P 4000 \
-h 127.0.0.1 \
-o /tmp/test \
--where "id < 100"
The above command exports the data that matches id < 100 from each table.
Use the --filter option to filter data
Dumpling can filter specific databases or tables by specifying the table filter with the --filter option. The syntax of table filters is similar to that of .gitignore. For details, see Table Filter.
./dumpling \
-u root \
-P 4000 \
-h 127.0.0.1 \
-o /tmp/test \
--filter "employees.*" \
--filter "*.WorkOrder"
The above command exports all the tables in the employees database and the WorkOrder tables in all databases.
Use the -B or -T option to filter data
Dumpling can also export specific databases with the -B option or specific tables with the -T option.
Note:
- The --filter option and the -T option cannot be used at the same time.
- The -T option only accepts a complete form of input like database-name.table-name; input with only the table name is not accepted. For example, Dumpling cannot recognize -T WorkOrder.
Examples:
- -B employees exports the employees database.
- -T employees.WorkOrder exports the employees.WorkOrder table.
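For reference, a complete -T invocation following the earlier examples might look like the following; the host, port, and output directory are the same illustrative values used above:
# Export only the employees.WorkOrder table (illustrative connection values)
./dumpling \
-u root \
-P 4000 \
-h 127.0.0.1 \
-o /tmp/test \
-T employees.WorkOrder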
Improve export efficiency through concurrency
The exported files are stored in the ./export-<current local time> directory by default. Commonly used parameters are as follows:
- -o is used to specify the directory where the exported files are stored.
- -F is used to specify the maximum size of a single file (the unit is MiB; inputs like 5GiB or 8KB are also acceptable).
- -r is used to specify the maximum number of records (that is, the number of rows) in a single file. When it is enabled, Dumpling enables concurrency within a table to improve the speed of exporting large tables.
You can use the above parameters to provide Dumpling with a higher degree of concurrency.
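For example, the following sketch combines these parameters; the values are illustrative and should be tuned to your data size and hardware. Files are split at roughly 200,000 rows or 256 MiB, whichever is reached first, and 32 export threads are used:
# Illustrative values: tune -r, -F, and --threads for your workload
./dumpling \
-u root \
-P 4000 \
-h 127.0.0.1 \
-o /tmp/test \
--threads 32 \
-r 200000 \
-F 256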
Adjust Dumpling's data consistency options
In most scenarios, you do not need to adjust the default data consistency options of Dumpling.
Dumpling uses the --consistency <consistency level> option to control the way in which data is exported for "consistency assurance". For TiDB, data consistency is guaranteed by getting a snapshot of a certain timestamp by default (i.e. --consistency snapshot). When using snapshot for consistency, you can use the --snapshot parameter to specify the timestamp to be backed up. You can also use the following levels of consistency:
- flush: Use FLUSH TABLES WITH READ LOCK to ensure consistency.
- snapshot: Get a consistent snapshot of the specified timestamp and export it.
- lock: Add read locks on all tables to be exported.
- none: No guarantee for consistency.
- auto: Use flush for MySQL and snapshot for TiDB.
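For example, the following sketch exports the data as of a specific point in time; the timestamp value is only illustrative, and depending on the Dumpling version, --snapshot may also accept a TSO obtained from the cluster:
# Export a consistent snapshot at an illustrative timestamp
./dumpling \
-u root \
-P 4000 \
-h 127.0.0.1 \
-o /tmp/test \
--consistency snapshot \
--snapshot "2020-07-02 17:12:45"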
After everything is done, you can see the exported files in /tmp/test:
ls -lh /tmp/test | awk '{print $5 "\t" $9}'
140B metadata
66B test-schema-create.sql
300B test.sbtest1-schema.sql
190K test.sbtest1.0.sql
300B test.sbtest2-schema.sql
190K test.sbtest2.0.sql
300B test.sbtest3-schema.sql
190K test.sbtest3.0.sql
In addition, if the data volume is very large, you can extend the GC time in advance to avoid export failure caused by garbage collection (GC) during the export process:
update mysql.tidb set VARIABLE_VALUE = '720h' where VARIABLE_NAME = 'tikv_gc_life_time';
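If you want to confirm the current value before changing it, you can query the same table first; the following sketch uses the MySQL client, as in the earlier connectivity check:
# Check the current GC lifetime (assumes the mysql client is available)
mysql -h 127.0.0.1 -P 4000 -u root -e "select VARIABLE_VALUE from mysql.tidb where VARIABLE_NAME = 'tikv_gc_life_time';"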
After your operation is completed, set the GC time back (the default value is 10m):
update mysql.tidb set VARIABLE_VALUE = '10m' where VARIABLE_NAME = 'tikv_gc_life_time';
Finally, all the exported data can be imported back into TiDB using TiDB Lightning.
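As a rough sketch only (the configuration file name below is an assumption; refer to the TiDB Lightning documentation for the options that match your deployment), the import is typically driven by a configuration file whose data-source-dir points at the Dumpling output directory (/tmp/test in the examples above), and then started as follows:
# Illustrative: tidb-lightning.toml is assumed to point data-source-dir at /tmp/test
./tidb-lightning -config tidb-lightning.toml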