TiDB Data Migration (DM) is an integrated data migration task management platform that supports full data migration and incremental data replication from MySQL-compatible databases (such as MySQL, MariaDB, and Aurora MySQL) into TiDB. It helps reduce the operation cost of data migration and simplifies troubleshooting. When using DM for data migration, you need to perform the following operations:
- Deploy a DM Cluster
- Create an upstream data source and save its access information
- Create data migration tasks to migrate data from data sources to TiDB
The data migration task includes two stages: full data migration and incremental data replication:
- Full data migration: Migrate the table structure of the corresponding table from the data source to TiDB, and then read the data stored in the data source and write it to the TiDB cluster.
- Incremental data replication: After the full data migration is completed, the corresponding table changes from the data source are read and then written to the TiDB cluster.
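The two stages above can be pictured with a toy model. This is only a minimal sketch under stated assumptions: tables are represented as Python dictionaries keyed by primary key, and the names `full_migration` and `apply_incremental` are hypothetical, not DM APIs.

```python
from typing import Dict, Iterable, Tuple

Row = Dict[str, object]
# Hypothetical change event: (operation, primary key, new row image).
Event = Tuple[str, int, Row]


def full_migration(source: Dict[int, Row]) -> Dict[int, Row]:
    """Stage 1: copy a snapshot of every existing row to the target."""
    return {pk: dict(row) for pk, row in source.items()}


def apply_incremental(target: Dict[int, Row], events: Iterable[Event]) -> Dict[int, Row]:
    """Stage 2: replay changes made on the source after the snapshot."""
    for op, pk, row in events:
        if op in ("insert", "update"):
            target[pk] = dict(row)
        elif op == "delete":
            target.pop(pk, None)
        else:
            raise ValueError(f"unknown operation: {op}")
    return target
```

The point of the split is ordering: the snapshot is loaded first, and only then are later changes replayed on top of it.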
The stable versions of DM include v1.0, v2.0, and v5.3. It is recommended to use DM v5.3 (the latest stable version of DM) and not recommended to use v1.0 (the earliest stable version of DM).
For v5.3 and earlier releases, the DM documentation is independent of the TiDB documentation.
- Since October 2021, DM's GitHub repository has been moved to pingcap/tiflow. If you encounter any issues with DM, submit them to the pingcap/tiflow repository.
- In earlier versions (v1.0 and v2.0), DM uses version numbers that are independent of TiDB. Since v5.3, DM uses the same version number as TiDB. The version that follows DM v2.0 is DM v5.3. There are no compatibility changes from DM v2.0 to v5.3, and the upgrade process is the same as a normal upgrade; only the version number increases.
This section describes the basic data migration features provided by DM.
The block and allow lists filtering rule is similar to the replication-rules-table feature of MySQL. You can use it to filter out or replicate all operations on specific databases or specific tables.
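As an illustration of how block and allow lists interact, here is a minimal sketch. It assumes shell-style wildcards and an evaluation order in which a non-empty allow list takes precedence over the block list; DM's actual matching syntax and order may differ, and `should_migrate` is a hypothetical helper, not a DM API.

```python
from fnmatch import fnmatchcase
from typing import Sequence, Tuple

Rule = Tuple[str, str]  # (schema pattern, table pattern)


def _matches(rules: Sequence[Rule], schema: str, table: str) -> bool:
    """Return True if any rule matches both the schema and the table name."""
    return any(fnmatchcase(schema, s) and fnmatchcase(table, t) for s, t in rules)


def should_migrate(schema: str, table: str,
                   allow: Sequence[Rule] = (),
                   block: Sequence[Rule] = ()) -> bool:
    """Decide whether operations on schema.table are replicated.

    Assumed order: if an allow list exists, only matching tables pass;
    otherwise everything passes except tables matching the block list.
    """
    if allow:
        return _matches(allow, schema, table)
    return not _matches(block, schema, table)
```

For example, `should_migrate("shop", "tmp_1", block=[("*", "tmp_*")])` rejects the table, while the same call with an empty block list accepts it.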
The binlog event filtering feature means that DM can filter certain types of SQL statements from certain tables in the source database. For example, you can filter all INSERT statements in the table sbtest or filter all TRUNCATE TABLE statements in a given schema.
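The idea of filtering by event type can be sketched as follows. This is a simplified model rather than DM's implementation: the `EventFilterRule` class, the `is_ignored` function, and the schema name `demo_db` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class EventFilterRule:
    schema_pattern: str
    table_pattern: str
    events: FrozenSet[str]  # lower-case event names to drop


def is_ignored(rules: Iterable[EventFilterRule],
               schema: str, table: str, event: str) -> bool:
    """Return True if any rule says this event on schema.table is dropped."""
    event = event.lower()
    return any(
        fnmatchcase(schema, r.schema_pattern)
        and fnmatchcase(table, r.table_pattern)
        and event in r.events
        for r in rules
    )


EVENT_RULES = [
    # Drop all INSERT statements on any table named sbtest.
    EventFilterRule("*", "sbtest", frozenset({"insert"})),
    # Drop all TRUNCATE TABLE statements in the hypothetical schema demo_db.
    EventFilterRule("demo_db", "*", frozenset({"truncate table"})),
]
```

With these rules, an `UPDATE` on `sbtest` still passes through, because only the listed event types are dropped.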
The schema and table routing feature means that DM can migrate a certain table of the source database to a specified table in the downstream. For example, you can migrate the table structure and data from the table sbtest1 in the source database to the table sbtest2 in TiDB. This is also a core feature for merging and migrating sharded databases and tables.
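Routing can be thought of as a lookup from a source table name to a downstream table name. The sketch below assumes shell-style wildcards and first-match-wins semantics; `RouteRule`, `route`, and the schema names `demo_db`, `shard_*`, and `merged` are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import NamedTuple, Sequence, Tuple


class RouteRule(NamedTuple):
    schema_pattern: str
    table_pattern: str
    target_schema: str
    target_table: str


def route(rules: Sequence[RouteRule], schema: str, table: str) -> Tuple[str, str]:
    """Map a source table to its downstream table; unmatched tables keep their name."""
    for r in rules:
        if fnmatchcase(schema, r.schema_pattern) and fnmatchcase(table, r.table_pattern):
            return r.target_schema, r.target_table
    return schema, table


ROUTE_RULES = [
    # Rename a single table: sbtest1 upstream becomes sbtest2 downstream.
    RouteRule("demo_db", "sbtest1", "demo_db", "sbtest2"),
    # Merge sharded tables: every shard_*.t_* lands in merged.t.
    RouteRule("shard_*", "t_*", "merged", "t"),
]
```

The second rule shows why routing underlies shard merging: many source tables resolve to the same downstream table.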
DM supports merging and migrating the original sharded instances and tables from the source databases into TiDB, but with some restrictions. For details, see Sharding DDL usage restrictions in the pessimistic mode and Sharding DDL usage restrictions in the optimistic mode.
In the MySQL ecosystem, tools such as gh-ost and pt-osc are widely used. DM provides support for these tools to avoid migrating unnecessary intermediate data. For details, see Online DDL Tools.
In the incremental replication phase, DM supports configuring SQL expressions to filter out certain row changes, which lets you control replication at a finer granularity. For more information, refer to Filter Certain Row Changes Using SQL Expressions.
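To make row-level filtering concrete, the following sketch evaluates a SQL expression against each row change and drops the rows for which it is true. It is only an illustration: it uses SQLite from the Python standard library as a stand-in expression evaluator, whose SQL dialect differs from TiDB's, and the function names are hypothetical.

```python
import sqlite3
from contextlib import closing
from typing import Dict, List


def row_matches(expr: str, row: Dict[str, object]) -> bool:
    """Evaluate a SQL boolean expression against one row image.

    The expression is embedded in the query text, so it must come from
    trusted configuration, never from untrusted input.
    """
    columns = ", ".join(f'? AS "{name}"' for name in row)
    query = f"SELECT ({expr}) FROM (SELECT {columns})"
    with closing(sqlite3.connect(":memory:")) as conn:
        (result,) = conn.execute(query, tuple(row.values())).fetchone()
    return bool(result)


def replicate(rows: List[Dict[str, object]], filter_expr: str) -> List[Dict[str, object]]:
    """Keep only the row changes that the filter expression does not match."""
    return [row for row in rows if not row_matches(filter_expr, row)]
```

For instance, `replicate(rows, "c % 2 = 0")` keeps only the rows whose column `c` is odd.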
Before using the DM tool, note the following restrictions:
Database version requirements
MySQL version > 5.5
MariaDB version >= 10.1.2
If the upstream MySQL/MariaDB servers are in a primary-secondary replication structure, use the following versions:
- MySQL version > 5.7.1
- MariaDB version >= 10.1.3
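The version requirements above can be checked with a small pre-flight helper. This is a hedged sketch: the function names are hypothetical, and it assumes that a version such as "5.5" is padded to 5.5.0 and that the comparisons are made component by component, which is one reading of the bounds listed above.

```python
import re
from typing import Tuple

Version = Tuple[int, int, int]


def parse_version(text: str) -> Version:
    """Extract major.minor[.patch] from the start of a version string."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if m is None:
        raise ValueError(f"unrecognized version: {text!r}")
    major, minor, patch = m.groups()
    return int(major), int(minor), int(patch or 0)


def is_supported(flavor: str, version: str, primary_secondary: bool = False) -> bool:
    """Check an upstream version against the bounds listed above."""
    v = parse_version(version)
    if flavor == "mysql":
        # Exclusive lower bound: MySQL version > 5.5 (or > 5.7.1).
        return v > ((5, 7, 1) if primary_secondary else (5, 5, 0))
    if flavor == "mariadb":
        # Inclusive lower bound: MariaDB version >= 10.1.2 (or >= 10.1.3).
        return v >= ((10, 1, 3) if primary_secondary else (10, 1, 2))
    raise ValueError(f"unknown flavor: {flavor!r}")
```

A call such as `is_supported("mariadb", "10.1.2", primary_secondary=True)` returns `False`, because the stricter bound applies in that topology.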
Migrating data from MySQL 8.0 to TiDB using DM is an experimental feature (introduced in DM v2.0). It is NOT recommended that you use it in a production environment.
DDL syntax compatibility
Currently, TiDB is not compatible with all the DDL statements that MySQL supports. Because DM uses the TiDB parser to process DDL statements, it only supports the DDL syntax supported by the TiDB parser. For details, see MySQL Compatibility.
DM reports an error when it encounters an incompatible DDL statement. To resolve this error, you need to handle it manually using dmctl, either by skipping the DDL statement or by replacing it with one or more specified DDL statements. For details, see Skip or replace abnormal SQL statements.
Sharding merge with conflicts
If conflicts exist between sharded tables, resolve them by referring to handling conflicts of auto-increment primary key. Otherwise, data migration is not supported, because conflicting rows can overwrite each other and cause data loss.
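The data-loss risk is easy to reproduce in a toy model. In the sketch below, two shards each generate their own auto-increment `id`, so merging by `id` alone silently overwrites a row. Keying by a (shard, id) pair is shown only as one way to make keys unique; it is an assumption for illustration, not necessarily the approach described in the referenced guide, and all names are hypothetical.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]

SHARDS: Dict[str, List[Row]] = {
    "shard_1": [{"id": 1, "user": "alice"}, {"id": 2, "user": "bob"}],
    "shard_2": [{"id": 1, "user": "carol"}],  # id 1 collides with shard_1
}


def merge_by_id(shards: Dict[str, List[Row]]) -> Dict[int, Row]:
    """Merge on the auto-increment id alone: colliding rows overwrite each other."""
    merged: Dict[int, Row] = {}
    for rows in shards.values():
        for row in rows:
            merged[row["id"]] = row
    return merged


def merge_by_composite_key(shards: Dict[str, List[Row]]) -> Dict[Tuple[str, int], Row]:
    """Merge on (shard name, id): every source row keeps a unique key."""
    merged: Dict[Tuple[str, int], Row] = {}
    for name, rows in shards.items():
        for row in rows:
            merged[(name, row["id"])] = row
    return merged
```

Here `merge_by_id(SHARDS)` ends up with two rows instead of three, while `merge_by_composite_key(SHARDS)` preserves all three.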
For other sharding DDL migration restrictions, see Sharding DDL usage restrictions in the pessimistic mode and Sharding DDL usage restrictions in the optimistic mode.
Switch of MySQL instances for data sources
When DM-worker connects to the upstream MySQL instance via a virtual IP (VIP), if you switch the VIP to another MySQL instance, DM might be connected to both the new and the old MySQL instances at the same time over different connections. In this situation, the binlog replicated to DM is not consistent with the other upstream status that DM receives, which can cause unpredictable anomalies and even data corruption. To make the necessary changes to DM manually, see Switch DM-worker connection via virtual IP.