Migrate Data from Vitess to TiDB
This document describes the tools that you can use to migrate data from Vitess to TiDB.
Because the backend of Vitess is based on MySQL, when migrating data from Vitess to TiDB, you can use the same migration tools that apply to MySQL, such as Dumpling, TiDB Lightning, and TiDB Data Migration (DM). Note that these tools should be set up for each shard in Vitess for data migration.
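The per-shard setup can be scripted. The following is a minimal sketch, not an official procedure, that builds one Dumpling command line per shard. It assumes each shard's MySQL endpoint is directly reachable. The shard names, hosts, user, database name, and output paths are hypothetical placeholders, and the password option is omitted.

```python
import shlex


def dumpling_commands(shards, database, output_root):
    """Build one Dumpling command line per Vitess shard.

    `shards` maps a shard name to the (host, port, user) of that shard's
    MySQL endpoint. All values used below are hypothetical placeholders.
    """
    commands = []
    for name, (host, port, user) in sorted(shards.items()):
        args = [
            "tiup", "dumpling",
            "-h", host,                       # MySQL host of this shard
            "-P", str(port),                  # MySQL port of this shard
            "-u", user,                       # export user
            "-B", database,                   # database to export
            "--filetype", "sql",              # export as SQL files
            "-o", f"{output_root}/{name}",    # one output directory per shard
        ]
        commands.append(shlex.join(args))
    return commands


if __name__ == "__main__":
    # Hypothetical two-shard keyspace.
    example_shards = {
        "-80": ("10.0.0.11", 3306, "dump_user"),
        "80-": ("10.0.0.12", 3306, "dump_user"),
    }
    for cmd in dumpling_commands(example_shards, "customer", "/data/export"):
        print(cmd)
```

Keeping a separate output directory per shard makes it easier to import and verify each shard independently.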
Generally, before data migration, it is recommended to configure the DM task to set task-mode to all and import-mode to physical. For more information, see Task configuration file template (advanced).
If your data size exceeds 10 TiB, it is recommended to do the import in two steps:
- Use Dumpling and TiDB Lightning to import existing data.
- Use DM to import incremental data.
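The size-based recommendation above can be expressed as a small decision helper. This is only an illustrative sketch: the function name and the plan descriptions are made up for this example, and the 10 TiB threshold is the one stated above.

```python
TIB = 1024 ** 4
TWO_STEP_THRESHOLD = 10 * TIB  # threshold from the recommendation above


def migration_plan(data_size_bytes):
    """Return the recommended migration steps for a given data size."""
    if data_size_bytes > TWO_STEP_THRESHOLD:
        # Large data sets: split full import and incremental replication.
        return [
            "Dumpling + TiDB Lightning: import existing data",
            "DM: import incremental data",
        ]
    # Otherwise a single DM task handles both phases.
    return ["DM with task-mode all: import existing and incremental data"]


if __name__ == "__main__":
    for size in (2 * TIB, 15 * TIB):
        print(size // TIB, "TiB ->", migration_plan(size))
```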
In addition to these tools, you can also use the Debezium connector for Vitess. This connector enables you to use Kafka Connect or Apache Flink to stream changes from Vitess to TiDB.
Because both Vitess and TiDB support the MySQL protocol and SQL dialect, changes at the application level are expected to be small. For tasks directly managing sharding or other implementation-specific aspects, however, the changes might be larger. To facilitate the data migration from Vitess to TiDB, TiDB introduces the VITESS_HASH() function, which returns the hash of a string that is compatible with Vitess' HASH function.
Examples
Dumpling and TiDB Lightning
The following two examples show how Dumpling and TiDB Lightning work together to migrate data from Vitess to TiDB.
In the first example, TiDB Lightning uses the logical import mode, which first encodes data into SQL statements and then runs the SQL statements to import data.
In the second example, TiDB Lightning uses the physical import mode to directly ingest data into TiKV.
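As a rough illustration of how the two import modes differ in configuration, the sketch below renders a minimal TiDB Lightning configuration file for either mode. It assumes that the logical import mode corresponds to backend = "tidb" and the physical import mode to backend = "local" in the [tikv-importer] section. The helper name, hosts, and paths are hypothetical, and password settings are omitted.

```python
def lightning_config(mode, data_dir, tidb_host, pd_addr=None, sorted_kv_dir=None):
    """Render a minimal TiDB Lightning TOML config for the given import mode."""
    if mode == "logical":
        backend = "tidb"    # data is imported by running SQL statements
    elif mode == "physical":
        backend = "local"   # data is ingested into TiKV directly
        if not (pd_addr and sorted_kv_dir):
            raise ValueError("physical mode needs pd_addr and sorted_kv_dir")
    else:
        raise ValueError(f"unknown import mode: {mode}")

    lines = ["[tikv-importer]", f'backend = "{backend}"']
    if backend == "local":
        # Local directory for temporarily sorted key-value pairs.
        lines.append(f'sorted-kv-dir = "{sorted_kv_dir}"')
    lines += [
        "",
        "[mydumper]",
        f'data-source-dir = "{data_dir}"',  # directory written by Dumpling
        "",
        "[tidb]",
        f'host = "{tidb_host}"',
        "port = 4000",
        'user = "root"',
    ]
    if backend == "local":
        lines += ["status-port = 10080", f'pd-addr = "{pd_addr}"']
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(lightning_config("logical", "/data/export/-80", "10.0.0.21"))
    print(lightning_config("physical", "/data/export/-80", "10.0.0.21",
                           pd_addr="10.0.0.31:2379",
                           sorted_kv_dir="/mnt/ssd/sorted-kv"))
```

The rendered text can be saved to a file such as tidb-lightning.toml and passed to TiDB Lightning with its -config option, once per shard export directory.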
DM
The following example shows how DM migrates data from Vitess to TiDB.
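A DM task that follows the recommendation to use task-mode all and import-mode physical could be generated along these lines. This is a hedged sketch rather than a complete task file: it assumes each shard's MySQL instance has already been registered as a DM source, and the task name, source IDs, and target address are hypothetical. Filtering rules and the target password are left out.

```python
def dm_task_yaml(task_name, source_ids, target_host, target_port=4000):
    """Render a minimal DM task config covering one source per Vitess shard."""
    lines = [
        f'name: "{task_name}"',
        'task-mode: "all"',             # full migration plus incremental replication
        "",
        "target-database:",
        f'  host: "{target_host}"',
        f"  port: {target_port}",
        '  user: "root"',
        "",
        "loaders:",
        "  global:",
        '    import-mode: "physical"',  # physical import for the full-data phase
        "",
        "mysql-instances:",
    ]
    for source_id in source_ids:
        # One entry per registered shard source, all sharing the loader config.
        lines += [
            f'  - source-id: "{source_id}"',
            '    loader-config-name: "global"',
        ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical source IDs, one per shard.
    print(dm_task_yaml("vitess-to-tidb", ["shard-80", "shard80-"], "10.0.0.21"))
```

Listing every shard source in a single task keeps the migration of one keyspace under one task name, which simplifies status checks.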