Continuous Replication from Databases that Use gh-ost or pt-osc
In production scenarios, table locking during DDL execution can block reads from and writes to the database. Online DDL tools are therefore often used to execute DDL statements with minimal impact on reads and writes. Two common online DDL tools are gh-ost and pt-osc.
When using DM to migrate data from MySQL to TiDB, you can enable online-ddl to allow DM to work together with gh-ost or pt-osc.
For detailed replication instructions, refer to the document that matches your scenario:
- Migrate Small Datasets from MySQL to TiDB
- Migrate Large Datasets from MySQL to TiDB
- Migrate and Merge MySQL Shards of Small Datasets to TiDB
- Migrate and Merge MySQL Shards of Large Datasets to TiDB
Enable online-ddl on DM
In the task configuration file of DM, set the global parameter online-ddl to true, as shown below:
# ----------- Global configuration -----------
## ********* Basic configuration *********
name: test # The name of the task. Should be globally unique.
task-mode: all # The task mode. Can be set to `full`, `incremental`, or `all`.
shard-mode: "pessimistic" # The shard merge mode. Optional modes are `pessimistic` and `optimistic`. The `pessimistic` mode is used by default. After understanding the principles and restrictions of the "optimistic" mode, you can set it to the "optimistic" mode.
meta-schema: "dm_meta" # The downstream database that stores the `meta` information.
online-ddl: true # Enable online-ddl support on DM to support automatic processing of "gh-ost" and "pt-osc" for the upstream database.
Workflow after enabling online-ddl
After online-ddl is enabled on DM, DM no longer replicates the statements generated by gh-ost or pt-osc one by one. Instead, it processes them as described below.
The workflow of gh-ost or pt-osc:
Create a ghost table with the same schema as the real table (the table that the DDL targets).
Apply the DDLs to the ghost table.
Replicate the data of the real table to the ghost table.
After the data is consistent between the two tables, use the rename statement to replace the real table with the ghost table.
The workflow of DM:
Skip creating the ghost table downstream.
Record DDLs applied to the ghost table.
Replicate data only from the real table.
Apply the recorded DDLs to the real table downstream.
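The DM workflow above can be illustrated with a small simulation. This is a minimal sketch and not DM's actual implementation: the event tuples, the `classify` and `replay` functions, and the regular expressions are hypothetical names introduced for illustration. It assumes the default naming conventions of the tools, where gh-ost names its ghost table `_<table>_gho` and pt-osc names it `_<table>_new`, and where `_<table>_ghc`, `_<table>_del`, and `_<table>_old` are auxiliary tables that can be ignored.

```python
import re

# Assumed default naming conventions (not taken from DM's source code):
# ghost tables: _<table>_gho (gh-ost), _<table>_new (pt-osc)
# auxiliary tables: _<table>_ghc, _<table>_del (gh-ost), _<table>_old (pt-osc)
GHOST_RE = re.compile(r"^_(?P<real>.+)_(gho|new)$")
TRASH_RE = re.compile(r"^_(?P<real>.+)_(ghc|del|old)$")


def classify(table):
    """Return (kind, real_table), where kind is 'ghost', 'trash', or 'real'."""
    m = GHOST_RE.match(table)
    if m:
        return "ghost", m.group("real")
    m = TRASH_RE.match(table)
    if m:
        return "trash", m.group("real")
    return "real", table


def replay(events):
    """Turn upstream events (op, table, arg) into downstream statements."""
    pending = {}      # real table -> DDL changes recorded from its ghost table
    downstream = []   # statements that would be executed on TiDB
    for op, table, arg in events:
        kind, real = classify(table)
        if kind == "trash":
            continue  # auxiliary tables are never replicated
        if kind == "ghost":
            if op == "alter":
                # Record the DDL instead of executing it downstream.
                pending.setdefault(real, []).append(arg)
            elif op == "rename" and arg == real:
                # The ghost table replaces the real table upstream:
                # apply the recorded DDLs to the real table downstream.
                for change in pending.pop(real, []):
                    downstream.append(f"ALTER TABLE {real} {change}")
            # CREATE and DML on the ghost table are skipped.
            continue
        if op == "insert":
            downstream.append(f"INSERT INTO {table} VALUES {arg}")
        # A rename of the real table to an auxiliary table is skipped.
    return downstream


events = [
    ("create", "_t1_gho", None),
    ("alter", "_t1_gho", "ADD COLUMN c INT"),
    ("insert", "_t1_gho", "(1, 'a', NULL)"),
    ("insert", "t1", "(2, 'b')"),
    ("rename", "t1", "_t1_del"),
    ("rename", "_t1_gho", "t1"),
]
for statement in replay(events):
    print(statement)
# INSERT INTO t1 VALUES (2, 'b')
# ALTER TABLE t1 ADD COLUMN c INT
```

In this sketch, nothing that targets `_t1_gho` reaches the downstream. Only the DML on `t1` and the recorded `ALTER TABLE` are emitted, and the `ALTER TABLE` is emitted at the point where the upstream rename happens.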
The change in the workflow brings the following advantages:
The downstream TiDB does not need to create the ghost table or replicate its data, which saves storage space and network transmission overhead.
When you migrate and merge data from sharded tables, the RENAME operation is ignored for each sharded ghost table to ensure the correctness of the replication.