Handle Failed DDL Statements
This document describes how to handle failed DDL statements when you use the TiDB Data Migration (DM) tool to migrate data.
Currently, TiDB is not fully compatible with all MySQL syntax (see the DDL statements supported by TiDB). Therefore, when DM migrates data from MySQL to TiDB and TiDB does not support a DDL statement, an error might occur and interrupt the migration process. In this case, you can use the handle-error command of DM to resume the migration.
Restrictions
Do not use this command if, in your production environment, it is unacceptable to skip the failed DDL statement in the downstream TiDB and the statement cannot be replaced with other DDL statements.
One example is DROP PRIMARY KEY. In this scenario, you can only create a new table in the downstream with the new table schema (the schema after the DDL statement is executed), and re-import all the data into this new table.
Supported scenarios
During the migration, a DDL statement unsupported by TiDB is executed in the upstream and migrated to the downstream, and as a result, the migration task gets interrupted.
- If it is acceptable that this DDL statement is skipped in the downstream TiDB, then you can use handle-error <task-name> skip to skip migrating this DDL statement and resume the migration.
- If it is acceptable that this DDL statement is replaced with other DDL statements, then you can use handle-error <task-name> replace to replace this DDL statement and resume the migration.
Command
When you use dmctl to manually handle the failed DDL statements, the commonly used commands include query-status and handle-error.
query-status
The query-status command is used to query the current status of items such as the subtask and the relay unit in each MySQL instance. For details, see query status.
handle-error
The handle-error command is used to handle the failed DDL statements.
Command usage
» handle-error -h
Usage:
dmctl handle-error <task-name | task-file> [-s source ...] [-b binlog-pos] <skip/replace/revert> [replace-sql1;replace-sql2;] [flags]
Flags:
-b, --binlog-pos string position used to match binlog event if matched the handler-error operation will be applied. The format like "mysql-bin|000001.000003:3270"
-h, --help help for handle-error
Global Flags:
-s, --source strings MySQL Source ID
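The argument order in the usage string above can be sketched as a small Python helper. This is only an illustration: the function name `build_handle_error_args` is hypothetical, and passing one `-s` flag per source is an assumption about how dmctl accepts multiple sources. The helper only assembles and validates arguments; it does not run dmctl.

```python
def build_handle_error_args(task, operation, sources=(), binlog_pos=None, replace_sqls=()):
    """Assemble handle-error arguments in the order shown by the usage string.

    Hypothetical helper: it validates the operation and returns a list of
    arguments; it does not invoke dmctl.
    """
    if operation not in ("skip", "replace", "revert"):
        raise ValueError("operation must be skip, replace, or revert")
    if operation == "replace" and not replace_sqls:
        raise ValueError("replace needs at least one replacement DDL statement")
    if operation != "replace" and replace_sqls:
        raise ValueError("replacement DDL statements are only valid with replace")

    args = ["handle-error", task]
    for source in sources:
        # Assumption: one -s flag per source ID.
        args += ["-s", source]
    if binlog_pos is not None:
        args += ["-b", binlog_pos]
    args.append(operation)
    if replace_sqls:
        # Replacement statements are passed as one semicolon-separated argument.
        args.append(";".join(replace_sqls))
    return args


print(build_handle_error_args("test", "skip"))
```

Keeping the replacement statements as a list until the last step makes it harder to forget the semicolon separator between them.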
Flags descriptions
- task-name:
    - Non-flag parameter, string, required
    - task-name specifies the name of the task in which the preset operation is going to be executed.
- source:
    - Flag parameter, string, --source
    - source specifies the MySQL instance in which the preset operation is to be executed.
- skip: Skip the error
- replace: Replace the failed DDL statement
- revert: Reset the previous skip/replace operation before the error occurs (only reset it when the previous skip/replace operation has not yet taken effect)
- binlog-pos:
    - Flag parameter, string, --binlog-pos
    - If it is not specified, DM automatically handles the currently failed DDL statement.
    - If it is specified, the skip operation is executed when binlog-pos matches the position of the binlog event. The format is binlog-filename:binlog-pos, for example, mysql-bin|000001.000003:3270.
    - After the migration returns an error, the binlog position can be obtained from position in startLocation returned by query-status. Before the migration returns an error, the binlog position can be obtained by using SHOW BINLOG EVENTS in the upstream MySQL instance.
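As an illustration of the binlog-filename:binlog-pos format described above, the following Python sketch splits such a value into its two parts. The function name `parse_binlog_pos` is hypothetical, and the sketch assumes the position is the decimal number after the last colon.

```python
def parse_binlog_pos(value):
    """Split a 'binlog-filename:binlog-pos' string into (filename, position)."""
    # Split on the last colon, so that the filename part stays intact.
    filename, sep, pos = value.rpartition(":")
    if not sep or not filename or not pos.isdigit():
        raise ValueError("expected binlog-filename:binlog-pos, got %r" % value)
    return filename, int(pos)


print(parse_binlog_pos("mysql-bin|000001.000003:3270"))
# ('mysql-bin|000001.000003', 3270)
```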
Usage examples
Skip DDL if the migration gets interrupted
Non-shard-merge scenario
Assume that you need to migrate the upstream table db1.tbl1 to the downstream TiDB. The initial table schema is:
SHOW CREATE TABLE db1.tbl1;
+-------+--------------------------------------------------+
| Table | Create Table |
+-------+--------------------------------------------------+
| tbl1 | CREATE TABLE `tbl1` (
`c1` int(11) NOT NULL,
`c2` decimal(11,3) DEFAULT NULL,
PRIMARY KEY (`c1`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
+-------+--------------------------------------------------+
Now, the following DDL statement is executed in the upstream to alter the table schema (namely, alter DECIMAL(11, 3) of c2 into DECIMAL(10, 3)):
ALTER TABLE db1.tbl1 CHANGE c2 c2 DECIMAL (10, 3);
Because this DDL statement is not supported by TiDB, the migration task of DM gets interrupted. Execute the query-status <task-name> command, and you can see the following error:
ERROR 8200 (HY000): Unsupported modify column: can't change decimal column precision
Assume that it is acceptable in the actual production environment that this DDL statement is not executed in the downstream TiDB (namely, the original table schema is retained). Then you can use handle-error <task-name> skip to skip this DDL statement and resume the migration. The procedure is as follows:
1. Execute handle-error <task-name> skip to skip the currently failed DDL statement:
» handle-error test skip
{
    "result": true,
    "msg": "",
    "sources": [
        {
            "result": true,
            "msg": "",
            "source": "mysql-replica-01",
            "worker": "worker1"
        }
    ]
}
2. Execute query-status <task-name> to view the task status:
» query-status test
See the execution result.
{
    "result": true,
    "msg": "",
    "sources": [
        {
            "result": true,
            "msg": "",
            "sourceStatus": {
                "source": "mysql-replica-01",
                "worker": "worker1",
                "result": null,
                "relayStatus": null
            },
            "subTaskStatus": [
                {
                    "name": "test",
                    "stage": "Running",
                    "unit": "Sync",
                    "result": null,
                    "unresolvedDDLLockID": "",
                    "sync": {
                        "totalEvents": "4",
                        "totalTps": "0",
                        "recentTps": "0",
                        "masterBinlog": "(DESKTOP-T561TSO-bin.000001, 2388)",
                        "masterBinlogGtid": "143bdef3-dd4a-11ea-8b00-00155de45f57:1-10",
                        "syncerBinlog": "(DESKTOP-T561TSO-bin.000001, 2388)",
                        "syncerBinlogGtid": "143bdef3-dd4a-11ea-8b00-00155de45f57:1-4",
                        "blockingDDLs": [],
                        "unresolvedGroups": [],
                        "synced": true,
                        "binlogType": "remote"
                    }
                }
            ]
        }
    ]
}
You can see that the task runs normally and the failed DDL statement is skipped.
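To check the same condition in a script, the following Python sketch reads a query-status result and reports the stage of each subtask. The field names (sources, sourceStatus, subTaskStatus, name, stage) are taken from the sample output above; the function names are hypothetical, and the sketch assumes you have already captured the result as JSON text.

```python
import json


def subtask_stages(status):
    """Map (source ID, task name) to the subtask stage in a query-status result."""
    stages = {}
    for src in status.get("sources") or []:
        source_id = (src.get("sourceStatus") or {}).get("source", "")
        for sub in src.get("subTaskStatus") or []:
            stages[(source_id, sub.get("name", ""))] = sub.get("stage", "")
    return stages


def all_running(status):
    """Return True only if at least one subtask exists and all are Running."""
    stages = subtask_stages(status)
    return bool(stages) and all(stage == "Running" for stage in stages.values())


# A trimmed copy of the sample output, kept short for illustration.
sample = json.loads("""
{
  "result": true,
  "sources": [
    {
      "sourceStatus": {"source": "mysql-replica-01", "worker": "worker1"},
      "subTaskStatus": [{"name": "test", "stage": "Running", "unit": "Sync"}]
    }
  ]
}
""")
print(subtask_stages(sample))
print(all_running(sample))
```

An empty result is treated as not running, so a query that returns no subtasks is not mistaken for a healthy task.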
Shard merge scenario
Assume that you need to merge and migrate the following four tables in the upstream into the same table `shard_db`.`shard_table` in the downstream. The task mode is "pessimistic".
- MySQL instance 1 contains the shard_db_1 schema, which includes the shard_table_1 and shard_table_2 tables.
- MySQL instance 2 contains the shard_db_2 schema, which includes the shard_table_1 and shard_table_2 tables.
The initial table schema is:
SHOW CREATE TABLE shard_db.shard_table;
+-------+-----------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+-----------------------------------------------------------------------------------------------------------+
| shard_table | CREATE TABLE `shard_table` (
`id` int(11) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 COLLATE=latin1_bin |
+-------+-----------------------------------------------------------------------------------------------------------+
Now, execute the following DDL statement to all upstream sharded tables to alter their character set:
ALTER TABLE `shard_db_*`.`shard_table_*` CHARACTER SET UTF8 COLLATE UTF8_UNICODE_CI;
Because this DDL statement is not supported by TiDB, the migration task of DM gets interrupted. Execute the query-status command, and you can see the following errors reported by the shard_db_1.shard_table_1 table in MySQL instance 1 and the shard_db_2.shard_table_1 table in MySQL instance 2:
{
"Message": "cannot track DDL: ALTER TABLE `shard_db_1`.`shard_table_1` CHARACTER SET UTF8 COLLATE UTF8_UNICODE_CI",
"RawCause": "[ddl:8200]Unsupported modify charset from latin1 to utf8"
}
{
"Message": "cannot track DDL: ALTER TABLE `shard_db_2`.`shard_table_1` CHARACTER SET UTF8 COLLATE UTF8_UNICODE_CI",
"RawCause": "[ddl:8200]Unsupported modify charset from latin1 to utf8"
}
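With several sharded tables, it can help to pull the affected table out of each error message before deciding which source to handle. The sketch below assumes the Message field always has the form shown above ("cannot track DDL: ALTER TABLE `schema`.`table` ..."); the function name `failed_ddl` is hypothetical, and other message formats simply return None.

```python
import re

# Matches messages of the form shown above; other formats are not covered.
_TRACK_ERROR = re.compile(
    r"cannot track DDL: (?P<ddl>ALTER TABLE `(?P<schema>[^`]+)`\.`(?P<table>[^`]+)`.*)"
)


def failed_ddl(message):
    """Return (schema, table, ddl) from a 'cannot track DDL' message, or None."""
    match = _TRACK_ERROR.search(message)
    if match is None:
        return None
    return match.group("schema"), match.group("table"), match.group("ddl")


message = (
    "cannot track DDL: ALTER TABLE `shard_db_1`.`shard_table_1` "
    "CHARACTER SET UTF8 COLLATE UTF8_UNICODE_CI"
)
print(failed_ddl(message))
```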
Assume that it is acceptable in the actual production environment that this DDL statement is not executed in the downstream TiDB (namely, the original table schema is retained). Then you can use handle-error <task-name> skip to skip this DDL statement and resume the migration. The procedure is as follows:
1. Execute handle-error <task-name> skip to skip the currently failed DDL statements in MySQL instance 1 and 2:
» handle-error test skip
{
    "result": true,
    "msg": "",
    "sources": [
        {
            "result": true,
            "msg": "",
            "source": "mysql-replica-01",
            "worker": "worker1"
        },
        {
            "result": true,
            "msg": "",
            "source": "mysql-replica-02",
            "worker": "worker2"
        }
    ]
}
2. Execute the query-status command, and you can see the errors reported by the shard_db_1.shard_table_2 table in MySQL instance 1 and the shard_db_2.shard_table_2 table in MySQL instance 2:
{
    "Message": "cannot track DDL: ALTER TABLE `shard_db_1`.`shard_table_2` CHARACTER SET UTF8 COLLATE UTF8_UNICODE_CI",
    "RawCause": "[ddl:8200]Unsupported modify charset from latin1 to utf8"
}
{
    "Message": "cannot track DDL: ALTER TABLE `shard_db_2`.`shard_table_2` CHARACTER SET UTF8 COLLATE UTF8_UNICODE_CI",
    "RawCause": "[ddl:8200]Unsupported modify charset from latin1 to utf8"
}
3. Execute handle-error <task-name> skip again to skip the currently failed DDL statements in MySQL instance 1 and 2:
» handle-error test skip
{
    "result": true,
    "msg": "",
    "sources": [
        {
            "result": true,
            "msg": "",
            "source": "mysql-replica-01",
            "worker": "worker1"
        },
        {
            "result": true,
            "msg": "",
            "source": "mysql-replica-02",
            "worker": "worker2"
        }
    ]
}
4. Use query-status <task-name> to view the task status:
» query-status test
See the execution result.
{
    "result": true,
    "msg": "",
    "sources": [
        {
            "result": true,
            "msg": "",
            "sourceStatus": {
                "source": "mysql-replica-01",
                "worker": "worker1",
                "result": null,
                "relayStatus": null
            },
            "subTaskStatus": [
                {
                    "name": "test",
                    "stage": "Running",
                    "unit": "Sync",
                    "result": null,
                    "unresolvedDDLLockID": "",
                    "sync": {
                        "totalEvents": "4",
                        "totalTps": "0",
                        "recentTps": "0",
                        "masterBinlog": "(DESKTOP-T561TSO-bin.000001, 2388)",
                        "masterBinlogGtid": "143bdef3-dd4a-11ea-8b00-00155de45f57:1-10",
                        "syncerBinlog": "(DESKTOP-T561TSO-bin.000001, 2388)",
                        "syncerBinlogGtid": "143bdef3-dd4a-11ea-8b00-00155de45f57:1-4",
                        "blockingDDLs": [],
                        "unresolvedGroups": [],
                        "synced": true,
                        "binlogType": "remote"
                    }
                }
            ]
        },
        {
            "result": true,
            "msg": "",
            "sourceStatus": {
                "source": "mysql-replica-02",
                "worker": "worker2",
                "result": null,
                "relayStatus": null
            },
            "subTaskStatus": [
                {
                    "name": "test",
                    "stage": "Running",
                    "unit": "Sync",
                    "result": null,
                    "unresolvedDDLLockID": "",
                    "sync": {
                        "totalEvents": "4",
                        "totalTps": "0",
                        "recentTps": "0",
                        "masterBinlog": "(DESKTOP-T561TSO-bin.000001, 2388)",
                        "masterBinlogGtid": "143bdef3-dd4a-11ea-8b00-00155de45f57:1-10",
                        "syncerBinlog": "(DESKTOP-T561TSO-bin.000001, 2388)",
                        "syncerBinlogGtid": "143bdef3-dd4a-11ea-8b00-00155de45f57:1-4",
                        "blockingDDLs": [],
                        "unresolvedGroups": [],
                        "synced": true,
                        "binlogType": "remote"
                    }
                }
            ]
        }
    ]
}
You can see that the task runs normally with no error and all four failed DDL statements are skipped.
Replace DDL if the migration gets interrupted
Non-shard-merge scenario
Assume that you need to migrate the upstream table db1.tbl1 to the downstream TiDB. The initial table schema is:
SHOW CREATE TABLE db1.tbl1;
+-------+-----------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+-----------------------------------------------------------------------------------------------------------+
| tbl1 | CREATE TABLE `tbl1` (
`id` int(11) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 COLLATE=latin1_bin |
+-------+-----------------------------------------------------------------------------------------------------------+
Now, perform the following DDL operation in the upstream to add a new column with the UNIQUE constraint:
ALTER TABLE `db1`.`tbl1` ADD COLUMN new_col INT UNIQUE;
Because this DDL statement is not supported by TiDB, the migration task gets interrupted. Execute the query-status command, and you can see the following error:
{
    "Message": "cannot track DDL: ALTER TABLE `db1`.`tbl1` ADD COLUMN `new_col` INT UNIQUE KEY",
    "RawCause": "[ddl:8200]unsupported add column 'new_col' constraint UNIQUE KEY when altering 'db1.tbl1'"
}
You can replace this DDL statement with two equivalent DDL statements. The steps are as follows:
1. Replace the failed DDL statement by using the following command:
» handle-error test replace "ALTER TABLE `db1`.`tbl1` ADD COLUMN `new_col` INT;ALTER TABLE `db1`.`tbl1` ADD UNIQUE(`new_col`)";
{
    "result": true,
    "msg": "",
    "sources": [
        {
            "result": true,
            "msg": "",
            "source": "mysql-replica-01",
            "worker": "worker1"
        }
    ]
}
2. Use query-status <task-name> to view the task status:
» query-status test
See the execution result.
{
    "result": true,
    "msg": "",
    "sources": [
        {
            "result": true,
            "msg": "",
            "sourceStatus": {
                "source": "mysql-replica-01",
                "worker": "worker1",
                "result": null,
                "relayStatus": null
            },
            "subTaskStatus": [
                {
                    "name": "test",
                    "stage": "Running",
                    "unit": "Sync",
                    "result": null,
                    "unresolvedDDLLockID": "",
                    "sync": {
                        "totalEvents": "4",
                        "totalTps": "0",
                        "recentTps": "0",
                        "masterBinlog": "(DESKTOP-T561TSO-bin.000001, 2388)",
                        "masterBinlogGtid": "143bdef3-dd4a-11ea-8b00-00155de45f57:1-10",
                        "syncerBinlog": "(DESKTOP-T561TSO-bin.000001, 2388)",
                        "syncerBinlogGtid": "143bdef3-dd4a-11ea-8b00-00155de45f57:1-4",
                        "blockingDDLs": [],
                        "unresolvedGroups": [],
                        "synced": true,
                        "binlogType": "remote"
                    }
                }
            ]
        }
    ]
}
You can see that the task runs normally and the failed DDL statement is replaced by new DDL statements that execute successfully.
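The two-statement replacement used in this example can be generated rather than typed by hand, which is convenient when the same rewrite is needed for several tables. The following Python sketch is illustrative only: the function name `split_add_unique_column` is hypothetical, and it covers only the ADD COLUMN ... UNIQUE case shown above.

```python
def split_add_unique_column(schema, table, column, column_type):
    """Return the two statements that replace ADD COLUMN ... UNIQUE.

    Hypothetical helper covering only the case in this example: first add
    the column without the constraint, then add the unique index.
    """
    def quote(name):
        # Quote an identifier with backticks, doubling any embedded backtick.
        return "`" + name.replace("`", "``") + "`"

    target = quote(schema) + "." + quote(table)
    return [
        "ALTER TABLE %s ADD COLUMN %s %s" % (target, quote(column), column_type),
        "ALTER TABLE %s ADD UNIQUE(%s)" % (target, quote(column)),
    ]


# Join with a semicolon to form the single replace argument.
print(";".join(split_add_unique_column("db1", "tbl1", "new_col", "INT")))
```

The joined string matches the argument passed to handle-error test replace in the first step of this example.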
Shard merge scenario
Assume that you need to merge and migrate the following four tables in the upstream into the same table `shard_db`.`shard_table` in the downstream. The task mode is "pessimistic".
- MySQL instance 1 contains the shard_db_1 schema, which includes the shard_table_1 and shard_table_2 tables.
- MySQL instance 2 contains the shard_db_2 schema, which includes the shard_table_1 and shard_table_2 tables.
The initial table schema is:
SHOW CREATE TABLE shard_db.shard_table;
+-------+-----------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+-----------------------------------------------------------------------------------------------------------+
| shard_table | CREATE TABLE `shard_table` (
`id` int(11) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 COLLATE=latin1_bin |
+-------+-----------------------------------------------------------------------------------------------------------+
Now, perform the following DDL operation to all upstream sharded tables to add a new column with the UNIQUE constraint:
ALTER TABLE `shard_db_*`.`shard_table_*` ADD COLUMN new_col INT UNIQUE;
Because this DDL statement is not supported by TiDB, the migration task gets interrupted. Execute the query-status command, and you can see the following errors reported by the shard_db_1.shard_table_1 table in MySQL instance 1 and the shard_db_2.shard_table_1 table in MySQL instance 2:
{
    "Message": "cannot track DDL: ALTER TABLE `shard_db_1`.`shard_table_1` ADD COLUMN `new_col` INT UNIQUE KEY",
    "RawCause": "[ddl:8200]unsupported add column 'new_col' constraint UNIQUE KEY when altering 'shard_db_1.shard_table_1'"
}
{
    "Message": "cannot track DDL: ALTER TABLE `shard_db_2`.`shard_table_1` ADD COLUMN `new_col` INT UNIQUE KEY",
    "RawCause": "[ddl:8200]unsupported add column 'new_col' constraint UNIQUE KEY when altering 'shard_db_2.shard_table_1'"
}
You can replace this DDL statement with two equivalent DDL statements. The steps are as follows:
1. Replace the failed DDL statements in MySQL instance 1 and MySQL instance 2 respectively by using the following commands:
» handle-error test -s mysql-replica-01 replace "ALTER TABLE `shard_db_1`.`shard_table_1` ADD COLUMN `new_col` INT;ALTER TABLE `shard_db_1`.`shard_table_1` ADD UNIQUE(`new_col`)";
{
    "result": true,
    "msg": "",
    "sources": [
        {
            "result": true,
            "msg": "",
            "source": "mysql-replica-01",
            "worker": "worker1"
        }
    ]
}
» handle-error test -s mysql-replica-02 replace "ALTER TABLE `shard_db_2`.`shard_table_1` ADD COLUMN `new_col` INT;ALTER TABLE `shard_db_2`.`shard_table_1` ADD UNIQUE(`new_col`)";
{
    "result": true,
    "msg": "",
    "sources": [
        {
            "result": true,
            "msg": "",
            "source": "mysql-replica-02",
            "worker": "worker2"
        }
    ]
}
2. Use query-status <task-name> to view the task status, and you can see the following errors reported by the shard_db_1.shard_table_2 table in MySQL instance 1 and the shard_db_2.shard_table_2 table in MySQL instance 2:
{
    "Message": "detect inconsistent DDL sequence from source ... ddls: [ALTER TABLE `shard_db`.`tb` ADD COLUMN `new_col` INT UNIQUE KEY] source: `shard_db_1`.`shard_table_2`], right DDL sequence should be ..."
}
{
    "Message": "detect inconsistent DDL sequence from source ... ddls: [ALTER TABLE `shard_db`.`tb` ADD COLUMN `new_col` INT UNIQUE KEY] source: `shard_db_2`.`shard_table_2`], right DDL sequence should be ..."
}
3. Execute handle-error <task-name> replace again to replace the failed DDL statements in MySQL instance 1 and 2:
» handle-error test -s mysql-replica-01 replace "ALTER TABLE `shard_db_1`.`shard_table_2` ADD COLUMN `new_col` INT;ALTER TABLE `shard_db_1`.`shard_table_2` ADD UNIQUE(`new_col`)";
{
    "result": true,
    "msg": "",
    "sources": [
        {
            "result": true,
            "msg": "",
            "source": "mysql-replica-01",
            "worker": "worker1"
        }
    ]
}
» handle-error test -s mysql-replica-02 replace "ALTER TABLE `shard_db_2`.`shard_table_2` ADD COLUMN `new_col` INT;ALTER TABLE `shard_db_2`.`shard_table_2` ADD UNIQUE(`new_col`)";
{
    "result": true,
    "msg": "",
    "sources": [
        {
            "result": true,
            "msg": "",
            "source": "mysql-replica-02",
            "worker": "worker2"
        }
    ]
}
4. Use query-status <task-name> to view the task status:
» query-status test
See the execution result.
{
    "result": true,
    "msg": "",
    "sources": [
        {
            "result": true,
            "msg": "",
            "sourceStatus": {
                "source": "mysql-replica-01",
                "worker": "worker1",
                "result": null,
                "relayStatus": null
            },
            "subTaskStatus": [
                {
                    "name": "test",
                    "stage": "Running",
                    "unit": "Sync",
                    "result": null,
                    "unresolvedDDLLockID": "",
                    "sync": {
                        "totalEvents": "4",
                        "totalTps": "0",
                        "recentTps": "0",
                        "masterBinlog": "(DESKTOP-T561TSO-bin.000001, 2388)",
                        "masterBinlogGtid": "143bdef3-dd4a-11ea-8b00-00155de45f57:1-10",
                        "syncerBinlog": "(DESKTOP-T561TSO-bin.000001, 2388)",
                        "syncerBinlogGtid": "143bdef3-dd4a-11ea-8b00-00155de45f57:1-4",
                        "blockingDDLs": [],
                        "unresolvedGroups": [],
                        "synced": true,
                        "binlogType": "remote"
                    }
                }
            ]
        },
        {
            "result": true,
            "msg": "",
            "sourceStatus": {
                "source": "mysql-replica-02",
                "worker": "worker2",
                "result": null,
                "relayStatus": null
            },
            "subTaskStatus": [
                {
                    "name": "test",
                    "stage": "Running",
                    "unit": "Sync",
                    "result": null,
                    "unresolvedDDLLockID": "",
                    "sync": {
                        "totalEvents": "4",
                        "totalTps": "0",
                        "recentTps": "0",
                        "masterBinlog": "(DESKTOP-T561TSO-bin.000001, 2388)",
                        "masterBinlogGtid": "143bdef3-dd4a-11ea-8b00-00155de45f57:1-10",
                        "syncerBinlog": "(DESKTOP-T561TSO-bin.000001, 2388)",
                        "syncerBinlogGtid": "143bdef3-dd4a-11ea-8b00-00155de45f57:1-4",
                        "blockingDDLs": [],
                        "unresolvedGroups": [],
                        "synced": true,
                        "binlogType": "remote"
                    }
                }
            ]
        }
    ]
}
You can see that the task runs normally with no error and all four failed DDL statements are replaced.