Migrate Data to a Downstream TiDB Table with More Columns
This document provides the additional steps to be taken when you migrate data to a downstream TiDB table with more columns than the corresponding upstream table. For regular migration steps, see the following migration scenarios:
- Migrate MySQL of Small Datasets to TiDB
- Migrate MySQL of Large Datasets to TiDB
- Migrate and Merge MySQL Shards of Small Datasets to TiDB
- Migrate and Merge MySQL Shards of Large Datasets to TiDB
Use DM to migrate data to a downstream TiDB table with more columns
When replicating the upstream binlog, DM tries to use the current table schema of the downstream to parse the binlog and generate the corresponding DML statements. If the column number of the table in the upstream binlog does not match the column number in the downstream table schema, the following error occurs:
"errors": [
{
"ErrCode": 36027,
"ErrClass": "sync-unit",
"ErrScope": "internal",
"ErrLevel": "high",
"Message": "startLocation: [position: (mysql-bin.000001, 2022), gtid-set:09bec856-ba95-11ea-850a-58f2b4af5188:1-9 ], endLocation: [ position: (mysql-bin.000001, 2022), gtid-set: 09bec856-ba95-11ea-850a-58f2b4af5188:1-9]: gen insert sqls failed, schema: log, table: messages: Column count doesn't match value count: 3 (columns) vs 2 (values)",
"RawCause": "",
"Workaround": ""
}
]
The following is an example upstream table schema:
# Upstream table schema
CREATE TABLE `messages` (
`id` int(11) NOT NULL,
PRIMARY KEY (`id`)
)
The following is an example downstream table schema:
# Downstream table schema
CREATE TABLE `messages` (
`id` int(11) NOT NULL,
`message` varchar(255) DEFAULT NULL, # This is the additional column that only exists in the downstream table.
PRIMARY KEY (`id`)
)
When DM tries to use the downstream table schema to parse the binlog event generated by the upstream, DM reports the above Column count doesn't match
error.
In such cases, you can use the operate-schema
command to set a table schema for the table to be migrated from the data source. The specified table schema needs to correspond to the binlog event data to be replicated by DM. If you are migrating sharded tables, for each sharded table, you need to set a table schema in DM to parse binlog event data. The steps are as follows:
Create a SQL file in DM and add the
CREATE TABLE
statement that corresponds to the upstream table schema to the file. For example, save the following table schema tolog.messages.sql
.# Upstream table schema CREATE TABLE `messages` ( `id` int(11) NOT NULL, PRIMARY KEY (`id`) )Use the
operate-schema
command to set the table schema for the table to be migrated from the data source (At this time, the data migration task should be in the Paused state due to the aboveColumn count doesn't match
error).tiup dmctl --master-addr ${advertise-addr} operate-schema set -s ${source-id} ${task-name} -d ${database-name} -t ${table-name} ${schema-file}The descriptions of parameters in this command are as follows:
Parameter Description -master-addr Specifies the ${advertise-addr}
of any DM-master node in the cluster where dmctl is to be connected.${advertise-addr}
indicates the address that DM-master advertises to the outside world.operate-schema set Manually set the schema information. -s Specifies the source. ${source-id}
indicates the source ID of MySQL data.${task-name}
indicates the name of the synchronization task defined in thetask.yaml
configuration file of the data migration task.-d Specifies the database. ${database-name}
indicates the name of the upstream database.-t Specifies table. ${table-name}
indicates the name of the upstream data table.${schema-file}
indicates the table schema file to be set.For example:
tiup dmctl --master-addr 172.16.10.71:8261 operate-schema set -s mysql-01 task-test -d log -t message log.message.sqlUse the
resume-task
command to resume the migration task in the Paused state.tiup dmctl --master-addr ${advertise-addr} resume-task ${task-name}Use the
query-status
command to confirm that the data migration task is running correctly.tiup dmctl --master-addr ${advertise-addr} query-status ${task-name}