Sink to MySQL (Beta)
This document describes how to stream data from TiDB Cloud Essential to MySQL using the Sink to MySQL changefeed.
Restrictions
- For each TiDB Cloud Essential cluster, you can create up to 10 changefeeds.
- Because TiDB Cloud Essential uses TiCDC to establish changefeeds, it has the same restrictions as TiCDC.
- If the table to be replicated does not have a primary key or a non-null unique index, the absence of a unique constraint during replication could result in duplicated data being inserted downstream in some retry scenarios.
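To find upstream tables that might hit this restriction before you create a changefeed, you can query `information_schema` for tables without a primary key. The sketch below only checks for primary keys; a complete check would also need to look for unique indexes whose columns are all `NOT NULL`, and the list of system schemas to exclude may differ in your cluster.

```sql
-- List user tables that have no primary key.
SELECT t.table_schema, t.table_name
FROM information_schema.tables t
LEFT JOIN information_schema.table_constraints c
  ON c.table_schema = t.table_schema
  AND c.table_name = t.table_name
  AND c.constraint_type = 'PRIMARY KEY'
WHERE t.table_type = 'BASE TABLE'
  AND c.constraint_name IS NULL
  AND t.table_schema NOT IN
    ('mysql', 'INFORMATION_SCHEMA', 'PERFORMANCE_SCHEMA', 'METRICS_SCHEMA');
```

For each table returned, consider adding a primary key or a non-null unique index before starting replication, or exclude the table with a filter rule.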
Prerequisites
Before creating a changefeed, you need to complete the following prerequisites:
- Set up your network connection
- Export and load the existing data to MySQL (optional)
- Create corresponding target tables in MySQL if you do not load the existing data and only want to replicate incremental data to MySQL
Network
Make sure that your TiDB Cloud Essential cluster can connect to the MySQL service. You can choose one of the following connection methods:
- Private Link Connection: recommended for meeting security compliance requirements and ensuring reliable network quality.
- Public Network: suitable for a quick setup.
Private link connections leverage Private Link technologies from cloud providers, enabling resources in your VPC to connect to services in other VPCs through private IP addresses, as if those services were hosted directly within your VPC.
You can connect your TiDB Cloud Essential cluster to your MySQL service securely through a private link connection. If the private link connection is not available for your MySQL service, follow Connect to Amazon RDS via a Private Link Connection or Connect to Alibaba Cloud ApsaraDB RDS for MySQL via a Private Link Connection to create one.
If your MySQL service can be accessed over the public network, you can choose to connect to MySQL through a public IP or domain name.
Load existing data (optional)
The Sink to MySQL connector can only sink incremental data from your TiDB Cloud Essential cluster to MySQL after a certain timestamp. If you already have data in your TiDB Cloud Essential cluster, you can export and load the existing data of your TiDB Cloud Essential cluster into MySQL before enabling Sink to MySQL.
To load the existing data:
Extend `tidb_gc_life_time` to be longer than the total duration of the following two operations, so that the historical data within this period is not garbage collected by TiDB.
- The time to export and import the existing data
- The time to create Sink to MySQL
For example:

```sql
SET GLOBAL tidb_gc_life_time = '72h';
```

Use the Export feature to export data from your TiDB Cloud Essential cluster, then use community tools such as mydumper/myloader to load the data into the MySQL service.
Record the snapshot time returned by Export. Use this timestamp as the starting position when you configure the MySQL sink.
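If you are unsure what `tidb_gc_life_time` was set to before you extended it, you can check the current value at any time (a minimal sketch using a standard TiDB system variable), so that you can restore it exactly after the sink is created:

```sql
-- Check the current GC lifetime; the default is '10m'.
SHOW VARIABLES LIKE 'tidb_gc_life_time';
```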
Create target tables in MySQL
If you do not load the existing data, you need to create corresponding target tables in MySQL manually to store the incremental data from TiDB. Otherwise, the data will not be replicated.
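For example, if the upstream cluster has a hypothetical table `test.orders`, you would create a matching table in MySQL before starting the changefeed. The table name, columns, and types below are illustrative; note the primary key, which lets the downstream handle retried events without inserting duplicates.

```sql
-- Hypothetical target table; the schema must match the upstream table.
CREATE TABLE test.orders (
  id BIGINT NOT NULL,
  customer_id BIGINT NOT NULL,
  amount DECIMAL(10, 2),
  created_at DATETIME,
  PRIMARY KEY (id)
);
```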
Create a MySQL sink
After completing the prerequisites, you can sink your data to MySQL.
Navigate to the overview page of the target TiDB Cloud Essential cluster, and then click Data > Changefeed in the left navigation pane.
Click Create Changefeed, and select MySQL as Destination.
In Connectivity Method, choose the method to connect to your MySQL service.
- If you choose Public, fill in your MySQL endpoint.
- If you choose Private Link, select the private link connection that you created in the Network section, and then fill in the MySQL port for your MySQL service.
In Authentication, fill in the MySQL user name and password, and configure TLS encryption for your MySQL service. Currently, TiDB Cloud does not support self-signed certificates for MySQL TLS connections.
Click Next to test whether TiDB can connect to MySQL successfully:
- If yes, you are directed to the next step of configuration.
- If not, a connectivity error is displayed, and you need to handle the error. After the error is resolved, click Next again.
Customize Table Filter to filter the tables that you want to replicate. For the rule syntax, refer to table filter rules.
- Replication Scope: you can choose to only replicate tables with valid keys or replicate all selected tables.
- Filter Rules: you can set filter rules in this column. By default, there is a rule `*.*`, which stands for replicating all tables. When you add a new rule and click Apply, TiDB Cloud queries all the tables in TiDB and displays only the tables that match the rules under Filter results.
- Case Sensitive: you can set whether the matching of database and table names in filter rules is case-sensitive. By default, matching is case-insensitive.
- Filter results with valid keys: this column displays the tables that have valid keys, including primary keys or unique indexes.
- Filter results without valid keys: this column shows tables that lack primary keys or unique keys. These tables present a challenge during replication because the absence of a unique identifier can result in inconsistent data when the downstream handles duplicate events. To ensure data consistency, it is recommended to add unique keys or primary keys to these tables before initiating the replication. Alternatively, you can add filter rules to exclude these tables. For example, you can exclude the table `test.tbl1` by using the rule `!test.tbl1`.
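As an illustration, the following pair of rules replicates every table in a hypothetical `test` schema except `test.tbl1` (the schema and table names are placeholders):

```
test.*
!test.tbl1
```

The first rule matches all tables in the `test` schema; the `!` prefix in the second rule excludes `test.tbl1` from that set. When multiple rules match a table, the later rule takes precedence.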
Customize Event Filter to filter the events that you want to replicate.
- Tables matching: you can set which tables the event filter will be applied to in this column. The rule syntax is the same as that used for the preceding Table Filter area.
- Event Filter: you can choose the events you want to ignore.
In Start Replication Position, configure the starting position for your MySQL sink.
- If you have loaded the existing data using Export, select From Time and fill in the snapshot time returned by Export. Ensure that the time zone is correct.
- If you do not have any data in the upstream TiDB cluster, select Start replication from now on.
Click Next to configure your changefeed.
In the Changefeed Name area, specify a name for the changefeed.
Review the configuration. If all settings are correct, click Submit.
If you want to modify some configurations, click Previous to go back to the previous configuration page.
After creation, the sink status changes from Creating to Running.
Click the changefeed name, and you can see more details about the changefeed, such as the checkpoint, replication latency, and other metrics.
If you have loaded the existing data and increased the GC time, restore `tidb_gc_life_time` to its original value (the default value is `10m`) after the sink is created:

```sql
SET GLOBAL tidb_gc_life_time = '10m';
```