Upgrade TiDB Using TiUP
This document applies to upgrading to TiDB v8.4.0 from the following versions: v6.1.x, v6.5.x, v7.1.x, v7.5.x, v8.1.x, v8.2.0, and v8.3.0.
Upgrade caveat
- TiDB currently does not support version downgrade or rolling back to an earlier version after the upgrade.
- Upgrading TiCDC, TiFlash, and other components is supported.
- When upgrading TiFlash from versions earlier than v6.3.0 to v6.3.0 and later versions, note that the CPU must support the AVX2 instruction set under the Linux AMD64 architecture and the ARMv8 instruction set architecture under the Linux ARM64 architecture. For details, see the description in v6.3.0 Release Notes.
- For detailed compatibility changes of different versions, see the Release Notes of each version. Modify your cluster configuration according to the "Compatibility Changes" section of the corresponding release notes.
- When upgrading clusters from versions earlier than v5.3 to v5.3 or later versions, note that the time format changes in the alerts generated by the Prometheus instance that is deployed by default. This format change is introduced starting from Prometheus v2.27.1. For more information, see Prometheus commit.
Preparations
This section introduces the preparation work needed before upgrading your TiDB cluster, including upgrading TiUP and the TiUP Cluster component.
Step 1: Review compatibility changes
Review the compatibility changes in the TiDB v8.4.0 release notes. If any changes affect your upgrade, take action accordingly.
Step 2: Upgrade TiUP or TiUP offline mirror
Before upgrading your TiDB cluster, you first need to upgrade TiUP or the TiUP offline mirror.
Upgrade TiUP and TiUP Cluster
Upgrade the TiUP version. It is recommended that the TiUP version is 1.11.3 or later.
tiup update --self
tiup --version
Upgrade the TiUP Cluster version. It is recommended that the TiUP Cluster version is 1.11.3 or later.
tiup update cluster
tiup cluster --version
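To compare an installed version against the recommended 1.11.3, a small helper such as the following might be used. This is a minimal sketch: version_ge is a hypothetical helper name, it assumes GNU sort with the -V option, and it expects plain version numbers without a leading "v". The installed value is a placeholder that you would replace with the number reported by tiup --version.

```shell
# Return success (0) when version $1 is greater than or equal to version $2.
# Assumes GNU coreutils `sort -V` (version sort) is available.
version_ge() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# Placeholder value: replace with the version reported by `tiup --version`.
installed="1.14.0"
if version_ge "$installed" "1.11.3"; then
    echo "TiUP $installed meets the recommended minimum"
else
    echo "TiUP $installed is older than 1.11.3; run: tiup update --self"
fi
```

The same helper can be reused for the TiUP Cluster component version.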
Upgrade TiUP offline mirror
Refer to Deploy a TiDB Cluster Using TiUP - Deploy TiUP offline to download the TiUP mirror of the new version and upload it to the control machine. After executing local_install.sh, TiUP will complete the overwrite upgrade.
tar xzvf tidb-community-server-${version}-linux-amd64.tar.gz
sh tidb-community-server-${version}-linux-amd64/local_install.sh
source /home/tidb/.bash_profile
After the overwrite upgrade, run the following commands to merge the server and toolkit offline mirrors into the server directory:
tar xf tidb-community-toolkit-${version}-linux-amd64.tar.gz
ls -ld tidb-community-server-${version}-linux-amd64 tidb-community-toolkit-${version}-linux-amd64
cd tidb-community-server-${version}-linux-amd64/
cp -rp keys ~/.tiup/
tiup mirror merge ../tidb-community-toolkit-${version}-linux-amd64
After merging the mirrors, run the following command to upgrade the TiUP Cluster component:
tiup update cluster
Now, the offline mirror has been upgraded successfully. If TiUP reports an error after the overwrite, the manifest might not have been updated. You can try running rm -rf ~/.tiup/manifests/* before running TiUP again.
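The offline mirror commands above rely on a ${version} shell variable that must be set before you run them. The sketch below sets it and reports any package that is missing from the current directory. It assumes the linux-amd64 file naming used above; mirror_packages is a hypothetical helper name and v8.4.0 is only an example value.

```shell
# Print the two offline package file names for a given version ($1),
# assuming the linux-amd64 naming used in this section.
mirror_packages() {
    printf 'tidb-community-server-%s-linux-amd64.tar.gz\n' "$1"
    printf 'tidb-community-toolkit-%s-linux-amd64.tar.gz\n' "$1"
}

# Example value: set this to the version you downloaded.
version=v8.4.0

# Report any package that is not present in the current directory.
for pkg in $(mirror_packages "$version"); do
    [ -f "$pkg" ] || echo "missing: $pkg"
done
```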
Step 3: Edit TiUP topology configuration file
Enter the vi editing mode to edit the topology file:
tiup cluster edit-config <cluster-name>
Refer to the format of the topology configuration template and fill in the parameters you want to modify in the server_configs section of the topology file.
After the modification, enter : + w + q to save the change and exit the editing mode. Enter Y to confirm the change.
Step 4: Check the DDL and backup status of the cluster
To avoid undefined behaviors or other unexpected problems during the upgrade, it is recommended to check the following items before the upgrade.
Cluster DDLs:
- If you use smooth upgrade, you do not need to check the DDL operations of your TiDB cluster. You do not need to wait for the completion of DDL jobs or cancel ongoing DDL jobs.
- If you do not use smooth upgrade, it is recommended to use the ADMIN SHOW DDL statement to check whether ongoing DDL jobs exist. If an ongoing DDL job exists, wait for the completion of its execution or cancel it using the ADMIN CANCEL DDL statement before performing an upgrade.
Cluster backup: It is recommended to execute the SHOW [BACKUPS|RESTORES] statement to check whether there is an ongoing backup or restore task in the cluster. If yes, wait for its completion before performing an upgrade.
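One way to run these checks in a single pass is to feed the statements to a MySQL-compatible client. The following is a sketch under assumptions: preflight_statements is a hypothetical helper name, and the host, port, and user in the usage comment are placeholders for your own TiDB endpoint.

```shell
# Print the pre-upgrade check statements named in this step, one per line.
preflight_statements() {
    printf '%s\n' \
        'ADMIN SHOW DDL;' \
        'SHOW BACKUPS;' \
        'SHOW RESTORES;'
}

# Hypothetical usage (host, port, and user are placeholders):
#   preflight_statements | mysql -h 127.0.0.1 -P 4000 -u root -p
```

Review the results manually. The sketch only issues the statements and does not decide whether it is safe to upgrade.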
Step 5: Check the health status of the current cluster
To avoid undefined behaviors or other issues during the upgrade, it is recommended to check the health status of Regions of the current cluster before the upgrade. To do that, you can use the check sub-command.
tiup cluster check <cluster-name> --cluster
After the command is executed, the "Region status" check result will be output.
- If the result is "All Regions are healthy", all Regions in the current cluster are healthy and you can continue the upgrade.
- If the result is "Regions are not fully healthy: m miss-peer, n pending-peer" with the "Please fix unhealthy regions before other operations." prompt, some Regions in the current cluster are abnormal. You need to troubleshoot the anomalies until the check result becomes "All Regions are healthy". Then you can continue the upgrade.
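If you script the upgrade, the check result can be gated on the healthy message quoted above. This is a minimal sketch that assumes the message appears verbatim in the command output; regions_healthy is a hypothetical helper name and my-cluster is a placeholder cluster name.

```shell
# Read the output of `tiup cluster check <cluster-name> --cluster` on stdin and
# succeed only if it contains the healthy message quoted in this step.
regions_healthy() {
    grep -q 'All Regions are healthy'
}

# Hypothetical usage ("my-cluster" is a placeholder):
#   if tiup cluster check my-cluster --cluster 2>&1 | regions_healthy; then
#       echo "Regions are healthy, continue with the upgrade"
#   else
#       echo "Fix unhealthy Regions before upgrading" >&2
#   fi
```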
Upgrade the TiDB cluster
This section describes how to upgrade the TiDB cluster and verify the version after the upgrade.
Upgrade the TiDB cluster to a specified version
You can upgrade your cluster in one of the two ways: online upgrade and offline upgrade.
By default, TiUP Cluster upgrades the TiDB cluster using the online method, which means that the TiDB cluster can still provide services during the upgrade process. With the online method, the leaders are migrated one by one on each node before the upgrade and restart. Therefore, for a large-scale cluster, it takes a long time to complete the entire upgrade operation.
If your application has a maintenance window during which the database can be stopped, you can use the offline upgrade method to perform the upgrade quickly.
Online upgrade
tiup cluster upgrade <cluster-name> <version>
For example, if you want to upgrade the cluster to v8.4.0:
tiup cluster upgrade <cluster-name> v8.4.0
Specify the component version during upgrade
Starting from tiup-cluster v1.14.0, you can fix certain components at a specific version during a cluster upgrade. These components remain at the fixed version in subsequent upgrades unless you specify a different version.
tiup cluster upgrade -h | grep "version"
--alertmanager-version string Fix the version of alertmanager and no longer follows the cluster version.
--blackbox-exporter-version string Fix the version of blackbox-exporter and no longer follows the cluster version.
--cdc-version string Fix the version of cdc and no longer follows the cluster version.
--ignore-version-check Ignore checking if target version is bigger than current version.
--node-exporter-version string Fix the version of node-exporter and no longer follows the cluster version.
--pd-version string Fix the version of pd and no longer follows the cluster version.
--tidb-dashboard-version string Fix the version of tidb-dashboard and no longer follows the cluster version.
--tiflash-version string Fix the version of tiflash and no longer follows the cluster version.
--tikv-cdc-version string Fix the version of tikv-cdc and no longer follows the cluster version.
--tikv-version string Fix the version of tikv and no longer follows the cluster version.
--tiproxy-version string Fix the version of tiproxy and no longer follows the cluster version.
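As an illustration, a command that fixes one component at its own version could be assembled as follows. This sketch only prints the command and does not run it. pinned_upgrade_cmd is a hypothetical helper name, and the cluster name and the TiProxy version are placeholders, not recommendations.

```shell
# Print (do not run) an upgrade command that fixes one component's version.
# $1 = cluster name, $2 = target cluster version,
# $3 = component prefix of the flag (for example "pd" for --pd-version),
# $4 = version to fix that component at.
pinned_upgrade_cmd() {
    printf 'tiup cluster upgrade %s %s --%s-version %s\n' "$1" "$2" "$3" "$4"
}

# Placeholder values for illustration only.
pinned_upgrade_cmd my-cluster v8.4.0 tiproxy v1.0.0
```

Check the flag names against the help output above before running the printed command.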
Offline upgrade
Before the offline upgrade, you first need to stop the entire cluster.
tiup cluster stop <cluster-name>
Use the upgrade command with the --offline option to perform the offline upgrade. Fill in the name of your cluster for <cluster-name> and the version to upgrade to for <version>, such as v8.4.0.
tiup cluster upgrade <cluster-name> <version> --offline
After the upgrade, the cluster will not be automatically restarted. You need to use the start command to restart it.
tiup cluster start <cluster-name>
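The three offline steps can be chained so that a failure in one step prevents the next from running. The following is a sketch, not an official script: run and offline_upgrade are hypothetical helper names, my-cluster is a placeholder, and the sketch defaults to a dry run that only prints the commands. Set DRY_RUN=0 to execute them.

```shell
# Run a command, or only print it when DRY_RUN=1 (the default in this sketch).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Stop the cluster, upgrade it offline, then start it again.
# Stops at the first step that fails.
# $1 = cluster name, $2 = target version (for example v8.4.0).
offline_upgrade() {
    run tiup cluster stop "$1" &&
    run tiup cluster upgrade "$1" "$2" --offline &&
    run tiup cluster start "$1"
}

# Placeholder cluster name; prints the three commands in dry-run mode.
offline_upgrade my-cluster v8.4.0
```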
Verify the cluster version
Execute the display command to view the latest cluster version, shown in the Cluster version field:
tiup cluster display <cluster-name>
Cluster type: tidb
Cluster name: <cluster-name>
Cluster version: v8.4.0
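To verify the version in a script, the Cluster version line can be extracted from the display output. This is a minimal sketch that assumes the line format shown in the sample output above; cluster_version is a hypothetical helper name and my-cluster is a placeholder.

```shell
# Read `tiup cluster display <cluster-name>` output on stdin and print the
# value of the first "Cluster version:" line.
cluster_version() {
    sed -n 's/^Cluster version:[[:space:]]*//p' | head -n 1
}

# Hypothetical usage ("my-cluster" is a placeholder):
#   [ "$(tiup cluster display my-cluster | cluster_version)" = "v8.4.0" ] \
#       && echo "upgrade verified"
```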
FAQ
This section describes common problems encountered when upgrading the TiDB cluster using TiUP.
If an error occurs and the upgrade is interrupted, how to resume the upgrade after fixing this error?
Re-execute the tiup cluster upgrade command to resume the upgrade. The upgrade operation restarts the nodes that have been previously upgraded. If you do not want the upgraded nodes to be restarted, use the replay sub-command to retry the operation:
Execute tiup cluster audit to see the operation records:
tiup cluster audit
Find the failed upgrade operation record and keep the ID of this operation record. The ID is the <audit-id> value in the next step.
Execute tiup cluster replay <audit-id> to retry the corresponding operation:
tiup cluster replay <audit-id>
How to fix the issue that the upgrade gets stuck when upgrading to v6.2.0 or later versions?
Starting from v6.2.0, TiDB enables the concurrent DDL framework by default to execute concurrent DDLs. This framework changes the DDL job storage from a KV queue to a table queue. This change might cause the upgrade to get stuck in some scenarios. The following are some scenarios that might trigger this issue and the corresponding solutions:
Upgrade gets stuck due to plugin loading
During the upgrade, loading certain plugins that require executing DDL statements might cause the upgrade to get stuck.
Solution: avoid loading plugins during the upgrade. Instead, load plugins only after the upgrade is completed.
Upgrade gets stuck due to using the kill -9 command for offline upgrade
- Precautions: avoid using the kill -9 command to perform the offline upgrade. If it is necessary, restart the new version TiDB node after 2 minutes.
- If the upgrade is already stuck, restart the affected TiDB node. If the issue has just occurred, it is recommended to restart the node after 2 minutes.
Upgrade gets stuck due to DDL Owner change
In multi-instance scenarios, network or hardware failures might cause DDL Owner change. If there are unfinished DDL statements in the upgrade phase, the upgrade might get stuck.
Solution:
- Terminate the stuck TiDB node (avoid using kill -9).
- Restart the new version TiDB node.
Evicting the leader takes too long during the upgrade. How to skip this step for a quick upgrade?
You can specify --force. Then the processes of transferring the PD leader and evicting the TiKV leaders are skipped during the upgrade. The cluster is directly restarted to update the version, which has a great impact on a cluster that is running online. In the following command, <version> is the version to upgrade to, such as v8.4.0.
tiup cluster upgrade <cluster-name> <version> --force
How to update the version of tools such as pd-ctl after upgrading the TiDB cluster?
You can upgrade the tool version by using TiUP to install the ctl component of the corresponding version:
tiup install ctl:v8.4.0