TiDB 7.5.0 Release Notes
Release date: December 1, 2023
TiDB version: 7.5.0
Quick access: Quick start | Production deployment
TiDB 7.5.0 is a Long-Term Support Release (LTS).
Compared with the previous LTS 7.1.0, 7.5.0 includes new features, improvements, and bug fixes released in 7.2.0-DMR, 7.3.0-DMR, and 7.4.0-DMR. When you upgrade from 7.1.x to 7.5.0, you can download the TiDB Release Notes PDF to view all release notes between the two LTS versions. The following table lists some highlights from 7.2.0 to 7.5.0:
Feature details
Scalability
Support designating and isolating TiDB nodes to distributedly execute ADD INDEX or IMPORT INTO tasks when the Distributed eXecution Framework (DXF) is enabled #46258 @ywqzzy
Executing ADD INDEX or IMPORT INTO tasks in parallel in a resource-intensive cluster can consume a large amount of TiDB node resources, which can lead to cluster performance degradation. To avoid performance impact on existing services, v7.4.0 introduces the system variable tidb_service_scope as an experimental feature to control the service scope of each TiDB node under the TiDB Distributed eXecution Framework (DXF). You can select several existing TiDB nodes or set the TiDB service scope for new TiDB nodes, and all distributedly executed ADD INDEX and IMPORT INTO tasks only run on these nodes. In v7.5.0, this feature becomes generally available (GA).
For more information, see documentation.
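As a minimal sketch of designating a node, the statements below assume that you are connected directly to the TiDB node you want to dedicate to DXF tasks, and that 'background' is the scope value that marks a node as eligible. Verify both assumptions against the system variable documentation for your version.

```sql
-- Run while connected to the TiDB node that should execute DXF tasks.
-- Assumption: 'background' marks this node as eligible for distributed
-- ADD INDEX and IMPORT INTO tasks, and the setting applies to this node only.
SET GLOBAL tidb_service_scope = 'background';

-- Check the value on the node you are connected to.
SELECT @@global.tidb_service_scope;
```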
Performance
The TiDB Distributed eXecution Framework (DXF) becomes generally available (GA), improving the performance and stability of ADD INDEX and IMPORT INTO tasks in parallel execution #45719 @wjhuang2016
The DXF introduced in v7.1.0 has become GA. In versions before TiDB v7.1.0, only one TiDB node could execute a given DDL task at a time. Starting from v7.1.0, multiple TiDB nodes can execute the same DDL task in parallel under the DXF. Starting from v7.2.0, the DXF also lets multiple TiDB nodes execute the same IMPORT INTO task in parallel, thereby better utilizing the resources of the TiDB cluster and significantly improving the performance of DDL and IMPORT INTO tasks. In addition, you can add TiDB nodes to linearly improve the performance of these tasks.
To use the DXF, set the value of tidb_enable_dist_task to ON:
    SET GLOBAL tidb_enable_dist_task = ON;
For more information, see documentation.
Improve the performance of adding multiple indexes in a single SQL statement #41602 @tangenta
Before v7.5.0, when you added multiple indexes (ADD INDEX) in a single SQL statement, the performance was similar to adding the indexes using separate SQL statements. Starting from v7.5.0, the performance of adding multiple indexes in a single SQL statement is significantly improved. Especially in scenarios with wide tables, internal test data shows that performance can be improved by up to 94%.
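For illustration, a single statement that adds several indexes might look like the following sketch. The table, column, and index names are hypothetical and are not taken from the release notes.

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE orders (
    id BIGINT PRIMARY KEY,
    customer_id BIGINT,
    status VARCHAR(16),
    created_at DATETIME
);

-- One statement, several indexes: the form that benefits from the
-- v7.5.0 improvement described above.
ALTER TABLE orders
    ADD INDEX idx_customer (customer_id),
    ADD INDEX idx_status (status),
    ADD INDEX idx_created (created_at);
```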
DB operations
DDL jobs support pause and resume operations (GA) #18015 @godouxm
The pause and resume operations for DDL jobs introduced in v7.2.0 become generally available (GA). These operations let you pause resource-intensive DDL jobs (such as creating indexes) to save resources and minimize the impact on online traffic. When resources permit, you can seamlessly resume DDL jobs without canceling and restarting them. This feature improves resource utilization, enhances user experience, and simplifies the schema change process.
You can pause and resume multiple DDL jobs using ADMIN PAUSE DDL JOBS or ADMIN RESUME DDL JOBS:
    ADMIN PAUSE DDL JOBS 1,2;
    ADMIN RESUME DDL JOBS 1,2;
For more information, see documentation.
BR supports backing up and restoring statistics #48008 @Leavrth
Starting from TiDB v7.5.0, the br command-line tool introduces the --ignore-stats parameter to back up and restore database statistics. When you set this parameter to false, the br command-line tool supports backing up and restoring statistics of columns, indexes, and tables. In this case, you do not need to manually run the statistics collection task for the TiDB database restored from the backup, or wait for the completion of automatic collection tasks. This feature simplifies database maintenance work and improves query performance.
For more information, see documentation.
Observability
TiDB Dashboard supports heap profiling for TiKV #15927 @Connor1996
Previously, addressing TiKV OOM or high memory usage issues typically required manual execution of jeprof to generate a heap profile in the instance environment. Starting from v7.5.0, TiKV enables remote processing of heap profiles. You can now directly access the flame graph and call graph of a heap profile. This feature provides the same simple and easy-to-use experience as Go heap profiling.
For more information, see documentation.
Data migration
Support the IMPORT INTO SQL statement (GA) #46704 @D3Hunter
In v7.5.0, the IMPORT INTO SQL statement becomes generally available (GA). This statement integrates the Physical Import Mode capability of TiDB Lightning and allows you to quickly import data in formats such as CSV, SQL, and PARQUET into an empty table in TiDB. This import method eliminates the need for a separate deployment and management of TiDB Lightning, thereby reducing the complexity of data import and greatly improving import efficiency.
For more information, see documentation.
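A minimal sketch of an import into an empty table follows. The table definition and the file path are hypothetical, and the file is assumed to be a CSV with a header row that the TiDB node can read.

```sql
-- Hypothetical empty target table.
CREATE TABLE t (
    id BIGINT PRIMARY KEY,
    name VARCHAR(64)
);

-- Import a CSV file; skip_rows=1 skips the assumed header row.
IMPORT INTO t FROM '/path/to/file.csv' WITH skip_rows=1;

-- List import jobs and their status.
SHOW IMPORT JOBS;
```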
Data Migration (DM) supports blocking incompatible (data-consistency-corrupting) DDL changes #9692 @GMHDBJD
Before v7.5.0, the DM Binlog Filter feature could only migrate or filter specified events at a relatively coarse granularity. For example, it could only filter coarse-grained DDL events such as ALTER. This is limiting in some scenarios. For example, an application might allow ADD COLUMN but not DROP COLUMN, yet in earlier DM versions both are filtered as ALTER events.
To address such issues, v7.5.0 refines the granularity of the supported DDL events. For example, it supports filtering MODIFY COLUMN (modify the column data type), DROP COLUMN, and other fine-grained DDL events that lead to data loss, data truncation, or loss of precision. You can configure it as needed. This feature also supports blocking incompatible DDL changes and reporting errors for such changes, so that you can intervene manually in time to avoid impacting downstream application data.
For more information, see documentation.
Support real-time checkpoint updates for continuous data validation #8463 @lichunzhu
Before v7.5.0, the continuous data validation feature ensures the data consistency during replication from DM to downstream. This serves as the basis for cutting over business traffic from the upstream database to TiDB. However, due to various factors such as replication delay and waiting for re-validation of inconsistent data, the continuous validation checkpoint must be refreshed every few minutes. This is unacceptable for some business scenarios where the cutover time is limited to tens of seconds.
With the introduction of real-time checkpoint updates for continuous data validation, you can now provide the binlog position from the upstream database. Once the continuous validation program detects this binlog position in memory, it immediately refreshes the checkpoint instead of refreshing it every few minutes. Therefore, you can quickly perform cutover operations based on this immediately updated checkpoint.
For more information, see documentation.
Compatibility changes
System variables
Configuration file parameters
Offline package changes
Starting from v7.5.0, the following contents are removed from the TiDB-community-toolkit binary package:
- tikv-importer-{version}-linux-{arch}.tar.gz
- mydumper
- spark-{version}-any-any.tar.gz
- tispark-{version}-any-any.tar.gz
Deprecated features
Mydumper is deprecated in v7.5.0 and most of its features have been replaced by Dumpling. It is strongly recommended that you use Dumpling instead of Mydumper.
TiKV-importer is deprecated in v7.5.0. It is strongly recommended that you use the Physical Import Mode of TiDB Lightning as an alternative.
Starting from TiDB v7.5.0, technical support for the data replication feature of TiDB Binlog is no longer provided. It is strongly recommended to use TiCDC as an alternative solution for data replication. Although TiDB Binlog v7.5.0 still supports the Point-in-Time Recovery (PITR) scenario, this component will be completely deprecated in future versions. It is recommended to use PITR as an alternative solution for data recovery.
The Fast Analyze feature (experimental) for statistics is deprecated in v7.5.0.
The incremental collection feature (experimental) for statistics is deprecated in v7.5.0.
Improvements
TiDB
- Optimize the concurrency model of merging GlobalStats: introduce tidb_enable_async_merge_global_stats to enable simultaneous loading and merging of statistics, which speeds up the generation of GlobalStats on partitioned tables. Optimize the memory usage of merging GlobalStats to avoid OOM and reduce memory allocations. #47219 @hawkingrei
- Optimize the ANALYZE process: introduce tidb_build_sampling_stats_concurrency to better control the ANALYZE concurrency to reduce resource consumption. Optimize the memory usage of ANALYZE to reduce memory allocation and avoid frequent GC by reusing some intermediate results. #47275 @hawkingrei
- Optimize the use of placement policies: support configuring the range of a policy to global and improve the syntax support for common scenarios. #45384 @nolouch
- Improve the performance of adding indexes with tidb_ddl_enable_fast_reorg enabled. In internal tests, v7.5.0 improves the performance by up to 62.5% compared with v6.5.0. #47757 @tangenta
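The two statistics-related variables named in the list above can be set like any other system variable. The following is a sketch only: the concurrency value of 4 is an assumed example, not a recommendation from the release notes.

```sql
-- Load and merge partition statistics concurrently when building GlobalStats.
SET GLOBAL tidb_enable_async_merge_global_stats = ON;

-- Assumed value for illustration; tune it to the resources you can spare
-- for ANALYZE.
SET GLOBAL tidb_build_sampling_stats_concurrency = 4;
```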
TiKV
- Avoid holding mutex when writing Titan manifest files to prevent affecting other threads #15351 @Connor1996
PD
- Improve the stability and usability of the evict-slow-trend scheduler #7156 @LykxSassinato
Tools
Backup & Restore (BR)
- Add a new inter-table backup parameter table-concurrency for snapshot backups. This parameter is used to control the inter-table concurrency of meta information such as statistics backup and data validation #48571 @3pointer
- When restoring a snapshot backup, BR retries when it encounters certain network errors #48528 @Leavrth
Bug fixes
TiDB
- Prohibit split table operations on non-integer clustered indexes #47350 @tangenta
- Fix the issue of encoding time fields with incorrect timezone information #46033 @tangenta
- Fix the issue that the Sort operator might cause TiDB to crash during the spill process #47538 @windtalker
- Fix the issue that TiDB returns "Can't find column" for queries with GROUP_CONCAT #41957 @AilinKid
- Fix the panic issue of batch-client in client-go #47691 @crazycs520
- Fix the issue of incorrect memory usage estimation in INDEX_LOOKUP_HASH_JOIN #47788 @SeaRise
- Fix the issue of uneven workload caused by the rejoining of a TiFlash node that has been offline for a long time #35418 @windtalker
- Fix the issue that the chunk cannot be reused when the HashJoin operator performs probe #48082 @wshwsh12
- Fix the issue that the COALESCE() function returns incorrect result type for DATE type parameters #46475 @xzhangxian1008
- Fix the issue that UPDATE statements with subqueries are incorrectly converted to PointGet #48171 @hi-rustin
- Fix the issue that incorrect results are returned when the cached execution plans contain the comparison between date types and unix_timestamp #48165 @qw4990
- Fix the issue that an error is reported when default inline common table expressions (CTEs) with aggregate functions or window functions are referenced by recursive CTEs #47881 @elsa0520
- Fix the issue that the optimizer mistakenly selects IndexFullScan to reduce sort introduced by window functions #46177 @qw4990
- Fix the issue that multiple references to CTEs result in incorrect results due to condition pushdown of CTEs #47881 @winoros
- Fix the issue that the MySQL compression protocol cannot handle large loads of data (>=16M) #47152 #47157 #47161 @dveeden
- Fix the issue that TiDB does not read cgroup resource limits when it is started with systemd #47442 @hawkingrei
TiKV
- Fix the issue that retrying prewrite requests in the pessimistic transaction mode might cause the risk of data inconsistency in rare cases #11187 @MyonKeminta
PD
- Fix the issue that evict-leader-scheduler might lose configuration #6897 @HuSharp
- Fix the issue that after a store goes offline, the monitoring metric of its statistics is not deleted #7180 @rleungx
- Fix the issue that canSync and hasMajority might be calculated incorrectly for clusters adopting the Data Replication Auto Synchronous (DR Auto-Sync) mode when the configuration of Placement Rules is complex #7201 @disksing
- Fix the issue that the rule checker does not add Learners according to the configuration of Placement Rules #7185 @nolouch
- Fix the issue that TiDB Dashboard cannot read PD trace data correctly #7253 @nolouch
- Fix the issue that PD might panic due to empty Regions obtained internally #7261 @lhy1024
- Fix the issue that available_stores is calculated incorrectly for clusters adopting the Data Replication Auto Synchronous (DR Auto-Sync) mode #7221 @disksing
- Fix the issue that PD might delete normal Peers when TiKV nodes are unavailable #7249 @lhy1024
- Fix the issue that adding multiple TiKV nodes to a large cluster might cause TiKV heartbeat reporting to become slow or stuck #7248 @rleungx
TiFlash
- Fix the issue that the UPPER() and LOWER() functions return inconsistent results between TiDB and TiFlash #7695 @windtalker
- Fix the issue that executing queries on empty partitions causes query failure #8220 @JaySon-Huang
- Fix the panic issue caused by table creation failure when replicating TiFlash replicas #8217 @hongyunyan
Tools
Backup & Restore (BR)
TiCDC
- Fix the performance issue caused by accessing NFS directories when replicating data to an object store sink #10041 @CharlesCheung96
- Fix the issue that the storage path is misspelled when claim-check is enabled #10036 @3AceShowHand
- Fix the issue that TiCDC scheduling is not balanced in some cases #9845 @3AceShowHand
- Fix the issue that TiCDC might get stuck when replicating data to Kafka #9855 @hicqu
- Fix the issue that the TiCDC processor might panic in some cases #9849 #9915 @hicqu @3AceShowHand
- Fix the issue that enabling kv-client.enable-multiplexing causes replication tasks to get stuck #9673 @fubinzh
- Fix the issue that an owner node gets stuck due to NFS failure when the redo log is enabled #9886 @3AceShowHand
Performance test
To learn about the performance of TiDB v7.5.0, you can refer to the TPC-C performance test report and Sysbench performance test report of the TiDB Cloud Dedicated cluster.
Contributors
We would like to thank the following contributors from the TiDB community: