TiDB 7.4.0 Release Notes
Release date: October 12, 2023
TiDB version: 7.4.0
Quick access: Quick start
7.4.0 introduces the following key features and improvements:
Category | Feature | Description |
---|---|---|
Reliability and Availability | Improve the performance and stability of `IMPORT INTO` and `ADD INDEX` operations via global sort (experimental) | Before v7.4.0, tasks such as `ADD INDEX` or `IMPORT INTO` using the TiDB Distributed eXecution Framework (DXF) meant localized and partial sorting, which ultimately led to TiKV doing a lot of extra work to make up for the partial sorting. These jobs also required TiDB nodes to allocate local disk space for sorting before loading to TiKV. With the introduction of the Global Sort feature in v7.4.0, data is temporarily stored in external shared storage (S3 in this version) for global sorting before being loaded into TiKV. This eliminates the need for TiKV to consume extra resources and significantly improves the performance and stability of operations like `ADD INDEX` and `IMPORT INTO`. |
 | Resource control for background tasks (experimental) | In v7.1.0, the Resource Control feature was introduced to mitigate resource and storage access interference between workloads. TiDB v7.4.0 applies this control to background tasks as well. In v7.4.0, Resource Control now identifies and manages the resources produced by background tasks, such as auto-analyze, Backup & Restore, bulk load with TiDB Lightning, and online DDL. This will eventually apply to all background tasks. |
 | TiFlash supports storage-computing separation and S3 (GA) | The TiFlash disaggregated storage and compute architecture and S3 shared storage become generally available. |
SQL | TiDB supports partition type management | Before v7.4.0, Range/List partitioned tables support partition management operations such as `TRUNCATE`, `EXCHANGE`, `ADD`, `DROP`, and `REORGANIZE`, and Hash/Key partitioned tables support partition management operations such as `ADD` and `COALESCE`. Now TiDB also supports converting partitioned tables to non-partitioned tables, partitioning existing non-partitioned tables, and changing the partition type of existing partitioned tables. |
 | MySQL 8.0 compatibility: support collation `utf8mb4_0900_ai_ci` | One notable change in MySQL 8.0 is that the default character set is utf8mb4, and the default collation of utf8mb4 is `utf8mb4_0900_ai_ci`. TiDB v7.4.0 adds support for this collation, which enhances compatibility with MySQL 8.0 so that migrations and replications from MySQL 8.0 databases with the default collation are now much smoother. |
DB Operations and Observability | Specify the respective TiDB nodes to execute the `IMPORT INTO` and `ADD INDEX` SQL statements (experimental) | You have the flexibility to specify whether to execute `IMPORT INTO` or `ADD INDEX` SQL statements on some of the existing TiDB nodes or newly added TiDB nodes. This approach enables resource isolation from the rest of the TiDB nodes, preventing any impact on business operations while ensuring optimal performance for executing the preceding SQL statements. |
Feature details
Scalability
Support selecting the TiDB nodes to parallelly execute the backend `ADD INDEX` or `IMPORT INTO` tasks of the Distributed eXecution Framework (DXF) (experimental) #46453 @ywqzzy

Executing `ADD INDEX` or `IMPORT INTO` tasks in parallel in a resource-intensive cluster can consume a large amount of TiDB node resources, which can lead to cluster performance degradation. Starting from v7.4.0, you can use the system variable `tidb_service_scope` to control the service scope of each TiDB node under the TiDB Distributed eXecution Framework (DXF). You can select several existing TiDB nodes or set the TiDB service scope for new TiDB nodes, and all parallel `ADD INDEX` and `IMPORT INTO` tasks only run on these nodes. This mechanism can avoid performance impact on existing services.

For more information, see documentation.
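As a minimal sketch, assuming the `background` scope value described in the system variable table below, a TiDB node could be marked for DXF tasks like this:

```sql
-- Run on the TiDB node that should take on DXF tasks (instance-level variable).
SET GLOBAL tidb_service_scope = 'background';

-- Verify the current scope of this node.
SELECT @@global.tidb_service_scope;
```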
Enhance the Partitioned Raft KV storage engine (experimental) #11515 #12842 @busyjay @tonyxuqqi @tabokie @bufferflies @5kbpers @SpadeA-Tang @nolouch
TiDB v6.6.0 introduces the Partitioned Raft KV storage engine as an experimental feature, which uses multiple RocksDB instances to store TiKV Region data, and the data of each Region is independently stored in a separate RocksDB instance.
In v7.4.0, TiDB further improves the compatibility and stability of the Partitioned Raft KV storage engine. Through large-scale data testing, the compatibility with TiDB ecosystem tools and features such as DM, Dumpling, TiDB Lightning, TiCDC, BR, and PITR is ensured. Additionally, the Partitioned Raft KV storage engine provides more stable performance under mixed read and write workloads, making it especially suitable for write-heavy scenarios. Furthermore, each TiKV node now supports 8-core CPUs and can be configured with 8 TB of data storage and 64 GB of memory.
For more information, see documentation.
TiFlash supports the disaggregated storage and compute architecture (GA) #6882 @JaySon-Huang @JinheLin @breezewish @lidezhu @CalvinNeo @Lloyd-Pottiger
In v7.0.0, TiFlash introduces the disaggregated storage and compute architecture as an experimental feature. With a series of improvements, the disaggregated storage and compute architecture for TiFlash becomes GA starting from v7.4.0.
In this architecture, TiFlash nodes are divided into two types (Compute Nodes and Write Nodes) and support object storage that is compatible with S3 API. Both types of nodes can be independently scaled for computing or storage capacities. In the disaggregated storage and compute architecture, you can use TiFlash in the same way as the coupled storage and compute architecture, such as creating TiFlash replicas, querying data, and specifying optimizer hints.
Note that the TiFlash disaggregated storage and compute architecture and coupled storage and compute architecture cannot be used in the same cluster or converted to each other. You can configure which architecture to use when you deploy TiFlash.
For more information, see documentation.
Performance
Support pushing down the JSON operator `MEMBER OF` to TiKV #46307 @wshwsh12

- `value MEMBER OF(json_array)`
For more information, see documentation.
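A brief illustration of the operator that can now be pushed down; the table and column names are hypothetical:

```sql
-- Hypothetical table with a JSON column.
CREATE TABLE events (id BIGINT PRIMARY KEY, tags JSON);
INSERT INTO events VALUES (1, '["tidb", "release"]'), (2, '["tikv"]');

-- The MEMBER OF predicate can now be evaluated in TiKV rather than in TiDB.
SELECT id FROM events WHERE 'tidb' MEMBER OF (tags);
```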
Support pushing down window functions with any frame definition type to TiFlash #7376 @xzhangxian1008
Before v7.4.0, TiFlash does not support window functions containing `PRECEDING` or `FOLLOWING`, and all window functions containing such frame definitions cannot be pushed down to TiFlash. Starting from v7.4.0, TiFlash supports frame definitions of all window functions. This feature is enabled automatically, and window functions containing frame definitions will be automatically pushed down to TiFlash for execution when the related requirements are met.

Introduce cloud storage-based global sort capability to improve the performance and stability of `ADD INDEX` and `IMPORT INTO` tasks in parallel execution (experimental) #45719 @wjhuang2016

Before v7.4.0, when executing tasks like `ADD INDEX` or `IMPORT INTO` in the Distributed eXecution Framework (DXF), each TiDB node needs to allocate a significant amount of local disk space for sorting encoded index KV pairs and table data KV pairs. However, due to the lack of global sorting capability, there might be overlapping data between different TiDB nodes and within each individual node during the process. As a result, TiKV has to constantly perform compaction operations while importing these KV pairs into its storage engine, which impacts the performance and stability of `ADD INDEX` and `IMPORT INTO`.

In v7.4.0, TiDB introduces the Global Sort feature. Instead of writing the encoded data locally and sorting it there, the data is now written to cloud storage for global sorting. Once sorted, both the indexed data and table data are imported into TiKV in parallel, thereby improving performance and stability.
For more information, see documentation.
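A sketch of how Global Sort might be enabled for DXF tasks, assuming the `tidb_cloud_storage_uri` variable listed in the system variable table below and a hypothetical S3 bucket:

```sql
-- Point DXF tasks at shared cloud storage for global sorting (hypothetical bucket).
SET GLOBAL tidb_cloud_storage_uri = 's3://my-bucket/global-sort-tmp';

-- Subsequent distributed ADD INDEX or IMPORT INTO tasks can then sort in cloud storage.
ALTER TABLE orders ADD INDEX idx_customer (customer_id);
```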
Support caching execution plans for non-prepared statements (GA) #36598 @qw4990
TiDB v7.0.0 introduces non-prepared plan cache as an experimental feature to improve the load capacity of concurrent OLTP. In v7.4.0, this feature becomes GA. The execution plan cache will be applied to more scenarios, thereby improving the concurrent processing capacity of TiDB.
Enabling the non-prepared plan cache might incur additional memory and CPU overhead and might not be suitable for all situations. Starting from v7.4.0, this feature is disabled by default. You can enable it using `tidb_enable_non_prepared_plan_cache` and control the cache size using `tidb_session_plan_cache_size`.

Additionally, this feature does not support DML statements by default and has certain restrictions on SQL statements. For more details, see Restrictions.
For more information, see documentation.
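For example, the feature might be enabled per session as follows (the cache size is an arbitrary example value):

```sql
-- Enable the non-prepared plan cache for the current session.
SET SESSION tidb_enable_non_prepared_plan_cache = ON;
-- Cap the number of plans cached for this session (arbitrary example value).
SET SESSION tidb_session_plan_cache_size = 200;

-- Repeated general (non-prepared) statements of this shape can now reuse a cached plan.
SELECT * FROM orders WHERE customer_id = 42;
```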
Reliability
TiFlash supports query-level data spilling #7738 @windtalker
Starting from v7.0.0, TiFlash supports controlling data spilling for three operators: `GROUP BY`, `ORDER BY`, and `JOIN`. This feature prevents issues such as query termination or system crashes when the data size exceeds the available memory. However, managing spilling for each operator individually can be cumbersome and ineffective for overall resource control.

In v7.4.0, TiFlash introduces query-level data spilling. By setting the memory limit for a query on a TiFlash node using `tiflash_mem_quota_query_per_node` and the memory ratio that triggers data spilling using `tiflash_query_spill_ratio`, you can conveniently manage the memory usage of a query and have better control over TiFlash memory resources.

For more information, see documentation.
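A minimal sketch of the two new variables; the quota value is an arbitrary example:

```sql
-- Limit a single query to roughly 10 GiB of memory per TiFlash node (arbitrary value).
SET tiflash_mem_quota_query_per_node = 10737418240;
-- Start spilling once a query uses 70% of that quota.
SET tiflash_query_spill_ratio = 0.7;
```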
Support user-defined TiKV read timeout #45380 @crazycs520
Normally, TiKV processes requests very quickly, in a matter of milliseconds. However, when a TiKV node encounters disk I/O jitter or network latency, the request processing time can increase significantly. In versions earlier than v7.4.0, the timeout limit for TiKV requests is fixed and unadjustable. Hence, TiDB has to wait for a fixed-duration timeout response when a TiKV node encounters issues, which results in a noticeable impact on application query performance during jitter.
TiDB v7.4.0 introduces a new system variable `tikv_client_read_timeout`, which lets you customize the timeout for RPC read requests that TiDB sends to TiKV in a query. When a request sent to a TiKV node is delayed due to disk or network issues, TiDB can time out faster and resend the request to other TiKV nodes, thus reducing query latency. If timeouts occur for all TiKV nodes, TiDB retries using the default timeout. Additionally, you can use the optimizer hint `/*+ SET_VAR(TIKV_CLIENT_READ_TIMEOUT=N) */` in a query to set the timeout for TiDB to send a TiKV RPC read request. This enhancement gives TiDB the flexibility to adapt to unstable network or storage environments, improving query performance and enhancing the user experience.

For more information, see documentation.
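For example, a latency-sensitive query might shorten the TiKV read timeout as follows (500 is an arbitrary example value in milliseconds):

```sql
-- Session-level setting: time out TiKV RPC read requests after 500 ms (example value).
SET SESSION tikv_client_read_timeout = 500;

-- Or scope the timeout to a single statement via the SET_VAR hint.
SELECT /*+ SET_VAR(tikv_client_read_timeout=500) */ * FROM orders WHERE id = 1;
```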
Support temporarily modifying some system variable values using an optimizer hint #45892 @winoros
TiDB v7.4.0 introduces the optimizer hint `SET_VAR()`, which is similar to that of MySQL 8.0. By including the hint `SET_VAR()` in SQL statements, you can temporarily modify the value of system variables during statement execution. This helps you set the environment for different statements. For example, you can actively increase the parallelism of resource-intensive SQL statements or change the optimizer behavior through variables.

You can find the system variables that can be modified using the hint `SET_VAR()` in system variables. It is strongly recommended not to modify variables that are not explicitly supported, as this might cause unpredictable behavior.

For more information, see documentation.
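As an illustration, assuming `MAX_EXECUTION_TIME` is among the variables listed as supporting `SET_VAR()`:

```sql
-- Cap this statement's execution time at 1 second without changing the session default
-- (assumes MAX_EXECUTION_TIME supports SET_VAR; check the system variable list).
SELECT /*+ SET_VAR(MAX_EXECUTION_TIME=1000) */ COUNT(*) FROM orders;

-- The session-level value is unaffected after the statement finishes.
SELECT @@max_execution_time;
```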
TiFlash supports resource control #7660 @guo-shaoge
In TiDB v7.1.0, the resource control feature becomes generally available and provides resource management capabilities for TiDB and TiKV. In v7.4.0, TiFlash supports the resource control feature, improving the overall resource management capabilities of TiDB. The resource control of TiFlash is fully compatible with the existing TiDB resource control feature, and the existing resource groups will manage the resources of TiDB, TiKV, and TiFlash at the same time.
To control whether to enable the TiFlash resource control feature, you can configure the TiFlash parameter `enable_resource_control`. After enabling this feature, TiFlash performs resource scheduling and management based on the resource group configuration of TiDB, ensuring the reasonable allocation and use of overall resources.

For more information, see documentation.
TiFlash supports the pipeline execution model (GA) #6518 @SeaRise
Starting from v7.2.0, TiFlash introduces a pipeline execution model. This model centrally manages all thread resources and schedules task execution uniformly, maximizing the utilization of thread resources while avoiding resource overuse. In v7.4.0, TiFlash improves the statistics of thread resource usage, and the pipeline execution model becomes a GA feature and is enabled by default. Because this feature is mutually dependent with the TiFlash resource control feature, TiDB v7.4.0 removes the variable `tidb_enable_tiflash_pipeline_model` used in previous versions to control whether to enable the pipeline execution model. Instead, you can enable or disable the pipeline execution model and the TiFlash resource control feature by configuring the TiFlash parameter `enable_resource_control`.

For more information, see documentation.
Add an optimizer mode option #46080 @time-and-fate
In v7.4.0, TiDB introduces a new system variable `tidb_opt_objective`, which controls the estimation method used by the optimizer. The default value `moderate` maintains the previous behavior of the optimizer, where it uses runtime statistics to adjust estimations based on data changes. If this variable is set to `determinate`, the optimizer generates execution plans solely based on statistics without considering runtime corrections.

For long-term stable OLTP applications or situations where you are confident in the existing execution plans, it is recommended to switch to `determinate` mode after testing. This reduces potential plan changes.

For more information, see documentation.
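For example, a stable OLTP workload could opt into the new mode per session:

```sql
-- Generate plans from persisted statistics only, ignoring runtime corrections.
SET SESSION tidb_opt_objective = 'determinate';

-- Revert to the default behavior.
SET SESSION tidb_opt_objective = 'moderate';
```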
TiDB resource control supports managing background tasks (experimental) #44517 @glorv
Background tasks, such as data backup and automatic statistics collection, are low-priority but consume many resources. These tasks are usually triggered periodically or irregularly. During execution, they consume a lot of resources, thus affecting the performance of online high-priority tasks. Starting from v7.4.0, the TiDB resource control feature supports managing background tasks. This feature reduces the performance impact of low-priority tasks on online applications, enabling rational resource allocation, and greatly improving cluster stability.
TiDB supports the following types of background tasks:

- `lightning`: perform import tasks using TiDB Lightning or `IMPORT INTO`.
- `br`: perform backup and restore tasks using BR. PITR is not supported.
- `ddl`: control the resource usage during the batch data write back phase of Reorg DDLs.
- `stats`: the statistics collection tasks that are manually executed or automatically triggered by TiDB.

By default, the task types that are marked as background tasks are empty, and the management of background tasks is disabled. This default behavior is the same as that of versions prior to TiDB v7.4.0. To manage background tasks, you need to manually modify the background task types of the `default` resource group.

For more information, see documentation.
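A sketch of marking task types as background tasks on the `default` resource group; the exact `BACKGROUND` clause syntax should be confirmed against the resource control documentation:

```sql
-- Treat BR and statistics collection as background tasks governed by resource control.
-- Only the task types listed here are managed; all others keep the pre-v7.4.0 behavior.
ALTER RESOURCE GROUP `default` BACKGROUND = (TASK_TYPES = "br,stats");
```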
Lock statistics becomes generally available (GA) #46351 @hi-rustin
In v7.4.0, lock statistics becomes generally available. Now, to ensure operational security, locking and unlocking statistics require the same privileges as collecting statistics. In addition, TiDB supports locking and unlocking statistics for specific partitions, providing greater flexibility. If you are confident in queries and execution plans in the database and want to prevent any changes from occurring, you can lock statistics to enhance stability.
For more information, see documentation.
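For example (table and partition names are hypothetical; the partition-level syntax should be checked against the statistics documentation):

```sql
-- Freeze the statistics of a table so ANALYZE and auto-analyze no longer change them.
LOCK STATS orders;

-- v7.4.0 also allows locking a specific partition (hypothetical partition name).
LOCK STATS orders PARTITION p0;

-- Release the locks when plan changes are acceptable again.
UNLOCK STATS orders;
```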
Introduce a system variable to control whether to select hash joins for tables #46695 @coderplay
MySQL 8.0 introduces hash joins for tables as a new feature. This feature is primarily used to join two relatively large tables and result sets. However, for transactional workloads, or some applications running on MySQL 5.7, hash joins for tables might pose a performance risk. MySQL provides the `optimizer_switch` to control whether to select hash joins at the global or session level.

Starting from v7.4.0, TiDB introduces the system variable `tidb_opt_enable_hash_join` to control hash joins for tables. It is enabled by default (`ON`). If you are sure that you do not need to select hash joins between tables in your execution plan, you can modify the variable to `OFF` to reduce the possibility of execution plan rollbacks and improve system stability.

For more information, see documentation.
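The variable can be adjusted per session, for example:

```sql
-- Prevent the optimizer from choosing hash joins for this session
-- unless no other execution plan is available.
SET SESSION tidb_opt_enable_hash_join = OFF;

-- Restore the default behavior.
SET SESSION tidb_opt_enable_hash_join = ON;
```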
Memory control for the statistics cache is generally available (GA) #45367 @hawkingrei
TiDB instances can cache table statistics to accelerate execution plan generation and improve SQL performance. Starting from v6.1.0, TiDB introduces the system variable `tidb_stats_cache_mem_quota`. By configuring this system variable, you can set a memory usage limit for the statistics cache. When the cache reaches its limit, TiDB automatically evicts inactive cache entries, helping control instance memory usage and improve stability.

Starting from v7.4.0, this feature becomes generally available (GA).

For more information, see documentation.
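For example (the quota value is an arbitrary example):

```sql
-- Cap the statistics cache of each TiDB instance at roughly 2 GiB (arbitrary value, in bytes).
SET GLOBAL tidb_stats_cache_mem_quota = 2147483648;
```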
SQL
TiDB supports partition type management #42728 @mjonss
Before v7.4.0, partition types of partitioned tables in TiDB cannot be modified. Starting from v7.4.0, TiDB supports modifying partitioned tables to non-partitioned tables or non-partitioned tables to partitioned tables, and supports changing partition types. Hence, you can now flexibly adjust the partition type and number for a partitioned table. For example, you can use the `ALTER TABLE t PARTITION BY ...` statement to modify the partition type.

For more information, see documentation.
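A sketch with a hypothetical table `t` that has a `BIGINT` column `id`; the `REMOVE PARTITIONING` form follows MySQL syntax and should be verified in the partitioning documentation:

```sql
-- Partition a non-partitioned table (or change an existing scheme) to Key partitioning.
ALTER TABLE t PARTITION BY KEY (id) PARTITIONS 8;

-- Switch the same table to Range partitioning.
ALTER TABLE t PARTITION BY RANGE (id) (
    PARTITION p0 VALUES LESS THAN (1000000),
    PARTITION p1 VALUES LESS THAN (MAXVALUE)
);

-- Convert the partitioned table back to a non-partitioned table.
ALTER TABLE t REMOVE PARTITIONING;
```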
TiDB supports using the `ROLLUP` modifier and the `GROUPING` function #44487 @AilinKid

The `WITH ROLLUP` modifier and `GROUPING` function are commonly used in data analysis for multi-dimensional data summarization. Starting from v7.4.0, you can use the `WITH ROLLUP` modifier and `GROUPING` function in the `GROUP BY` clause. For example, you can use the `WITH ROLLUP` modifier in the `SELECT ... FROM ... GROUP BY ... WITH ROLLUP` syntax.

For more information, see documentation.
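A brief example with a hypothetical `sales` table:

```sql
-- Per-month totals plus per-year and grand-total summary rows in one query.
-- GROUPING() distinguishes the NULLs produced by ROLLUP from real NULL values.
SELECT year, month, SUM(amount) AS total,
       GROUPING(year) AS g_year, GROUPING(month) AS g_month
FROM sales
GROUP BY year, month WITH ROLLUP;
```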
DB operations
Support collation `utf8mb4_0900_ai_ci` and `utf8mb4_0900_bin` #37566 @YangKeao @zimulala @bb7133

TiDB v7.4.0 enhances the support for migrating data from MySQL 8.0 and adds two collations: `utf8mb4_0900_ai_ci` and `utf8mb4_0900_bin`. `utf8mb4_0900_ai_ci` is the default collation in MySQL 8.0.

TiDB v7.4.0 also introduces the system variable `default_collation_for_utf8mb4`, which is compatible with MySQL 8.0. This enables you to specify the default collation for the utf8mb4 character set and provides compatibility with migration or data replication from MySQL 5.7 or earlier versions.

For more information, see documentation.
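For example (the table is hypothetical; the variable can also be set per session):

```sql
-- Make new utf8mb4 columns default to the MySQL 8.0 collation.
SET GLOBAL default_collation_for_utf8mb4 = 'utf8mb4_0900_ai_ci';

-- Or specify the collation explicitly when creating a table.
CREATE TABLE users (
    name VARCHAR(64) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci
);
```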
Observability
Support adding session connection IDs and session aliases to logs #46071 @lcwangchao
When you troubleshoot a SQL execution problem, it is often necessary to correlate the contents of TiDB component logs to pinpoint the root cause. Starting from v7.4.0, TiDB can write session connection IDs (`CONNECTION_ID`) to session-related logs, including TiDB logs, slow query logs, and slow logs from the coprocessor on TiKV. You can correlate the contents of several types of logs based on session connection IDs to improve troubleshooting and diagnostic efficiency.

In addition, by setting the session-level system variable `tidb_session_alias`, you can add custom identifiers to the logs mentioned above. With this ability to inject your application identification information into the logs, you can correlate the contents of the logs with the application, build the link from the application to the logs, and reduce the difficulty of diagnosis.

TiDB Dashboard supports displaying execution plans in a table view #1589 @baurine
In v7.4.0, TiDB Dashboard supports displaying execution plans on the Slow Query and SQL Statement pages in a table view to improve the diagnosis experience.
For more information, see documentation.
Data migration
Enhance the `IMPORT INTO` feature #46704 @D3Hunter

Starting from v7.4.0, you can add the `CLOUD_STORAGE_URI` option in the `IMPORT INTO` statement to enable the Global Sort feature (experimental), which helps boost import performance and stability. In the `CLOUD_STORAGE_URI` option, you can specify a cloud storage address for the encoded data.

In addition, in v7.4.0, the `IMPORT INTO` feature introduces the following functionalities:

- Support configuring the `Split_File` option, which allows you to split a large CSV file into multiple 256 MiB small CSV files for parallel processing, improving import performance.
- Support importing compressed CSV and SQL files. The supported compression formats include `.gzip`, `.gz`, `.zstd`, `.zst`, and `.snappy`.

For more information, see documentation.
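A sketch of the enhanced statement; the bucket paths are hypothetical and the option spelling should be checked against the `IMPORT INTO` documentation:

```sql
-- Import CSV files from S3, splitting large files and sorting globally in cloud storage
-- (hypothetical bucket paths).
IMPORT INTO orders
FROM 's3://my-bucket/data/orders.*.csv'
WITH split_file, cloud_storage_uri = 's3://my-bucket/global-sort-tmp';
```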
Dumpling supports the user-defined terminator when exporting data to CSV files #46982 @GMHDBJD
Before v7.4.0, Dumpling uses `"\r\n"` as the line terminator when exporting data to a CSV file. As a result, certain downstream systems that only recognize `"\n"` as the terminator cannot parse the exported CSV file, or have to use a third-party tool for conversion before parsing the file.

Starting from v7.4.0, Dumpling introduces a new parameter `--csv-line-terminator`. This parameter allows you to specify a desired terminator when you export data to a CSV file. It supports `"\r\n"` and `"\n"`. The default terminator is `"\r\n"` to keep consistent with earlier versions.

For more information, see documentation.
TiCDC supports replicating data to Pulsar #9413 @yumchina @asddongmen
Pulsar is a cloud-native and distributed message streaming platform that significantly enhances your real-time data streaming experience. Starting from v7.4.0, TiCDC supports replicating change data to Pulsar in `canal-json` format to achieve seamless integration with Pulsar. With this feature, TiCDC provides you with the ability to easily capture and replicate TiDB changes to Pulsar, offering new possibilities for data processing and analytics capabilities. You can develop your own consumer applications that read and process newly generated change data from Pulsar to meet specific business needs.

For more information, see documentation.
TiCDC improves large message handling with claim-check pattern #9153 @3AceShowHand
Before v7.4.0, TiCDC is unable to send large messages exceeding the maximum message size (`max.message.bytes`) of Kafka to downstream. Starting from v7.4.0, when configuring a changefeed with Kafka as the downstream, you can specify an external storage location for storing the large message, and send a reference message containing the address of the large message in the external storage to Kafka. When consumers receive this reference message, they can retrieve the message content from the external storage address.

For more information, see documentation.
Compatibility changes
Behavior changes
- Starting from v7.4.0, TiDB is compatible with essential features of MySQL 8.0, and `version()` returns the version prefixed with `8.0.11`.
- After TiFlash is upgraded to v7.4.0 from an earlier version, in-place downgrading to the original version is not supported. This is because, starting from v7.4.0, TiFlash optimizes the data compaction logic of PageStorage V3 to reduce the read and write amplification generated during data compaction, which leads to changes to some of the underlying storage file names.
- A `TIDB_PARSE_TSO_LOGICAL()` function is added to allow the extraction of the logical part of the TSO timestamp.
- The `information_schema.CHECK_CONSTRAINTS` table is added for improved compatibility with MySQL 8.0.
- For transactions containing multiple changes, if the primary key or non-null unique index value is modified in the update event, TiCDC splits an event into delete and insert events and ensures that all events follow the sequence of delete events preceding insert events. For more information, see documentation.
System variables
Variable name | Change type | Description |
---|---|---|
tidb_enable_tiflash_pipeline_model | Deleted | This variable was used to control whether to enable the TiFlash pipeline execution model. Starting from v7.4.0, the TiFlash pipeline execution model is automatically enabled when the TiFlash resource control feature is enabled. |
tidb_enable_non_prepared_plan_cache | Modified | Changes the default value from ON to OFF after further tests, meaning that non-prepared execution plan cache is disabled. |
default_collation_for_utf8mb4 | Newly added | Controls the default collation for the utf8mb4 character set. The default value is utf8mb4_bin . |
tidb_cloud_storage_uri | Newly added | Specifies the cloud storage URI to enable Global Sort. |
tidb_opt_enable_hash_join | Newly added | Controls whether the optimizer will select hash joins for tables. The value is ON by default. If set to OFF , the optimizer avoids selecting a hash join of a table unless there is no other execution plan available. |
tidb_opt_objective | Newly added | This variable controls the objective of the optimizer. moderate maintains the default behavior in versions prior to TiDB v7.4.0, where the optimizer tries to use more information to generate better execution plans. determinate mode tends to be more conservative and makes the execution plan more stable. |
tidb_request_source_type | Newly added | Explicitly specifies the task type for the current session, which is identified and controlled by Resource Control. For example: SET @@tidb_request_source_type = "background" . |
tidb_schema_version_cache_limit | Newly added | This variable limits how many historical schema versions can be cached in a TiDB instance. The default value is 16 , which means that TiDB caches 16 historical schema versions by default. |
tidb_service_scope | Newly added | This variable is an instance-level system variable. You can use it to control the service scope of TiDB nodes under the TiDB Distributed eXecution Framework (DXF). When you set tidb_service_scope of a TiDB node to background , the DXF schedules that TiDB node to execute DXF tasks, such as ADD INDEX and IMPORT INTO . |
tidb_session_alias | Newly added | Controls the value of the session_alias column in the logs related to the current session. |
tiflash_mem_quota_query_per_node | Newly added | Limits the maximum memory usage for a query on a TiFlash node. When the memory usage of a query exceeds this limit, TiFlash returns an error and terminates the query. The default value is 0 , which means no limit. |
tiflash_query_spill_ratio | Newly added | Controls the threshold for TiFlash query-level spilling. The default value is 0.7 . |
tikv_client_read_timeout | Newly added | Controls the timeout for TiDB to send a TiKV RPC read request in a query. The default value 0 indicates that the default timeout (usually 40 seconds) is used. |
Configuration file parameters
Configuration file | Configuration parameter | Change type | Description |
---|---|---|---|
TiDB | enable-stats-cache-mem-quota | Modified | The default value is changed from false to true , which means the memory limit for caching TiDB statistics is enabled by default. |
TiKV | rocksdb.[defaultcf|writecf|lockcf].periodic-compaction-seconds | Modified | The default value is changed from "30d" to "0s" to disable periodic compaction of RocksDB by default. This change avoids a significant number of compactions being triggered after the TiDB upgrade, which affects the read and write performance of the frontend. |
TiKV | rocksdb.[defaultcf|writecf|lockcf].ttl | Modified | The default value is changed from "30d" to "0s" so that SST files do not trigger compactions by default due to TTL, which avoids affecting the read and write performance of the frontend. |
TiFlash | flash.compact_log_min_gap | Newly added | When the gap between the applied_index advanced by the current Raft state machine and the applied_index at the last disk spilling exceeds compact_log_min_gap , TiFlash executes the CompactLog command from TiKV and spills data to disk. |
TiFlash | profiles.default.enable_resource_control | Newly added | Controls whether to enable the TiFlash resource control feature. |
TiFlash | storage.format_version | Modified | The default value is changed from 4 to 5. The new format can reduce the number of physical files by merging smaller files. |
Dumpling | --csv-line-terminator | Newly added | Specifies the desired terminator of CSV files. This option supports "\r\n" and "\n". The default value is "\r\n", which is consistent with earlier versions. |
TiCDC | claim-check-storage-uri | Newly added | When large-message-handle-option is set to claim-check , claim-check-storage-uri must be set to a valid external storage address. Otherwise, creating a changefeed results in an error. |
TiCDC | large-message-handle-compression | Newly added | Controls whether to enable compression during encoding. The default value is empty, which means not enabled. |
TiCDC | large-message-handle-option | Modified | This configuration item adds a new value claim-check . When it is set to claim-check , TiCDC Kafka sink supports sending the message to external storage when the message size exceeds the limit and sends a message to Kafka containing the address of this large message in external storage. |
Deprecated and removed features
- Mydumper will be deprecated in v7.5.0 and most of its features have been replaced by Dumpling. It is strongly recommended that you use Dumpling instead of Mydumper.
- TiKV-importer will be deprecated in v7.5.0. It is strongly recommended that you use the Physical Import Mode of TiDB Lightning as an alternative.
- The `enable-old-value` parameter of TiCDC is removed #9667 @3AceShowHand
Improvements
TiDB
- Optimize memory usage and performance for `ANALYZE` operations on partitioned tables #47071 #47104 #46804 @hawkingrei
- Optimize memory usage and performance for statistics garbage collection #31778 @winoros
- Optimize the pushdown of `limit` for index merge intersections to improve query performance #46863 @AilinKid
- Improve the cost model to minimize the chances of mistakenly choosing a full table scan when `IndexLookup` involves many table retrieval tasks #45132 @qw4990
- Optimize the join elimination rule to improve the query performance of `join on unique keys` #46248 @fixdb
- Change the collation of multi-valued index columns to `binary` to avoid execution failure #46717 @YangKeao
TiKV
- Optimize memory usage of Resolver to prevent OOM #15458 @overvenus
- Eliminate LRUCache in Router objects to reduce memory usage and prevent OOM #15430 @Connor1996
- Reduce memory usage of TiCDC Resolver #15412 @overvenus
- Reduce memory fluctuations caused by RocksDB compaction #15324 @overvenus
- Reduce memory consumption in the flow control module of Partitioned Raft KV #15269 @overvenus
- Add the backoff mechanism for the PD client in the process of connection retries, which gradually increases retry intervals during error retries to reduce PD pressure #15428 @nolouch
- Support dynamically adjusting `background_compaction` of RocksDB #15424 @glorv
PD
- Optimize TSO tracing information for easier investigation of TSO-related issues #6856 @tiancaiamao
- Support reusing HTTP Client connections to reduce memory usage #6913 @nolouch
- Improve the speed of PD automatically updating cluster status when the backup cluster is disconnected #6883 @disksing
- Enhance the configuration retrieval method of the resource control client to dynamically fetch the latest configurations #7043 @nolouch
TiFlash
- Improve write performance during random write workloads by optimizing the spilling policy of the TiFlash write process #7564 @CalvinNeo
- Add more metrics about the Raft replication process for TiFlash #8068 @CalvinNeo
- Reduce the number of small files to avoid potential exhaustion of file system inodes #7595 @hongyunyan
Tools
Backup & Restore (BR)
- Alleviate the issue that the latency of the PITR log backup progress increases when Region leadership migration occurs #13638 @YuJuncen
- Enhance support for connection reuse of log backup and PITR restore tasks by setting `MaxIdleConns` and `MaxIdleConnsPerHost` parameters in the HTTP client #46011 @Leavrth
- Improve fault tolerance of BR when it fails to connect to PD or external S3 storage #42909 @Leavrth
- Add a new restore parameter `WaitTiflashReady`. When this parameter is enabled, the restore operation will be completed after TiFlash replicas are successfully replicated #43828 #46302 @3pointer
- Reduce the CPU overhead of log backup `resolve lock` #40759 @3pointer
TiCDC
TiDB Lightning
- Optimize the retry logic of TiDB Lightning during the Region scatter phase #46203 @mittalrishabh
- Optimize the retry logic of TiDB Lightning for the `no leader` error during the data import phase #46253 @lance6716
Bug fixes
TiDB
- Fix the issue that the `BatchPointGet` operator returns incorrect results for tables that are not hash partitioned #45889 @Defined2014
- Fix the issue that the `BatchPointGet` operator returns incorrect results for hash partitioned tables #46779 @jiyfhust
- Fix the issue that the TiDB parser remains in a state and causes parsing failure #45898 @qw4990
- Fix the issue that `EXCHANGE PARTITION` does not check constraints #45922 @mjonss
- Fix the issue that the `tidb_enforce_mpp` system variable cannot be correctly restored #46214 @djshow832
- Fix the issue that the `_` in the `LIKE` clause is incorrectly handled #46287 #46618 @Defined2014
- Fix the issue that the `schemaTs` is set to 0 when TiDB fails to obtain the schema #46325 @hihihuhu
- Fix the issue that `Duplicate entry` might occur when `AUTO_ID_CACHE=1` is set #46444 @tiancaiamao
- Fix the issue that TiDB recovers slowly after a panic when `AUTO_ID_CACHE=1` is set #46454 @tiancaiamao
- Fix the issue that the `next_row_id` in `SHOW CREATE TABLE` is incorrect when `AUTO_ID_CACHE=1` is set #46545 @tiancaiamao
- Fix the panic issue that occurs during parsing when using CTE in subqueries #45838 @djshow832
- Fix the issue that restrictions on partitioned tables remain on the original table when `EXCHANGE PARTITION` fails or is canceled #45920 #45791 @mjonss
- Fix the issue that the definition of List partitions does not support using both `NULL` and empty strings #45694 @mjonss
- Fix the issue of not being able to detect data that does not comply with partition definitions during partition exchange #46492 @mjonss
- Fix the issue that the `tmp-storage-quota` configuration does not take effect #45161 #26806 @wshwsh12
- Fix the issue that the `WEIGHT_STRING()` function does not match the collation #45725 @dveeden
- Fix the issue that an error in Index Join might cause the query to get stuck #45716 @wshwsh12
- Fix the issue that the behavior is inconsistent with MySQL when comparing a `DATETIME` or `TIMESTAMP` column with a number constant #38361 @yibin87
- Fix the incorrect result that occurs when comparing unsigned types with `Duration` type constants #45410 @wshwsh12
- Fix the issue that access path pruning logic ignores the `READ_FROM_STORAGE(TIFLASH[...])` hint, which causes the `Can't find a proper physical plan` error #40146 @AilinKid
- Fix the issue that `GROUP_CONCAT` cannot parse the `ORDER BY` column #41986 @AilinKid
- Fix the issue that HashCode is repeatedly calculated for deeply nested expressions, which causes high memory usage and OOM #42788 @AilinKid
- Fix the issue that the `cast(col)=range` condition causes FullScan when CAST has no precision loss #45199 @AilinKid
- Fix the issue that when Aggregation is pushed down through Union in MPP execution plans, the results are incorrect #45850 @AilinKid
- Fix the issue that bindings with `in (?)` cannot match `in (?, ... ?)` #44298 @qw4990
- Fix the error caused by not considering the connection collation when `non-prep plan cache` reuses the execution plan #47008 @qw4990
- Fix the issue that no warning is reported when an executed plan does not hit the plan cache #46159 @qw4990
- Fix the issue that `plan replayer dump explain` reports an error #46197 @time-and-fate
- Fix the issue that executing DML statements with CTE can cause panic #46083 @winoros
- Fix the issue that the `TIDB_INLJ` hint does not take effect when joining two sub-queries #46160 @qw4990
- Fix the issue that the results of `MERGE_JOIN` are incorrect #46580 @qw4990
TiKV
- Fix the issue that TiKV fails to start when Titan is enabled and the `Blob file deleted twice` error occurs #15454 @Connor1996
- Fix the issue of no data in the Thread Voluntary and Thread Nonvoluntary monitoring panels #15413 @SpadeA-Tang
- Fix the data error of continuously increasing raftstore-applys #15371 @Connor1996
- Fix the TiKV panic issue caused by incorrect metadata of Region #13311 @zyguan
- Fix the issue of QPS dropping to 0 after switching from `sync_recovery` to `sync` #15366 @nolouch
- Fix the issue that Online Unsafe Recovery does not abort on timeout #15346 @Connor1996
- Fix the potential memory leak issue caused by CpuRecord #15304 @overvenus
- Fix the issue that `"Error 9002: TiKV server timeout"` occurs when the backup cluster is down and the primary cluster is queried #12914 @Connor1996
- Fix the issue that the backup TiKV gets stuck when TiKV restarts after the primary cluster recovers #12320 @disksing
PD
- Fix the issue that the Region information is not updated and saved during Flashback #6912 @overvenus
- Fix the issue of slow switching of PD Leaders due to slow synchronization of store config #6918 @bufferflies
- Fix the issue that the groups are not considered in Scatter Peers #6962 @bufferflies
- Fix the issue that RU consumption less than 0 causes PD to crash #6973 @CabinfeverB
- Fix the issue that modified isolation levels are not synchronized to the default placement rules #7121 @rleungx
- Fix the issue that the client-go regularly updating `min-resolved-ts` might cause PD OOM when the cluster is large #46664 @HuSharp
TiFlash
Tools
Backup & Restore (BR)
- Fix an issue that the misleading error message `resolve lock timeout` covers up the actual error when backup fails #43236 @YuJuncen
- Fix the issue that recovering implicit primary keys using PITR might cause conflicts #46520 @3pointer
- Fix the issue that recovering meta-kv using PITR might cause errors #46578 @Leavrth
- Fix the errors in BR integration test cases #46561 @purelind
TiCDC
- Fix the issue that TiCDC accesses the invalid old address during PD scaling up and down #9584 @fubinzh @asddongmen
- Fix the issue that changefeed fails in some scenarios #9309 #9450 #9542 #9685 @hicqu @CharlesCheung96
- Fix the issue that replication write conflicts might occur when the unique keys for multiple rows are modified in one transaction on the upstream #9430 @sdojjy
- Fix the issue that a replication error occurs when multiple tables are renamed in the same DDL statement on the upstream #9476 #9488 @CharlesCheung96 @asddongmen
- Fix the issue that Chinese characters are not validated in CSV files #9609 @CharlesCheung96
- Fix the issue that upstream TiDB GC is blocked after all changefeeds are removed #9633 @sdojjy
- Fix the issue of uneven distribution of write keys among nodes when `scale-out` is enabled #9665 @sdojjy
- Fix the issue that sensitive user information is recorded in the logs #9690 @sdojjy
TiDB Data Migration (DM)
- Fix the issue that DM cannot handle conflicts correctly with case-insensitive collations #9489 @hihihuhu
- Fix the DM validator deadlock issue and enhance retries #9257 @D3Hunter
- Fix the issue that replication lag returned by DM keeps growing when a failed DDL is skipped and no subsequent DDLs are executed #9605 @D3Hunter
- Fix the issue that DM cannot properly track upstream table schemas when skipping online DDLs #9587 @GMHDBJD
- Fix the issue that DM skips all DMLs when resuming a task in optimistic mode #9588 @GMHDBJD
- Fix the issue that DM skips partition DDLs in optimistic mode #9788 @GMHDBJD
TiDB Lightning
- Fix the issue that inserting data returns an error after TiDB Lightning imports the `NONCLUSTERED auto_increment` and `AUTO_ID_CACHE=1` tables #46100 @tiancaiamao
- Fix the issue that checksum still reports errors when `checksum = "optional"` #45382 @lyzx2001
- Fix the issue that data import fails when the PD cluster address changes #43436 @lichunzhu
Contributors
We would like to thank the following contributors from the TiDB community: