TiDB 7.1.0 Release Notes
Release date: May 31, 2023
TiDB version: 7.1.0
TiDB 7.1.0 is a Long-Term Support Release (LTS).
Compared with the previous LTS 6.5.0, 7.1.0 not only includes the new features, improvements, and bug fixes released in 6.6.0-DMR and 7.0.0-DMR, but also introduces the following key features and improvements:
Category | Feature | Description |
---|---|---|
Scalability and Performance | TiFlash supports the disaggregated storage and compute architecture and S3 shared storage (experimental, introduced in v7.0.0) | TiFlash introduces a cloud-native architecture as an option, combining disaggregated storage and compute with S3 shared storage. |
Scalability and Performance | TiKV supports batch aggregating data requests (introduced in v6.6.0) | This enhancement significantly reduces total RPCs in TiKV batch-get operations. In situations where data is highly dispersed and the gRPC thread pool has insufficient resources, batching coprocessor requests can improve performance by more than 50%. |
Scalability and Performance | Load-based replica read | In a read hotspot scenario, TiDB can redirect read requests for a hotspot TiKV node to its replicas. This feature efficiently scatters read hotspots and optimizes the use of cluster resources. To control the threshold for triggering load-based replica read, you can adjust the system variable tidb_load_based_replica_read_threshold. |
Scalability and Performance | TiKV supports partitioned Raft KV storage engine (experimental) | TiKV introduces a new generation of storage engine, the partitioned Raft KV. By allowing each data Region to have a dedicated RocksDB instance, it can expand the cluster's storage capacity from TB-level to PB-level and provide more stable write latency and stronger scalability. |
Reliability and availability | Resource control by resource groups (GA) | Support resource management based on resource groups, which allocates and isolates resources for different workloads in the same cluster. This feature significantly enhances the stability of multi-application clusters and lays the foundation for multi-tenancy. In v7.1.0, this feature introduces the ability to estimate system capacity based on actual workload or hardware deployment. |
Reliability and availability | TiFlash supports spill to disk (introduced in v7.0.0) | TiFlash supports intermediate result spill to disk to mitigate OOMs in data-intensive operations such as aggregations, sorts, and hash joins. |
SQL | Multi-valued indexes (GA) | Support MySQL-compatible multi-valued indexes and enhance the JSON type to improve compatibility with MySQL 8.0. This feature improves the efficiency of membership checks on multi-valued columns. |
SQL | Row-level TTL (GA in v7.0.0) | Support managing database size and improve performance by automatically expiring data of a certain age. |
SQL | Generated columns (GA) | Values in a generated column are calculated by a SQL expression in the column definition in real time. This feature pushes some application logic to the database level, thus improving query efficiency. |
Security | LDAP authentication | TiDB supports LDAP authentication, which is compatible with MySQL 8.0. |
Security | Audit log enhancement (Enterprise Edition only) | TiDB Enterprise Edition enhances the database auditing feature. It significantly improves the system auditing capacity by providing more fine-grained event filtering controls, more user-friendly filter settings, a new file output format in JSON, and lifecycle management of audit logs. |
Feature details
Performance
Enhance the Partitioned Raft KV storage engine (experimental) #11515 #12842 @busyjay @tonyxuqqi @tabokie @bufferflies @5kbpers @SpadeA-Tang @nolouch
TiDB v6.6.0 introduces the Partitioned Raft KV storage engine as an experimental feature, which uses multiple RocksDB instances to store TiKV Region data, and the data of each Region is independently stored in a separate RocksDB instance. The new storage engine can better control the number and level of files in the RocksDB instance, achieve physical isolation of data operations between Regions, and support stably managing more data. Compared with the original TiKV storage engine, using the Partitioned Raft KV storage engine can achieve about twice the write throughput and reduce the elastic scaling time by about 4/5 under the same hardware conditions and mixed read and write scenarios.
In TiDB v7.1.0, the Partitioned Raft KV storage engine supports tools such as TiDB Lightning, BR, and TiCDC.
Currently, this feature is experimental and not recommended for use in production environments. You can only use this engine in a newly created cluster and you cannot directly upgrade from the original TiKV storage engine.
For more information, see documentation.
TiFlash supports late materialization (GA) #5829 @Lloyd-Pottiger
In v7.0.0, late materialization was introduced in TiFlash as an experimental feature for optimizing query performance. In that version, the feature is disabled by default (the tidb_opt_enable_late_materialization system variable defaults to OFF). Without late materialization, when processing a SELECT statement with filter conditions (WHERE clause), TiFlash reads all the data from the columns required by the query, and then filters and aggregates the data based on the query conditions. When late materialization is enabled, TiDB supports pushing down part of the filter conditions to the TableScan operator. That is, TiFlash first scans the column data related to the filter conditions that are pushed down to the TableScan operator, filters the rows that meet the condition, and then scans the other column data of these rows for further calculation, thereby reducing the I/O scans and computation required for data processing.
Starting from v7.1.0, the TiFlash late materialization feature is generally available and enabled by default (the tidb_opt_enable_late_materialization system variable defaults to ON). The TiDB optimizer decides which filters to push down to the TableScan operator based on the statistics and the filter conditions of the query.
For more information, see documentation.
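As a minimal sketch of how the switch described above might be inspected and toggled, assuming a MySQL-compatible client connected to a v7.1.0 cluster (the session-level override is only an illustration, for example to compare plans with and without the feature):

```sql
-- Check the current value of the switch (ON by default in v7.1.0).
SHOW VARIABLES LIKE 'tidb_opt_enable_late_materialization';

-- Disable late materialization for the current session only,
-- for example to compare execution plans with and without it.
SET SESSION tidb_opt_enable_late_materialization = OFF;

-- Restore the v7.1.0 default for the current session.
SET SESSION tidb_opt_enable_late_materialization = ON;
```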
TiFlash supports automatically choosing an MPP Join algorithm according to the overhead of network transmission #7084 @solotzg
The TiFlash MPP mode supports multiple Join algorithms. Before v7.1.0, TiDB determines whether the MPP mode uses the Broadcast Hash Join algorithm based on the tidb_broadcast_join_threshold_count and tidb_broadcast_join_threshold_size variables and the actual data volume.
In v7.1.0, TiDB introduces the tidb_prefer_broadcast_join_by_exchange_data_size variable, which controls whether to choose the MPP Join algorithm based on the minimum overhead of network transmission. This variable is disabled by default, indicating that the default algorithm selection method remains the same as that before v7.1.0. You can set the variable to ON to enable it. When it is enabled, you no longer need to manually adjust the tidb_broadcast_join_threshold_count and tidb_broadcast_join_threshold_size variables (neither variable takes effect at this time). Instead, TiDB automatically estimates the amount of data that each Join algorithm would transmit over the network, and then chooses the algorithm with the smallest overall overhead, thus reducing network traffic and improving MPP query performance.
For more information, see documentation.
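A minimal sketch of enabling the new selection method for a single session, assuming you want to try it before changing any cluster-wide setting:

```sql
-- Enable choosing the MPP Join algorithm by estimated exchange data size
-- for the current session only (the variable is OFF by default).
SET SESSION tidb_prefer_broadcast_join_by_exchange_data_size = ON;

-- Confirm the value seen by the current session.
SHOW VARIABLES LIKE 'tidb_prefer_broadcast_join_by_exchange_data_size';
```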
Support load-based replica read to mitigate read hotspots #14151 @sticnarf @you06
In a read hotspot scenario, the hotspot TiKV node cannot process read requests in time, resulting in the read requests queuing. However, not all TiKV resources are exhausted at this time. To reduce latency, TiDB v7.1.0 introduces the load-based replica read feature, which allows TiDB to read data from other TiKV nodes without queuing on the hotspot TiKV node. You can set the threshold for triggering load-based replica read using the tidb_load_based_replica_read_threshold system variable. When the estimated queue time of the leader node exceeds this threshold, TiDB prioritizes reading data from follower nodes. This feature can improve read throughput by 70% to 200% in a read hotspot scenario compared to not scattering read hotspots.
For more information, see documentation.
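The threshold can be adjusted like any other system variable. The sketch below uses '2s' purely as an illustrative value, not as a recommendation:

```sql
-- Inspect the current cluster-wide threshold ("1s" by default in v7.1.0).
SHOW GLOBAL VARIABLES LIKE 'tidb_load_based_replica_read_threshold';

-- Illustrative value only: prefer follower reads once the estimated
-- queue time on the leader exceeds two seconds.
SET GLOBAL tidb_load_based_replica_read_threshold = '2s';
```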
Enhance the capability of caching execution plans for non-prepared statements (experimental) #36598 @qw4990
TiDB v7.0.0 introduces non-prepared plan cache as an experimental feature to improve the load capacity of concurrent OLTP. In v7.1.0, TiDB enhances this feature and supports caching more SQL statements.
To improve memory utilization, TiDB v7.1.0 merges the cache pools of non-prepared and prepared plan caches. You can control the cache size using the system variable tidb_session_plan_cache_size. The tidb_prepared_plan_cache_size and tidb_non_prepared_plan_cache_size system variables are deprecated.
To maintain forward compatibility, when you upgrade from an earlier version to v7.1.0 or later versions, the cache size tidb_session_plan_cache_size remains the same value as tidb_prepared_plan_cache_size, and tidb_enable_non_prepared_plan_cache remains the setting before the upgrade. After sufficient performance testing, you can enable non-prepared plan cache using tidb_enable_non_prepared_plan_cache. For a newly created cluster, non-prepared plan cache is enabled by default.
Non-prepared plan cache does not support DML statements by default. To remove this restriction, you can set the tidb_enable_non_prepared_plan_cache_for_dml system variable to ON.
For more information, see documentation.
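A minimal sketch of trying non-prepared plan cache in one session. The table t and its column a are hypothetical, and the cache size of 200 is only an illustrative value:

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE t (a INT);

-- Size of the shared (prepared + non-prepared) plan cache for this session.
SET SESSION tidb_session_plan_cache_size = 200;

-- Enable non-prepared plan cache, and optionally its DML support.
SET SESSION tidb_enable_non_prepared_plan_cache = ON;
SET SESSION tidb_enable_non_prepared_plan_cache_for_dml = ON;

-- Run two statements that differ only in the literal value.
SELECT * FROM t WHERE a < 1;
SELECT * FROM t WHERE a < 2;

-- Reports whether the previous statement reused a cached plan.
SELECT @@last_plan_from_cache;
```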
Support the TiDB Distributed eXecution Framework (DXF) (experimental) #41495 @benjamin2037
Before TiDB v7.1.0, only one TiDB node can serve as the DDL owner and execute DDL tasks at the same time. Starting from TiDB v7.1.0, in the new DXF, multiple TiDB nodes can execute the same DDL task in parallel, thus better utilizing the resources of the TiDB cluster and significantly improving the performance of DDL. In addition, you can linearly improve the performance of DDL by adding more TiDB nodes. Note that this feature is currently experimental and only supports ADD INDEX operations.
To use the DXF, set the value of tidb_enable_dist_task to ON:
SET GLOBAL tidb_enable_dist_task = ON;
For more information, see documentation.
Reliability
Resource Control becomes generally available (GA) #38825 @nolouch @BornChanger @glorv @tiancaiamao @Connor1996 @JmPotato @hnes @CabinfeverB @HuSharp
TiDB enhances the resource control feature based on resource groups, which becomes GA in v7.1.0. This feature significantly improves the resource utilization efficiency and performance of TiDB clusters. The introduction of the resource control feature is a milestone for TiDB. You can divide a distributed database cluster into multiple logical units, map different database users to corresponding resource groups, and set the quota for each resource group as needed. When the cluster resources are limited, all resources used by sessions in the same resource group are limited to the quota. In this way, even if a resource group is over-consumed, the sessions in other resource groups are not affected.
With this feature, you can combine multiple small and medium-sized applications from different systems into a single TiDB cluster. When the workload of an application grows larger, it does not affect the normal operation of other applications. When the system workload is low, busy applications can still be allocated the required system resources even if they exceed the set quotas, which can achieve the maximum utilization of resources. In addition, the rational use of the resource control feature can reduce the number of clusters, ease the difficulty of operation and maintenance, and save management costs.
In TiDB v7.1.0, this feature introduces the ability to estimate system capacity based on actual workload or hardware deployment. The estimation ability provides you with a more accurate reference for capacity planning and assists you in better managing TiDB resource allocation to meet the stability needs of enterprise-level scenarios.
To improve user experience, TiDB Dashboard provides the Resource Manager page. You can view the resource group configuration on this page and estimate cluster capacity in a visual way to facilitate reasonable resource allocation.
For more information, see documentation.
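The workflow described above (create resource groups, bind users to them, then estimate capacity) can be sketched as follows. All group names, user names, passwords, and quota values are hypothetical and chosen only for illustration:

```sql
-- Two hypothetical resource groups with different quotas,
-- expressed in Request Units (RU) per second.
CREATE RESOURCE GROUP IF NOT EXISTS rg_oltp RU_PER_SEC = 500;

-- BURSTABLE lets the group exceed its quota when spare resources exist.
CREATE RESOURCE GROUP IF NOT EXISTS rg_batch RU_PER_SEC = 200 BURSTABLE;

-- Bind a new user to a resource group at creation time.
CREATE USER 'app_oltp'@'%' IDENTIFIED BY 'change_me' RESOURCE GROUP rg_oltp;

-- Rebind an existing user to a different resource group.
ALTER USER 'app_oltp'@'%' RESOURCE GROUP rg_batch;

-- Review the configured groups.
SELECT * FROM information_schema.resource_groups;

-- Ask TiDB for a capacity estimate of the cluster.
CALIBRATE RESOURCE;
```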
Support the checkpoint mechanism for Fast Online DDL to improve fault tolerance and automatic recovery capability #42164 @tangenta
TiDB v7.1.0 introduces a checkpoint mechanism for Fast Online DDL, which significantly improves the fault tolerance and automatic recovery capability of Fast Online DDL. Even if the TiDB owner node is restarted or changed due to failures, TiDB can still recover progress from checkpoints that are automatically updated on a regular basis, making the DDL execution more stable and efficient.
For more information, see documentation.
Backup & Restore supports checkpoint restore #42339 @Leavrth
Snapshot restore or log restore might be interrupted due to recoverable errors, such as disk exhaustion and node crash. Before TiDB v7.1.0, the recovery progress before the interruption would be invalidated even after the error is addressed, and you need to start the restore from scratch. For large clusters, this incurs considerable extra cost.
Starting from TiDB v7.1.0, Backup & Restore (BR) introduces the checkpoint restore feature, which enables you to continue an interrupted restore. This feature can retain most recovery progress of the interrupted restore.
For more information, see documentation.
Optimize the strategy of loading statistics #42160 @xuyifangreeneyes
TiDB v7.1.0 introduces lightweight statistics initialization as an experimental feature. Lightweight statistics initialization can significantly reduce the number of statistics that must be loaded during startup, thus improving the speed of loading statistics. This feature increases the stability of TiDB in complex runtime environments and reduces the impact on the overall service when TiDB nodes restart. You can set the parameter lite-init-stats to true to enable this feature.
During TiDB startup, SQL statements executed before the initial statistics are fully loaded might have suboptimal execution plans, thus causing performance issues. To avoid such issues, TiDB v7.1.0 introduces the configuration parameter force-init-stats. With this option, you can control whether TiDB provides services only after statistics initialization has been finished during startup. This parameter is disabled by default.
For more information, see documentation.
TiCDC supports the data integrity validation feature for single-row data #8718 #42747 @3AceShowHand @zyguan
Starting from v7.1.0, TiCDC introduces the data integrity validation feature, which uses a checksum algorithm to validate the integrity of single-row data. This feature helps verify whether any error occurs in the process of writing data from TiDB, replicating it through TiCDC, and then writing it to a Kafka cluster. The data integrity validation feature only supports changefeeds that use Kafka as the downstream and currently supports the Avro protocol.
For more information, see documentation.
TiCDC optimizes DDL replication operations #8686 @hi-rustin
Before v7.1.0, when you perform a DDL operation that affects all rows on a large table (such as adding or deleting a column), the replication latency of TiCDC would significantly increase. Starting from v7.1.0, TiCDC optimizes this replication operation and mitigates the impact of DDL operations on downstream latency.
For more information, see documentation.
Improve the stability of TiDB Lightning when importing TiB-level data #43510 #43657 @D3Hunter @lance6716
Starting from v7.1.0, TiDB Lightning adds four configuration items to improve stability when importing TiB-level data:
- tikv-importer.region-split-batch-size controls the number of Regions when splitting Regions in a batch. The default value is 4096.
- tikv-importer.region-split-concurrency controls the concurrency when splitting Regions. The default value is the number of CPU cores.
- tikv-importer.region-check-backoff-limit controls the number of retries to wait for the Region to come online after the split and scatter operations. The default value is 1800 and the maximum retry interval is two seconds. The number of retries is not increased if any Region becomes online between retries.
- tikv-importer.pause-pd-scheduler-scope controls the scope in which TiDB Lightning pauses PD scheduling. Value options are "table" and "global". The default value is "table". For TiDB versions earlier than v6.1.0, you can only configure the "global" option, which pauses global scheduling during data import. Starting from v6.1.0, the "table" option is supported, which means that scheduling is only paused for the Region that stores the target table data. It is recommended to set this configuration item to "global" in scenarios with large data volumes to improve stability.
For more information, see documentation.
SQL
Support saving TiFlash query results using the INSERT INTO SELECT statement (GA) #37515 @gengliqi
Starting from v6.5.0, TiDB supports pushing down the SELECT clause (analytical query) of the INSERT INTO SELECT statement to TiFlash. In this way, you can easily save the TiFlash query result to a TiDB table specified by INSERT INTO for further analysis, which takes effect as result caching (that is, result materialization).
In v7.1.0, this feature is generally available. During the execution of the SELECT clause in the INSERT INTO SELECT statement, the optimizer can intelligently decide whether to push a query down to TiFlash based on the SQL mode and the cost estimates of the TiFlash replica. Therefore, the tidb_enable_tiflash_read_for_write_stmt system variable introduced during the experimental phase is now deprecated. Note that the computation rules of INSERT INTO SELECT statements for TiFlash do not meet the STRICT SQL Mode requirement, so TiDB allows the SELECT clause in the INSERT INTO SELECT statement to be pushed down to TiFlash only when the SQL mode of the current session is not strict, which means that the sql_mode value contains neither STRICT_TRANS_TABLES nor STRICT_ALL_TABLES.
For more information, see documentation.
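A minimal sketch of the pattern described above. The tables sales and sales_summary are hypothetical, and whether the SELECT clause is actually pushed down to TiFlash remains the optimizer's decision:

```sql
-- Hypothetical source table with a TiFlash replica.
CREATE TABLE sales (region VARCHAR(32), amount DECIMAL(12, 2));
ALTER TABLE sales SET TIFLASH REPLICA 1;

-- Hypothetical target table that stores the materialized result.
CREATE TABLE sales_summary (region VARCHAR(32), total DECIMAL(20, 2));

-- Use a non-strict SQL mode for this session: the value contains
-- neither STRICT_TRANS_TABLES nor STRICT_ALL_TABLES.
SET SESSION sql_mode = 'NO_ENGINE_SUBSTITUTION';

-- Inspect the plan to see whether the SELECT part is pushed down to TiFlash.
EXPLAIN INSERT INTO sales_summary
SELECT region, SUM(amount) FROM sales GROUP BY region;

-- Save the analytical query result into the TiDB table.
INSERT INTO sales_summary
SELECT region, SUM(amount) FROM sales GROUP BY region;
```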
MySQL-compatible multi-valued indexes become generally available (GA) #39592 @xiongjiwei @qw4990 @YangKeao
Filtering the values of an array in a JSON column is a common operation, but normal indexes cannot help speed up such an operation. Creating a multi-valued index on an array can greatly improve filtering performance. If an array in the JSON column has a multi-valued index, you can use the multi-valued index to filter retrieval conditions in MEMBER OF(), JSON_CONTAINS(), and JSON_OVERLAPS() functions, thereby reducing I/O consumption and improving operation speed.
In v7.1.0, the multi-valued indexes feature becomes generally available (GA). It supports more complete data types and is compatible with TiDB tools. You can use multi-valued indexes to speed up the search operations on JSON arrays in production environments.
For more information, see documentation.
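As an illustration, the following sketch creates a multi-valued index on a JSON array and queries it with the three functions named above. The customers table, its columns, and the sample zip codes are hypothetical:

```sql
-- Hypothetical table: custinfo holds a JSON document whose
-- "zipcode" field is an array of integers.
CREATE TABLE customers (
    id BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    name CHAR(10),
    custinfo JSON,
    -- Multi-valued index over the elements of the zipcode array.
    INDEX zips ((CAST(custinfo->'$.zipcode' AS UNSIGNED ARRAY)))
);

INSERT INTO customers (name, custinfo)
VALUES ('pingcap', '{"zipcode": [94507, 94582]}');

-- Membership check on a single value.
SELECT * FROM customers WHERE 94507 MEMBER OF (custinfo->'$.zipcode');

-- Rows whose array contains all of the listed values.
SELECT * FROM customers
WHERE JSON_CONTAINS(custinfo->'$.zipcode', CAST('[94507, 94582]' AS JSON));

-- Rows whose array shares at least one value with the list.
SELECT * FROM customers
WHERE JSON_OVERLAPS(custinfo->'$.zipcode', CAST('[94507, 10001]' AS JSON));
```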
Improve the partition management for Hash and Key partitioned tables #42728 @mjonss
Before v7.1.0, Hash and Key partitioned tables in TiDB only support the TRUNCATE PARTITION partition management statement. Starting from v7.1.0, Hash and Key partitioned tables also support ADD PARTITION and COALESCE PARTITION partition management statements. Therefore, you can flexibly adjust the number of partitions in Hash and Key partitioned tables as needed. For example, you can increase the number of partitions with the ADD PARTITION statement, or decrease the number of partitions with the COALESCE PARTITION statement.
For more information, see documentation.
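A short sketch of both statements on a hypothetical Hash partitioned table named events:

```sql
-- Hypothetical table that starts with 4 Hash partitions.
CREATE TABLE events (
    id INT PRIMARY KEY,
    payload VARCHAR(64)
) PARTITION BY HASH (id) PARTITIONS 4;

-- Add 2 partitions: the table now has 6 partitions.
ALTER TABLE events ADD PARTITION PARTITIONS 2;

-- Remove 3 partitions: the table now has 3 partitions.
ALTER TABLE events COALESCE PARTITION 3;
```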
The syntax of Range INTERVAL partitioning becomes generally available (GA) #35683 @mjonss
The syntax of Range INTERVAL partitioning (introduced in v6.3.0) becomes GA. With this syntax, you can define Range partitioning by a desired interval without enumerating all partitions, which drastically reduces the length of Range partitioning DDL statements. The syntax is equivalent to that of the original Range partitioning.
For more information, see documentation.
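For illustration, the following sketch defines a Range partitioned table with the INTERVAL syntax. The employees table and its columns are hypothetical:

```sql
-- Partitions are generated every 100 ids, from "LESS THAN (100)"
-- up to "LESS THAN (10000)", plus a final MAXVALUE partition,
-- without listing each partition explicitly.
CREATE TABLE employees (
    id INT UNSIGNED NOT NULL,
    fname VARCHAR(30),
    lname VARCHAR(30),
    hired DATE NOT NULL DEFAULT '1970-01-01',
    store_id INT
) PARTITION BY RANGE (id)
INTERVAL (100) FIRST PARTITION LESS THAN (100) LAST PARTITION LESS THAN (10000) MAXVALUE PARTITION;

-- Inspect the expanded partition definitions.
SHOW CREATE TABLE employees;
```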
Generated columns become generally available (GA) @bb7133
Generated columns are a valuable feature for a database. When creating a table, you can define that the value of a column is calculated based on the values of other columns in the table, rather than being explicitly inserted or updated by users. This generated column can be either a virtual column or a stored column. TiDB has supported MySQL-compatible generated columns since earlier versions, and this feature becomes GA in v7.1.0.
Using generated columns can improve MySQL compatibility for TiDB, simplifying the process of migrating from MySQL. It also reduces data maintenance complexity and improves data consistency and query efficiency.
For more information, see documentation.
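A minimal sketch of a virtual generated column extracted from a JSON document and then indexed. The person table, its columns, and the sample data are hypothetical:

```sql
-- city is computed from address_info and is never written directly.
CREATE TABLE person (
    id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(255) NOT NULL,
    address_info JSON,
    city VARCHAR(64) AS (JSON_UNQUOTE(JSON_EXTRACT(address_info, '$.city'))) VIRTUAL,
    KEY (city)
);

INSERT INTO person (name, address_info)
VALUES ('Morgan', '{"city": "Beijing", "street": "Example Road"}');

-- The filter can use the index on the generated column.
SELECT name, id FROM person WHERE city = 'Beijing';
```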
DB operations
Support smooth cluster upgrade without manually canceling DDL operations (experimental) #39751 @zimulala
Before TiDB v7.1.0, to upgrade a cluster, you must manually cancel its running or queued DDL tasks before the upgrade and then add them back after the upgrade.
To provide a smoother upgrade experience, TiDB v7.1.0 supports automatically pausing and resuming DDL tasks. Starting from v7.1.0, you can upgrade your clusters without manually canceling DDL tasks in advance. TiDB will automatically pause any running or queued user DDL tasks before the upgrade and resume these tasks after the rolling upgrade, making it easier for you to upgrade your TiDB clusters.
For more information, see documentation.
Observability
Enhance optimizer diagnostic information #43122 @time-and-fate
Obtaining sufficient information is the key to SQL performance diagnostics. In v7.1.0, TiDB continues to add optimizer runtime information to various diagnostic tools, providing better insights into how execution plans are selected and assisting in troubleshooting SQL performance issues. The new information includes:
- debug_trace.json in the output of PLAN REPLAYER.
- Partial statistics details for operator info in the output of EXPLAIN.
- Partial statistics details in the Stats field of slow queries.
For more information, see Use PLAN REPLAYER to save and restore the on-site information of a cluster, EXPLAIN walkthrough, and Identify slow queries.
Security
Replace the interface used for querying TiFlash system table information #6941 @flowbehappy
Starting from v7.1.0, when providing the query service of INFORMATION_SCHEMA.TIFLASH_TABLES and INFORMATION_SCHEMA.TIFLASH_SEGMENTS system tables for TiDB, TiFlash uses the gRPC port instead of the HTTP port, which avoids the security risks of the HTTP service.
Support LDAP authentication #43580 @YangKeao
Starting from v7.1.0, TiDB supports LDAP authentication and provides two authentication plugins: authentication_ldap_sasl and authentication_ldap_simple.
For more information, see documentation.
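A minimal sketch of setting up LDAP simple authentication with the system variables added in this release. The host name, port, base DN, and user name are hypothetical placeholders for your own LDAP deployment:

```sql
-- Point TiDB at a hypothetical LDAP server.
SET GLOBAL authentication_ldap_simple_server_host = 'ldap.example.com';
SET GLOBAL authentication_ldap_simple_server_port = 389;

-- Limit the search scope used when a user is created without an AS clause.
SET GLOBAL authentication_ldap_simple_bind_base_dn = 'ou=People,dc=example,dc=com';

-- Create a user that authenticates against the given LDAP entry.
CREATE USER 'alice'@'%'
    IDENTIFIED WITH authentication_ldap_simple
    AS 'uid=alice,ou=People,dc=example,dc=com';
```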
Enhance the database auditing feature (Enterprise Edition)
In v7.1.0, TiDB Enterprise Edition enhances the database auditing feature, which significantly expands its capacity and improves the user experience to meet the needs of enterprises for database security compliance:
- Introduce the concepts of "Filter" and "Rule" for more granular audit event definitions and more fine-grained audit settings.
- Support defining rules in JSON format, providing a more user-friendly configuration method.
- Add automatic log rotation and space management functions, and support configuring log rotation in two dimensions: retention time and log size.
- Support outputting audit logs in both TEXT and JSON formats, facilitating easier integration with third-party tools.
- Support audit log redaction. You can replace all literals to enhance security.
Database auditing is an important feature in TiDB Enterprise Edition. This feature provides a powerful monitoring and auditing tool for enterprises to ensure data security and compliance. It can help enterprise managers in tracking the source and impact of database operations to prevent illegal data theft or tampering. Furthermore, database auditing can also help enterprises meet various regulatory and compliance requirements, ensuring legal and ethical compliance. This feature has important application value for enterprise information security.
For more information, see user guide. This feature is included in TiDB Enterprise Edition. To use this feature, navigate to the TiDB Enterprise page to get TiDB Enterprise Edition.
Compatibility changes
Behavior changes
- To improve security, TiFlash deprecates the HTTP service port (default 8123) and uses the gRPC port as a replacement.
If you have upgraded TiFlash to v7.1.0, then during the TiDB upgrade to v7.1.0, TiDB cannot read the TiFlash system tables (INFORMATION_SCHEMA.TIFLASH_TABLES and INFORMATION_SCHEMA.TIFLASH_SEGMENTS).
- TiDB Lightning in TiDB versions from v6.2.0 to v7.0.0 decides whether to pause global scheduling based on the TiDB cluster version. When the TiDB cluster version is v6.1.0 or later, scheduling is only paused for the Region that stores the target table data and is resumed after the target table import is complete. For other versions, TiDB Lightning pauses global scheduling. Starting from TiDB v7.1.0, you can control whether to pause global scheduling by configuring pause-pd-scheduler-scope. By default, TiDB Lightning pauses scheduling for the Region that stores the target table data. If the target cluster version is earlier than v6.1.0, an error occurs. In this case, you can change the value of the parameter to "global" and try again.
- When you use FLASHBACK CLUSTER TO TIMESTAMP in TiDB v7.1.0, some Regions might remain in the FLASHBACK process even after the completion of the FLASHBACK operation. It is recommended to avoid using this feature in v7.1.0. For more information, see issue #44292. If you have encountered this issue, you can use the TiDB snapshot backup and restore feature to restore data.
System variables
Variable name | Change type | Description |
---|---|---|
tidb_enable_tiflash_read_for_write_stmt | Deprecated | Changes the default value from OFF to ON . When tidb_allow_mpp = ON , the optimizer intelligently decides whether to push a query down to TiFlash based on the SQL mode and the cost estimates of the TiFlash replica. |
tidb_non_prepared_plan_cache_size | Deprecated | Starting from v7.1.0, this system variable is deprecated. You can use tidb_session_plan_cache_size to control the maximum number of plans that can be cached. |
tidb_prepared_plan_cache_size | Deprecated | Starting from v7.1.0, this system variable is deprecated. You can use tidb_session_plan_cache_size to control the maximum number of plans that can be cached. |
tidb_ddl_distribute_reorg | Deleted | This variable is renamed to tidb_enable_dist_task . |
default_authentication_plugin | Modified | Introduces two new value options: authentication_ldap_sasl and authentication_ldap_simple . |
tidb_load_based_replica_read_threshold | Modified | Takes effect starting from v7.1.0 and controls the threshold for triggering load-based replica read. Changes the default value from "0s" to "1s" after further tests. |
tidb_opt_enable_late_materialization | Modified | Changes the default value from OFF to ON , meaning that the TiFlash late materialization feature is enabled by default. |
authentication_ldap_sasl_auth_method_name | Newly added | Specifies the authentication method name in LDAP SASL authentication. |
authentication_ldap_sasl_bind_base_dn | Newly added | Limits the search scope within the search tree in LDAP SASL authentication. If a user is created without an AS ... clause, TiDB automatically searches for the dn in the LDAP server according to the user name. |
authentication_ldap_sasl_bind_root_dn | Newly added | Specifies the dn used to log in to the LDAP server to search users in LDAP SASL authentication. |
authentication_ldap_sasl_bind_root_pwd | Newly added | Specifies the password used to log in to the LDAP server to search users in LDAP SASL authentication. |
authentication_ldap_sasl_ca_path | Newly added | Specifies the absolute path of the certificate authority file for StartTLS connections in LDAP SASL authentication. |
authentication_ldap_sasl_init_pool_size | Newly added | Specifies the initial connections in the connection pool to the LDAP server in LDAP SASL authentication. |
authentication_ldap_sasl_max_pool_size | Newly added | Specifies the maximum connections in the connection pool to the LDAP server in LDAP SASL authentication. |
authentication_ldap_sasl_server_host | Newly added | Specifies the LDAP server host in LDAP SASL authentication. |
authentication_ldap_sasl_server_port | Newly added | Specifies the LDAP server TCP/IP port number in LDAP SASL authentication. |
authentication_ldap_sasl_tls | Newly added | Specifies whether connections by the plugin to the LDAP server are protected with StartTLS in LDAP SASL authentication. |
authentication_ldap_simple_auth_method_name | Newly added | Specifies the authentication method name in LDAP simple authentication. It only supports SIMPLE . |
authentication_ldap_simple_bind_base_dn | Newly added | Limits the search scope within the search tree in LDAP simple authentication. If a user is created without an AS ... clause, TiDB automatically searches for the dn in the LDAP server according to the user name. |
authentication_ldap_simple_bind_root_dn | Newly added | Specifies the dn used to log in to the LDAP server to search users in LDAP simple authentication. |
authentication_ldap_simple_bind_root_pwd | Newly added | Specifies the password used to log in to the LDAP server to search users in LDAP simple authentication. |
authentication_ldap_simple_ca_path | Newly added | Specifies the absolute path of the certificate authority file for StartTLS connections in LDAP simple authentication. |
authentication_ldap_simple_init_pool_size | Newly added | Specifies the initial connections in the connection pool to the LDAP server in LDAP simple authentication. |
authentication_ldap_simple_max_pool_size | Newly added | Specifies the maximum connections in the connection pool to the LDAP server in LDAP simple authentication. |
authentication_ldap_simple_server_host | Newly added | Specifies the LDAP server host in LDAP simple authentication. |
authentication_ldap_simple_server_port | Newly added | Specifies the LDAP server TCP/IP port number in LDAP simple authentication. |
authentication_ldap_simple_tls | Newly added | Specifies whether connections by the plugin to the LDAP server are protected with StartTLS in LDAP simple authentication. |
tidb_enable_dist_task | Newly added | Controls whether to enable the Distributed eXecution Framework (DXF). After enabling the DXF, DDL, import, and other supported DXF tasks will be jointly completed by multiple TiDB nodes in the cluster. This variable was renamed from tidb_ddl_distribute_reorg . |
tidb_enable_non_prepared_plan_cache_for_dml | Newly added | Controls whether to enable the Non-prepared plan cache feature for DML statements. |
tidb_enable_row_level_checksum | Newly added | Controls whether to enable the TiCDC data integrity validation for single-row data feature. |
tidb_opt_fix_control | Newly added | This variable provides more fine-grained control over the optimizer and helps to prevent performance regression after upgrading caused by behavior changes in the optimizer. |
tidb_plan_cache_invalidation_on_fresh_stats | Newly added | Controls whether to invalidate the plan cache automatically when statistics on related tables are updated. |
tidb_plan_cache_max_plan_size | Newly added | Controls the maximum size of a plan that can be cached in prepared or non-prepared plan cache. |
tidb_prefer_broadcast_join_by_exchange_data_size | Newly added | Controls whether to use the algorithm with the minimum overhead of network transmission. If this variable is enabled, TiDB estimates the size of the data to be exchanged in the network using Broadcast Hash Join and Shuffled Hash Join respectively, and then chooses the one with the smaller size. tidb_broadcast_join_threshold_count and tidb_broadcast_join_threshold_size will not take effect after this variable is enabled. |
tidb_session_plan_cache_size | Newly added | Controls the maximum number of plans that can be cached. Prepared plan cache and non-prepared plan cache share the same cache. |
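The choice described for tidb_prefer_broadcast_join_by_exchange_data_size can be illustrated with a minimal sketch. The cost model here is an assumption made for illustration only (Broadcast Hash Join copies the build side to every node, Shuffled Hash Join repartitions both sides once), and the names JoinSide, estimate_exchange_bytes, and choose_join are hypothetical; the estimation that TiDB actually performs might differ.

```python
from dataclasses import dataclass


@dataclass
class JoinSide:
    """Hypothetical description of one join input."""
    name: str
    size_bytes: int


def estimate_exchange_bytes(build: JoinSide, probe: JoinSide, node_count: int) -> dict:
    """Estimate the bytes sent over the network by each join algorithm.

    Assumed model (not taken from the TiDB source code):
    - broadcast: the build side is copied to every node
    - shuffle: both sides are repartitioned once
    """
    return {
        "broadcast": build.size_bytes * node_count,
        "shuffle": build.size_bytes + probe.size_bytes,
    }


def choose_join(build: JoinSide, probe: JoinSide, node_count: int) -> str:
    """Pick the algorithm with the smaller estimated exchange size."""
    costs = estimate_exchange_bytes(build, probe, node_count)
    return min(costs, key=costs.get)


if __name__ == "__main__":
    small = JoinSide("dim", 10 * 1024**2)
    big = JoinSide("fact", 1024**3)
    print(choose_join(small, big, node_count=3))
```

Under this assumed model, a small build side favors the broadcast algorithm, while a build side close in size to the probe side favors the shuffled algorithm as the node count grows.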
Configuration file parameters
Configuration file | Configuration parameter | Change type | Description |
---|---|---|---|
TiDB | performance.force-init-stats | Newly added | Controls whether to wait for statistics initialization to finish before providing services during TiDB startup. |
TiDB | performance.lite-init-stats | Newly added | Controls whether to use lightweight statistics initialization during TiDB startup. |
TiDB | log.timeout | Newly added | Sets the timeout for log-writing operations in TiDB. In case of a disk failure that prevents logs from being written, this configuration item can trigger the TiDB process to panic instead of hang. The default value is 0, which means no timeout is set. |
TiKV | region-compact-min-redundant-rows | Newly added | Sets the number of redundant MVCC rows required to trigger RocksDB compaction. The default value is 50000. |
TiKV | region-compact-redundant-rows-percent | Newly added | Sets the percentage of redundant MVCC rows required to trigger RocksDB compaction. The default value is 20. |
TiKV | split.byte-threshold | Modified | Changes the default value from 30MiB to 100MiB when region-split-size is greater than or equal to 4 GB. |
TiKV | split.qps-threshold | Modified | Changes the default value from 3000 to 7000 when region-split-size is greater than or equal to 4 GB. |
TiKV | split.region-cpu-overload-threshold-ratio | Modified | Changes the default value from 0.25 to 0.75 when region-split-size is greater than or equal to 4 GB. |
TiKV | region-compact-check-step | Modified | Changes the default value from 100 to 5 when Partitioned Raft KV is enabled (storage.engine="partitioned-raft-kv"). |
PD | store-limit-version | Newly added | Controls the mode of store limit. Value options are "v1" and "v2". |
PD | schedule.enable-diagnostic | Modified | Changes the default value from false to true, meaning that the diagnostic feature of scheduler is enabled by default. |
TiFlash | http_port | Deleted | Deprecates the HTTP service port (default 8123). |
TiDB Lightning | tikv-importer.pause-pd-scheduler-scope | Newly added | Controls the scope in which TiDB Lightning pauses PD scheduling. The default value is "table" and value options are "global" and "table". |
TiDB Lightning | tikv-importer.region-check-backoff-limit | Newly added | Controls the number of retries to wait for the Region to come online after the split and scatter operations. The default value is 1800. The maximum retry interval is two seconds. The number of retries is not increased if any Region becomes online between retries. |
TiDB Lightning | tikv-importer.region-split-batch-size | Newly added | Controls the number of Regions when splitting Regions in a batch. The default value is 4096. |
TiDB Lightning | tikv-importer.region-split-concurrency | Newly added | Controls the concurrency when splitting Regions. The default value is the number of CPU cores. |
TiCDC | insecure-skip-verify | Newly added | Controls whether the authentication algorithm is set when TLS is enabled in the scenario of replicating data to Kafka. |
TiCDC | integrity.corruption-handle-level | Newly added | Specifies the log level of the Changefeed when the checksum validation for single-row data fails. The default value is "warn". Value options are "warn" and "error". |
TiCDC | integrity.integrity-check-level | Newly added | Controls whether to enable the checksum validation for single-row data. The default value is "none", which means to disable the feature. |
TiCDC | sink.only-output-updated-columns | Newly added | Controls whether to only output the updated columns. The default value is false. |
TiCDC | sink.enable-partition-separator | Modified | Changes the default value from false to true after further tests, meaning that partitions in a table are stored in separate directories by default. It is recommended that you keep the value as true to avoid the potential issue of data loss during replication of partitioned tables to storage services. |
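The retry semantics described for tikv-importer.region-check-backoff-limit can be sketched as follows. This is an illustration under stated assumptions, not TiDB Lightning's implementation: the function wait_regions_online, the count_pending callback, the 0.1-second initial interval, and the doubling of the interval are all hypothetical. Only the retry limit of 1800, the two-second cap, and the rule that a retry is not counted when a Region comes online are taken from the table above.

```python
import time
from typing import Callable


def wait_regions_online(
    count_pending: Callable[[], int],
    backoff_limit: int = 1800,
    max_interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Wait until no Region is pending, or until the retry budget is used up.

    count_pending is a hypothetical callback that returns how many Regions
    are still not online after the split and scatter operations.
    """
    retries = 0
    interval = 0.1  # assumed initial interval; the source only states the 2 s cap
    pending = count_pending()
    while pending > 0:
        if retries >= backoff_limit:
            return False
        sleep(interval)
        # Assumed exponential growth, capped at the documented maximum interval.
        interval = min(interval * 2, max_interval)
        now_pending = count_pending()
        if now_pending >= pending:
            # No Region came online since the last check, so this retry counts.
            retries += 1
        pending = now_pending
    return True
```

Passing a fake sleep function, as the signature allows, makes it possible to exercise the logic without actually waiting.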
Improvements
TiDB
- Display the number of distinct values for the corresponding column in the Cardinality column of the SHOW INDEX result #42227 @winoros
- Use SQL_NO_CACHE to prevent TTL Scan queries from impacting the TiKV block cache #43206 @lcwangchao
- Improve an error message related to MAX_EXECUTION_TIME to make it compatible with MySQL #43031 @dveeden
- Support using the MergeSort operator on partitioned tables in IndexLookUp #26166 @Defined2014
- Enhance caching_sha2_password to make it compatible with MySQL #43576 @asjdf
TiKV
- Reduce the impact of split operations on write QPS when using partitioned Raft KV #14447 @SpadeA-Tang
- Optimize the space occupied by snapshots when using partitioned Raft KV #14581 @bufferflies
- Provide more detailed time information for each stage of processing requests in TiKV #12362 @cfzjywxk
- Use PD as metastore in log backup #13867 @YuJuncen
PD
- Add a controller that automatically adjusts the size of the store limit based on the execution details of the snapshot. To enable this controller, set store-limit-version to v2 (experimental). Once enabled, you do not need to manually adjust the store limit configuration to control the speed of scaling in or scaling out #6147 @bufferflies
- Add historical load information to avoid frequent scheduling of Regions with unstable loads by the hotspot scheduler when the storage engine is raft-kv2 #6297 @bufferflies
- Add a leader health check mechanism. When the PD server where the etcd leader is located cannot be elected as the leader, PD actively switches the etcd leader to ensure that the PD leader is available #6403 @nolouch
TiFlash
- Improve TiFlash performance and stability in the disaggregated storage and compute architecture #6882 @JaySon-Huang @breezewish @JinheLin
- Support optimizing query performance in Semi Join or Anti Semi Join by selecting the smaller table as the build side #7280 @yibin87
- Improve performance of data import from BR and TiDB Lightning to TiFlash with default configurations #7272 @breezewish
Tools
Backup & Restore (BR)
TiCDC
- Optimize the directory structure when DDL events occur in the scenario of replicating data to object storage #8890 @CharlesCheung96
- Optimize the method of setting GC TTL for the upstream when the TiCDC replication task fails #8403 @charleszheng44
- Support replicating data to the Kafka-on-Pulsar downstream #8892 @hi-rustin
- Support replicating only the changed columns after an update occurs when using open-protocol to replicate data to Kafka #8706 @sdojjy
- Optimize the error handling of TiCDC in downstream failures and other scenarios #8657 @hicqu
- Add a configuration item insecure-skip-verify to control whether to set the authentication algorithm in the scenario of enabling TLS #8867 @hi-rustin
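The idea of replicating only the changed columns after an update can be sketched as a simple row diff. This is a hypothetical illustration: the function updated_columns and the dictionary row representation are assumptions, and the sketch does not reproduce the open-protocol message format or how TiCDC handles key columns.

```python
def updated_columns(old_row: dict, new_row: dict) -> dict:
    """Return only the columns whose values differ between two row images.

    old_row and new_row are hypothetical column-name-to-value mappings
    for the row before and after an UPDATE.
    """
    return {
        col: val
        for col, val in new_row.items()
        if col not in old_row or old_row[col] != val
    }


if __name__ == "__main__":
    before = {"id": 1, "name": "a", "age": 30}
    after = {"id": 1, "name": "b", "age": 30}
    print(updated_columns(before, after))
```

Emitting only the changed columns keeps messages small for wide tables, at the cost that a consumer can no longer rebuild the full row from a single message.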
TiDB Lightning
- Change the severity level of the precheck item related to uneven Region distribution from Critical to Warn to avoid blocking users from importing data #42836 @okJiang
- Add a retry mechanism when encountering an "unknown RPC" error during data import #43291 @D3Hunter
- Enhance the retry mechanism for Region jobs #43682 @lance6716
Bug fixes
TiDB
- Fix the issue that there is no prompt about manually executing ANALYZE TABLE after reorganizing partitions #42183 @CbcWestwolf
- Fix the issue of missing table names in the ADMIN SHOW DDL JOBS result when a DROP TABLE operation is being executed #42268 @tiancaiamao
- Fix the issue that Ignore Event Per Minute and Stats Cache LRU Cost charts might not be displayed normally in the Grafana monitoring panel #42562 @pingandb
- Fix the issue that the ORDINAL_POSITION column returns incorrect results when querying the INFORMATION_SCHEMA.COLUMNS table #43379 @bb7133
- Fix the case sensitivity issue in some columns of the permission table #41048 @bb7133
- Fix the issue that after a new column is added in the cache table, the value is NULL instead of the default value of the column #42928 @lqs
- Fix the issue that CTE results are incorrect when pushing down predicates #43645 @winoros
- Fix the issue of DDL retry caused by write conflict when executing TRUNCATE TABLE for partitioned tables with many partitions and TiFlash replicas #42940 @mjonss
- Fix the issue that there is no warning when using SUBPARTITION in creating partitioned tables #41198 #41200 @mjonss
- Fix the incompatibility issue with MySQL when dealing with value overflow issues in generated columns #40066 @jiyfhust
- Fix the issue that REORGANIZE PARTITION cannot be concurrently executed with other DDL operations #42442 @bb7133
- Fix the issue that canceling the partition reorganization task in DDL might cause subsequent DDL operations to fail #42448 @lcwangchao
- Fix the issue that assertions on delete operations are incorrect under certain conditions #42426 @tiancaiamao
- Fix the issue that TiDB server cannot start due to an error in reading the cgroup information with the error message "can't read file memory.stat from cgroup v1: open /sys/memory.stat no such file or directory" #42659 @hawkingrei
- Fix the Duplicate Key issue that occurs when updating the partition key of a row on a partitioned table with a global index #42312 @L-maple
- Fix the issue that the Scan Worker Time By Phase chart in the TTL monitoring panel does not display data #42515 @lcwangchao
- Fix the issue that some queries on partitioned tables with a global index return incorrect results #41991 #42065 @L-maple
- Fix the issue of displaying some error logs during the process of reorganizing a partitioned table #42180 @mjonss
- Fix the issue that the data length in the QUERY column of the INFORMATION_SCHEMA.DDL_JOBS table might exceed the column definition #42440 @tiancaiamao
- Fix the issue that the INFORMATION_SCHEMA.CLUSTER_HARDWARE table might display incorrect values in containers #42851 @hawkingrei
- Fix the issue that an incorrect result is returned when you query a partitioned table using ORDER BY + LIMIT #43158 @Defined2014
- Fix the issue of multiple DDL tasks running simultaneously using the ingest method #42903 @tangenta
- Fix the wrong value returned when querying a partitioned table using Limit #24636
- Fix the issue of displaying the incorrect TiDB address in IPv6 environment #43260 @nexustar
- Fix the issue of displaying incorrect values for system variables tidb_enable_tiflash_read_for_write_stmt and tidb_enable_exchange_partition #43281 @gengliqi
- Fix the issue that when tidb_scatter_region is enabled, Region does not automatically split after a partition is truncated #43174 #43028 @jiyfhust
- Add checks on the tables with generated columns and report errors for unsupported DDL operations on these columns #38988 #24321 @tiancaiamao
- Fix the issue that the error message is incorrect in certain type conversion errors #41730 @hawkingrei
- Fix the issue that after a TiDB node is shut down normally, DDL tasks triggered on this node are canceled #43854 @zimulala
- Fix the issue that when the PD member address changes, allocating ID for the AUTO_INCREMENT column will be blocked for a long time #42643 @tiancaiamao
- Fix the issue of reporting the "GC lifetime is shorter than transaction duration" error during DDL execution #40074 @tangenta
- Fix the issue that metadata locks unexpectedly block the DDL execution #43755 @wjhuang2016
- Fix the issue that the cluster cannot query some system views in IPv6 environment #43286 @Defined2014
- Fix the issue of not finding the partition during inner join in dynamic pruning mode #43686 @mjonss
- Fix the issue that TiDB reports syntax errors when analyzing tables #43392 @guo-shaoge
- Fix the issue that TiCDC might lose some row changes during table renaming #43338 @tangenta
- Fix the issue that TiDB server crashes when the client uses cursor reads #38116 @YangKeao
- Fix the issue that ADMIN SHOW DDL JOBS LIMIT returns incorrect results #42298 @CbcWestwolf
- Fix the TiDB panic issue that occurs when querying union views and temporary tables with UNION #42563 @lcwangchao
- Fix the issue that renaming tables does not take effect when committing multiple statements in a transaction #39664 @tiancaiamao
- Fix the incompatibility issue between the behavior of prepared plan cache and non-prepared plan cache during time conversion #42439 @qw4990
- Fix the wrong results caused by plan cache for Decimal type #43311 @qw4990
- Fix the TiDB panic issue in null-aware anti join (NAAJ) due to the wrong field type check #42459 @AilinKid
- Fix the issue that DML execution failures in pessimistic transactions at the RC isolation level might cause inconsistency between data and indexes #43294 @ekexium
- Fix the issue that in some extreme cases, when the first statement of a pessimistic transaction is retried, resolving locks on this transaction might affect transaction correctness #42937 @MyonKeminta
- Fix the issue that in some rare cases, residual pessimistic locks of pessimistic transactions might affect data correctness when GC resolves locks #43243 @MyonKeminta
- Fix the issue that the LOCK to PUT optimization leads to duplicate data being returned in specific queries #28011 @zyguan
- Fix the issue that when data is changed, the locking behavior of the unique index is not consistent with that when the data is unchanged #36438 @zyguan
TiKV
- Fix the issue that when you enable tidb_pessimistic_txn_fair_locking, in some extreme cases, expired requests caused by failed RPC retries might affect data correctness during the resolve lock operation #14551 @MyonKeminta
- Fix the issue that when you enable tidb_pessimistic_txn_fair_locking, in some extreme cases, expired requests caused by failed RPC retries might cause transaction conflicts to be ignored, thus affecting transaction consistency #14311 @MyonKeminta
- Fix the issue that encryption key ID conflict might cause the deletion of the old keys #14585 @tabokie
- Fix the performance degradation issue caused by accumulated lock records when a cluster is upgraded from a previous version to v6.5 or later versions #14780 @MyonKeminta
- Fix the issue that the "raft entry is too large" error occurs during the PITR recovery process #14313 @YuJuncen
- Fix the issue that TiKV panics during the PITR recovery process due to log_batch exceeding 2 GB #13848 @YuJuncen
PD
- Fix the issue that the number of low space store in the PD monitoring panel is abnormal after TiKV panics #6252 @HuSharp
- Fix the issue that Region Health monitoring data is deleted after PD leader switch #6366 @iosmanthus
- Fix the issue that the rule checker cannot repair unhealthy Regions with the schedule=deny label #6426 @nolouch
- Fix the issue that some existing labels are lost after TiKV or TiFlash restarts #6467 @JmPotato
- Fix the issue that the replication status cannot be switched when there are learner nodes in the replication mode #14704 @nolouch
TiFlash
- Fix the issue that querying data in the TIMESTAMP or TIME type returns errors after enabling late materialization #7455 @Lloyd-Pottiger
- Fix the issue that large update transactions might cause TiFlash to repeatedly report errors and restart #7316 @JaySon-Huang
Tools
Backup & Restore (BR)
TiCDC
- Fix the issue of TiCDC time zone setting #8798 @hi-rustin
- Fix the issue that TiCDC cannot automatically recover when PD address or leader fails #8812 #8877 @asddongmen
- Fix the issue that checkpoint lag increases when one of the upstream TiKV nodes crashes #8858 @hicqu
- Fix the issue that when replicating data to object storage, the EXCHANGE PARTITION operation in the upstream cannot be properly replicated to the downstream #8914 @CharlesCheung96
- Fix the OOM issue caused by excessive memory usage of the sorter component in some special scenarios #8974 @hicqu
- Fix the TiCDC node panic that occurs when the downstream Kafka sinks are rolling restarted #9023 @asddongmen
TiDB Data Migration (DM)
TiDB Dumpling
TiDB Binlog
TiDB Lightning
- Fix the performance degradation issue during data import #42456 @lance6716
- Fix the issue of "write to tikv with no leader returned" when importing a large amount of data #43055 @lance6716
- Fix the issue of excessive "keys within region is empty, skip doIngest" logs during data import #43197 @D3Hunter
- Fix the issue that panic might occur during partial write #43363 @lance6716
- Fix the issue that OOM might occur when importing a wide table #43728 @D3Hunter
- Fix the issue of missing data in the TiDB Lightning Grafana dashboard #43357 @lichunzhu
- Fix the import failure due to incorrect setting of keyspace-name #43684 @zeminzhou
- Fix the issue that data import might be skipped during range partial write in some cases #43768 @lance6716
Performance test
To learn about the performance of TiDB v7.1.0, you can refer to the TPC-C performance test report and Sysbench performance test report of the TiDB Cloud Dedicated cluster.
Contributors
We would like to thank the following contributors from the TiDB community:
- blacktear23
- ethercflow
- hihihuhu
- jiyfhust
- L-maple
- lqs
- pingandb
- yorkhellen
- yujiarista (First-time contributor)