- 关于 TiDB
- 快速上手
- 部署标准集群
- 数据迁移
- 运维操作
- 监控与告警
- 故障诊断
- 性能调优
- 系统调优
- 软件调优
- SQL 性能调优
- SQL 性能调优概览
- 理解 TiDB 执行计划
- SQL 优化流程
- 控制执行计划
- 教程
- TiDB 工具
- 功能概览
- 适用场景
- 工具下载
- TiUP
- TiDB Operator
- Dumpling
- TiDB Lightning
- TiDB Data Migration
- Backup & Restore (BR)
- TiDB Binlog
- TiCDC
- sync-diff-inspector
- TiSpark
- 参考指南
- 架构
- 监控指标
- 安全加固
- 权限
- SQL
- SQL 语言结构和语法
- SQL 语句
ADD COLUMN
ADD INDEX
ADMIN
ADMIN CANCEL DDL
ADMIN CHECKSUM TABLE
ADMIN CHECK [TABLE|INDEX]
ADMIN SHOW DDL [JOBS|QUERIES]
ALTER DATABASE
ALTER INDEX
ALTER INSTANCE
ALTER TABLE
ALTER USER
ANALYZE TABLE
BACKUP
BEGIN
CHANGE COLUMN
CHANGE DRAINER
CHANGE PUMP
COMMIT
CREATE [GLOBAL|SESSION] BINDING
CREATE DATABASE
CREATE INDEX
CREATE ROLE
CREATE SEQUENCE
CREATE TABLE LIKE
CREATE TABLE
CREATE USER
CREATE VIEW
DEALLOCATE
DELETE
DESC
DESCRIBE
DO
DROP [GLOBAL|SESSION] BINDING
DROP COLUMN
DROP DATABASE
DROP INDEX
DROP ROLE
DROP SEQUENCE
DROP STATS
DROP TABLE
DROP USER
DROP VIEW
EXECUTE
EXPLAIN ANALYZE
EXPLAIN
FLASHBACK TABLE
FLUSH PRIVILEGES
FLUSH STATUS
FLUSH TABLES
GRANT <privileges>
GRANT <role>
INSERT
KILL [TIDB]
LOAD DATA
LOAD STATS
MODIFY COLUMN
PREPARE
RECOVER TABLE
RENAME INDEX
RENAME TABLE
REPLACE
RESTORE
REVOKE <privileges>
REVOKE <role>
ROLLBACK
SELECT
SET DEFAULT ROLE
SET [NAMES|CHARACTER SET]
SET PASSWORD
SET ROLE
SET TRANSACTION
SET [GLOBAL|SESSION] <variable>
SHOW [BACKUPS|RESTORES]
SHOW ANALYZE STATUS
SHOW [GLOBAL|SESSION] BINDINGS
SHOW BUILTINS
SHOW CHARACTER SET
SHOW COLLATION
SHOW [FULL] COLUMNS FROM
SHOW CONFIG
SHOW CREATE SEQUENCE
SHOW CREATE TABLE
SHOW CREATE USER
SHOW DATABASES
SHOW DRAINER STATUS
SHOW ENGINES
SHOW ERRORS
SHOW [FULL] FIELDS FROM
SHOW GRANTS
SHOW INDEX [FROM|IN]
SHOW INDEXES [FROM|IN]
SHOW KEYS [FROM|IN]
SHOW MASTER STATUS
SHOW PLUGINS
SHOW PRIVILEGES
SHOW [FULL] PROCESSSLIST
SHOW PROFILES
SHOW PUMP STATUS
SHOW SCHEMAS
SHOW STATS_HEALTHY
SHOW STATS_HISTOGRAMS
SHOW STATS_META
SHOW STATUS
SHOW TABLE NEXT_ROW_ID
SHOW TABLE REGIONS
SHOW TABLE STATUS
SHOW [FULL] TABLES
SHOW [GLOBAL|SESSION] VARIABLES
SHOW WARNINGS
SHUTDOWN
SPLIT REGION
START TRANSACTION
TABLE
TRACE
TRUNCATE
UPDATE
USE
WITH
- 数据类型
- 函数与操作符
- 聚簇索引
- 约束
- 生成列
- SQL 模式
- 事务
- 垃圾回收 (GC)
- 视图
- 分区表
- 字符集和排序规则
- 系统表
mysql
- INFORMATION_SCHEMA
- Overview
ANALYZE_STATUS
CLIENT_ERRORS_SUMMARY_BY_HOST
CLIENT_ERRORS_SUMMARY_BY_USER
CLIENT_ERRORS_SUMMARY_GLOBAL
CHARACTER_SETS
CLUSTER_CONFIG
CLUSTER_HARDWARE
CLUSTER_INFO
CLUSTER_LOAD
CLUSTER_LOG
CLUSTER_SYSTEMINFO
COLLATIONS
COLLATION_CHARACTER_SET_APPLICABILITY
COLUMNS
DATA_LOCK_WAITS
DDL_JOBS
DEADLOCKS
ENGINES
INSPECTION_RESULT
INSPECTION_RULES
INSPECTION_SUMMARY
KEY_COLUMN_USAGE
METRICS_SUMMARY
METRICS_TABLES
PARTITIONS
PROCESSLIST
SCHEMATA
SEQUENCES
SESSION_VARIABLES
SLOW_QUERY
STATISTICS
TABLES
TABLE_CONSTRAINTS
TABLE_STORAGE_STATS
TIDB_HOT_REGIONS
TIDB_INDEXES
TIDB_SERVERS_INFO
TIDB_TRX
TIFLASH_REPLICA
TIKV_REGION_PEERS
TIKV_REGION_STATUS
TIKV_STORE_STATUS
USER_PRIVILEGES
VIEWS
METRICS_SCHEMA
- UI
- CLI
- 命令行参数
- 配置文件参数
- 系统变量
- 存储引擎
- 遥测
- 错误码
- 通过拓扑 label 进行副本调度
- 常见问题解答 (FAQ)
- 版本发布历史
- 术语表
使用 EXPLAIN
解读执行计划
SQL 是一种声明性语言,因此用户无法根据 SQL 语句直接判断一条查询的执行是否有效率。用户首先要使用 EXPLAIN
语句查看当前的执行计划。
以 bikeshare 数据库示例(英文) 中的一个 SQL 语句为例,该语句统计了 2017 年 7 月 1 日的行程次数:
EXPLAIN SELECT count(*) FROM trips WHERE start_date BETWEEN '2017-07-01 00:00:00' AND '2017-07-01 23:59:59';
+------------------------------+----------+-----------+---------------+------------------------------------------------------------------------------------------------------------------------+
| id | estRows | task | access object | operator info |
+------------------------------+----------+-----------+---------------+------------------------------------------------------------------------------------------------------------------------+
| StreamAgg_20 | 1.00 | root | | funcs:count(Column#13)->Column#11 |
| └─TableReader_21 | 1.00 | root | | data:StreamAgg_9 |
| └─StreamAgg_9 | 1.00 | cop[tikv] | | funcs:count(1)->Column#13 |
| └─Selection_19 | 250.00 | cop[tikv] | | ge(bikeshare.trips.start_date, 2017-07-01 00:00:00.000000), le(bikeshare.trips.start_date, 2017-07-01 23:59:59.000000) |
| └─TableFullScan_18 | 10000.00 | cop[tikv] | table:trips | keep order:false, stats:pseudo |
+------------------------------+----------+-----------+---------------+------------------------------------------------------------------------------------------------------------------------+
5 rows in set (0.00 sec)
以上是该查询的执行计划结果。从 └─TableFullScan_18
算子开始向上看,查询的执行过程如下(非最佳执行计划):
Coprocessor (TiKV) 读取整张
trips
表的数据,作为一次TableFullScan
操作,再将读取到的数据传递给Selection_19
算子。Selection_19
算子仍在 TiKV 内。Selection_19
算子根据谓词WHERE start_date BETWEEN ..
进行数据过滤。预计大约有 250 行数据满足该过滤条件(基于统计信息以及算子的执行逻辑估算而来)。└─TableFullScan_18
算子显示stats:pseudo
,表示该表没有实际统计信息,执行ANALYZE TABLE trips
收集统计信息后,预计的估算的数字会更加准确。COUNT
函数随后应用于满足过滤条件的行,这一过程也是在 TiKV (cop[tikv]
) 中的StreamAgg_9
算子内完成的。TiKV coprocessor 能执行一些 MySQL 内置函数,COUNT
是其中之一。StreamAgg_9
算子执行的结果会被传递给TableReader_21
算子(位于 TiDB 进程中,即root
任务)。执行计划中,TableReader_21
算子的estRows
为1
,表示该算子将从每个访问的 TiKV Region 接收一行数据。这一请求过程的详情,可参阅EXPLAIN ANALYZE
。StreamAgg_20
算子随后对└─TableReader_21
算子传来的每行数据计算COUNT
函数的结果。StreamAgg_20
是根算子,会将结果返回给客户端。
要查看 TiDB 中某张表的 Region 信息,可执行 SHOW TABLE REGIONS
语句。
评估当前的性能
EXPLAIN
语句只返回查询的执行计划,并不执行该查询。若要获取实际的执行时间,可执行该查询,或使用 EXPLAIN ANALYZE
语句:
EXPLAIN ANALYZE SELECT count(*) FROM trips WHERE start_date BETWEEN '2017-07-01 00:00:00' AND '2017-07-01 23:59:59';
+------------------------------+----------+----------+-----------+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------+-----------+------+
| id | estRows | actRows | task | access object | execution info | operator info | memory | disk |
+------------------------------+----------+----------+-----------+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------+-----------+------+
| StreamAgg_20 | 1.00 | 1 | root | | time:1.031417203s, loops:2 | funcs:count(Column#13)->Column#11 | 632 Bytes | N/A |
| └─TableReader_21 | 1.00 | 56 | root | | time:1.031408123s, loops:2, cop_task: {num: 56, max: 782.147269ms, min: 5.759953ms, avg: 252.005927ms, p95: 609.294603ms, max_proc_keys: 910371, p95_proc_keys: 704775, tot_proc: 11.524s, tot_wait: 580ms, rpc_num: 56, rpc_time: 14.111932641s} | data:StreamAgg_9 | 328 Bytes | N/A |
| └─StreamAgg_9 | 1.00 | 56 | cop[tikv] | | proc max:640ms, min:8ms, p80:276ms, p95:480ms, iters:18695, tasks:56 | funcs:count(1)->Column#13 | N/A | N/A |
| └─Selection_19 | 250.00 | 11409 | cop[tikv] | | proc max:640ms, min:8ms, p80:276ms, p95:476ms, iters:18695, tasks:56 | ge(bikeshare.trips.start_date, 2017-07-01 00:00:00.000000), le(bikeshare.trips.start_date, 2017-07-01 23:59:59.000000) | N/A | N/A |
| └─TableFullScan_18 | 10000.00 | 19117643 | cop[tikv] | table:trips | proc max:612ms, min:8ms, p80:248ms, p95:460ms, iters:18695, tasks:56 | keep order:false, stats:pseudo | N/A | N/A |
+------------------------------+----------+----------+-----------+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------+-----------+------+
5 rows in set (1.03 sec)
执行以上示例查询耗时 1.03
秒,说明执行性能较为理想。
以上 EXPLAIN ANALYZE
的结果中,actRows
表明一些 estRows
预估数不准确(预估返回 10000 行数据但实际返回 19117643 行)。└─TableFullScan_18
算子的 operator info
列 (stats:pseudo
) 信息也表明该算子的预估数不准确。
如果先执行 ANALYZE TABLE
再执行 EXPLAIN ANALYZE
,预估数与实际数会更接近:
ANALYZE TABLE trips;
EXPLAIN ANALYZE SELECT count(*) FROM trips WHERE start_date BETWEEN '2017-07-01 00:00:00' AND '2017-07-01 23:59:59';
Query OK, 0 rows affected (10.22 sec)
+------------------------------+-------------+----------+-----------+---------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------+-----------+------+
| id | estRows | actRows | task | access object | execution info | operator info | memory | disk |
+------------------------------+-------------+----------+-----------+---------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------+-----------+------+
| StreamAgg_20 | 1.00 | 1 | root | | time:926.393612ms, loops:2 | funcs:count(Column#13)->Column#11 | 632 Bytes | N/A |
| └─TableReader_21 | 1.00 | 56 | root | | time:926.384792ms, loops:2, cop_task: {num: 56, max: 850.94424ms, min: 6.042079ms, avg: 234.987725ms, p95: 495.474806ms, max_proc_keys: 910371, p95_proc_keys: 704775, tot_proc: 10.656s, tot_wait: 904ms, rpc_num: 56, rpc_time: 13.158911952s} | data:StreamAgg_9 | 328 Bytes | N/A |
| └─StreamAgg_9 | 1.00 | 56 | cop[tikv] | | proc max:592ms, min:4ms, p80:244ms, p95:480ms, iters:18695, tasks:56 | funcs:count(1)->Column#13 | N/A | N/A |
| └─Selection_19 | 432.89 | 11409 | cop[tikv] | | proc max:592ms, min:4ms, p80:244ms, p95:480ms, iters:18695, tasks:56 | ge(bikeshare.trips.start_date, 2017-07-01 00:00:00.000000), le(bikeshare.trips.start_date, 2017-07-01 23:59:59.000000) | N/A | N/A |
| └─TableFullScan_18 | 19117643.00 | 19117643 | cop[tikv] | table:trips | proc max:564ms, min:4ms, p80:228ms, p95:456ms, iters:18695, tasks:56 | keep order:false | N/A | N/A |
+------------------------------+-------------+----------+-----------+---------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------+-----------+------+
5 rows in set (0.93 sec)
执行 ANALYZE TABLE
后,可以看到 └─TableFullScan_18
算子的预估行数是准确的,└─Selection_19
算子的预估行数也更接近实际行数。以上两个示例中的执行计划(即 TiDB 执行查询所使用的一组算子)未改变,但过时的统计信息常常导致 TiDB 选择到非最优的执行计划。
除 ANALYZE TABLE
外,达到 tidb_auto_analyze_ratio
阈值后,TiDB 会自动在后台重新生成统计数据。若要查看 TiDB 有多接近该阈值(即 TiDB 判断统计数据有多健康),可执行 SHOW STATS_HEALTHY
语句。
SHOW STATS_HEALTHY;
+-----------+------------+----------------+---------+
| Db_name | Table_name | Partition_name | Healthy |
+-----------+------------+----------------+---------+
| bikeshare | trips | | 100 |
+-----------+------------+----------------+---------+
1 row in set (0.00 sec)
确定优化方案
当前执行计划是有效率的:
大部分任务是在 TiKV 内处理的,需要通过网络传输给 TiDB 处理的仅有 56 行数据,每行都满足过滤条件,而且都很短。
在 TiDB (
StreamAgg_20
) 中和在 TiKV (└─StreamAgg_9
) 中汇总行数都使用了 Stream Aggregate,该算法在内存使用方面很有效率。
当前执行计划存在的最大问题在于谓词 start_date BETWEEN '2017-07-01 00:00:00' AND '2017-07-01 23:59:59'
并未立即生效,先是 TableFullScan
算子读取所有行数据,然后才进行过滤选择。可以在 SHOW CREATE TABLE trips
的返回结果中找出问题原因:
SHOW CREATE TABLE trips\G
*************************** 1. row ***************************
Table: trips
Create Table: CREATE TABLE `trips` (
`trip_id` bigint(20) NOT NULL AUTO_INCREMENT,
`duration` int(11) NOT NULL,
`start_date` datetime DEFAULT NULL,
`end_date` datetime DEFAULT NULL,
`start_station_number` int(11) DEFAULT NULL,
`start_station` varchar(255) DEFAULT NULL,
`end_station_number` int(11) DEFAULT NULL,
`end_station` varchar(255) DEFAULT NULL,
`bike_number` varchar(255) DEFAULT NULL,
`member_type` varchar(255) DEFAULT NULL,
PRIMARY KEY (`trip_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin AUTO_INCREMENT=20477318
1 row in set (0.00 sec)
以上返回结果显示,start_date
列没有索引。要将该谓词下推到 index reader 算子,还需要一个索引。添加索引如下:
ALTER TABLE trips ADD INDEX (start_date);
Query OK, 0 rows affected (2 min 10.23 sec)
你可通过执行 ADMIN SHOW DDL JOBS
语句来查看 DDL 任务的进度。TiDB 中的默认值的设置较为保守,因此添加索引不会对生产环境下的负载造成太大影响。测试环境下,可以考虑调大 tidb_ddl_reorg_batch_size
和 tidb_ddl_reorg_worker_cnt
的值。在参照系统上,将批处理大小设为 10240
,将 worker count 并发度设置为 32
,该系统可获得 10 倍的性能提升(较之使用默认值)。
添加索引后,可以使用 EXPLAIN
重复该查询。在以下返回结果中,可见 TiDB 选择了新的执行计划,而且不再使用 TableFullScan
和 Selection
算子。
EXPLAIN SELECT count(*) FROM trips WHERE start_date BETWEEN '2017-07-01 00:00:00' AND '2017-07-01 23:59:59';
+-----------------------------+---------+-----------+-------------------------------------------+-------------------------------------------------------------------+
| id | estRows | task | access object | operator info |
+-----------------------------+---------+-----------+-------------------------------------------+-------------------------------------------------------------------+
| StreamAgg_17 | 1.00 | root | | funcs:count(Column#13)->Column#11 |
| └─IndexReader_18 | 1.00 | root | | index:StreamAgg_9 |
| └─StreamAgg_9 | 1.00 | cop[tikv] | | funcs:count(1)->Column#13 |
| └─IndexRangeScan_16 | 8471.88 | cop[tikv] | table:trips, index:start_date(start_date) | range:[2017-07-01 00:00:00,2017-07-01 23:59:59], keep order:false |
+-----------------------------+---------+-----------+-------------------------------------------+-------------------------------------------------------------------+
4 rows in set (0.00 sec)
若要比较实际的执行时间,可再次使用 EXPLAIN ANALYZE
语句:
EXPLAIN ANALYZE SELECT count(*) FROM trips WHERE start_date BETWEEN '2017-07-01 00:00:00' AND '2017-07-01 23:59:59';
+-----------------------------+---------+---------+-----------+-------------------------------------------+------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------+-----------+------+
| id | estRows | actRows | task | access object | execution info | operator info | memory | disk |
+-----------------------------+---------+---------+-----------+-------------------------------------------+------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------+-----------+------+
| StreamAgg_17 | 1.00 | 1 | root | | time:4.516728ms, loops:2 | funcs:count(Column#13)->Column#11 | 372 Bytes | N/A |
| └─IndexReader_18 | 1.00 | 1 | root | | time:4.514278ms, loops:2, cop_task: {num: 1, max:4.462288ms, proc_keys: 11409, rpc_num: 1, rpc_time: 4.457148ms} | index:StreamAgg_9 | 238 Bytes | N/A |
| └─StreamAgg_9 | 1.00 | 1 | cop[tikv] | | time:4ms, loops:12 | funcs:count(1)->Column#13 | N/A | N/A |
| └─IndexRangeScan_16 | 8471.88 | 11409 | cop[tikv] | table:trips, index:start_date(start_date) | time:4ms, loops:12 | range:[2017-07-01 00:00:00,2017-07-01 23:59:59], keep order:false | N/A | N/A |
+-----------------------------+---------+---------+-----------+-------------------------------------------+------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------+-----------+------+
4 rows in set (0.00 sec)
从以上结果可看出,查询时间已从 1.03 秒减少到 0.0 秒。
以上示例另一个可用的优化方案是 coprocessor cache。如果你无法添加索引,可考虑开启 coprocessor cache 功能。开启后,只要算子上次执行以来 Region 未作更改,TiKV 将从缓存中返回值。这也有助于减少 TableFullScan
和 Selection
算子的大部分运算成本。