INSPECTION_SUMMARY
In some scenarios, you might need to pay attention only to the monitoring summary of specific links or modules. For example, the number of threads for Coprocessor in the thread pool is configured as 8. If the CPU usage of Coprocessor reaches 750%, you can determine that a risk exists and Coprocessor might become a bottleneck in advance. However, some monitoring metrics vary greatly due to different user workloads, so it is difficult to define specific thresholds. It is important to troubleshoot issues in this scenario, so TiDB provides the inspection_summary
table for link summary.
The structure of the information_schema.inspection_summary
inspection summary table is as follows:
USE information_schema;
DESC inspection_summary;
+--------------+--------------+------+------+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------------+--------------+------+------+---------+-------+
| RULE | varchar(64) | YES | | NULL | |
| INSTANCE | varchar(64) | YES | | NULL | |
| METRICS_NAME | varchar(64) | YES | | NULL | |
| LABEL | varchar(64) | YES | | NULL | |
| QUANTILE | double | YES | | NULL | |
| AVG_VALUE | double(22,6) | YES | | NULL | |
| MIN_VALUE | double(22,6) | YES | | NULL | |
| MAX_VALUE | double(22,6) | YES | | NULL | |
| COMMENT | varchar(256) | YES | | NULL | |
+--------------+--------------+------+------+---------+-------+
9 rows in set (0.00 sec)
Field description:
RULE
: Summary rules. Because new rules are being added continuously, you can execute theselect * from inspection_rules where type='summary'
statement to query the latest rule list.INSTANCE
: The monitored instance.METRICS_NAME
: The monitoring metrics name.QUANTILE
: Takes effect on monitoring tables that containQUANTILE
. You can specify multiple percentiles by pushing down predicates. For example, you can executeselect * from inspection_summary where rule='ddl' and quantile in (0.80, 0.90, 0.99, 0.999)
to summarize the DDL-related monitoring metrics and query the P80/P90/P99/P999 results.AVG_VALUE
,MIN_VALUE
, andMAX_VALUE
respectively indicate the average value, minimum value, and maximum value of the aggregation.COMMENT
: The comment about the corresponding monitoring metric.
Usage example:
Both the diagnostic result table and the diagnostic monitoring summary table can specify the diagnostic time range using hint
. select /*+ time_range('2020-03-07 12:00:00','2020-03-07 13:00:00') */* from inspection_summary
is the monitoring summary for the 2020-03-07 12:00:00
to 2020-03-07 13:00:00
period. Like the monitoring summary table, you can use the inspection_summary
table to quickly find the monitoring items with large differences by comparing the data of two different periods.
The following example compares the monitoring metrics of read links in two time periods:
(2020-01-16 16:00:54.933, 2020-01-16 16:10:54.933)
(2020-01-16 16:10:54.933, 2020-01-16 16:20:54.933)
SELECT
t1.avg_value / t2.avg_value AS ratio,
t1.*,
t2.*
FROM
(
SELECT
/*+ time_range("2020-01-16 16:00:54.933", "2020-01-16 16:10:54.933")*/ *
FROM information_schema.inspection_summary WHERE rule='read-link'
) t1
JOIN
(
SELECT
/*+ time_range("2020-01-16 16:10:54.933", "2020-01-16 16:20:54.933")*/ *
FROM information_schema.inspection_summary WHERE rule='read-link'
) t2
ON t1.metrics_name = t2.metrics_name
and t1.instance = t2.instance
and t1.label = t2.label
ORDER BY
ratio DESC;