TiFlash Alert Rules

This document introduces the alert rules of the TiFlash cluster.

TiFlash_schema_error

  • Alert rule:

    increase(tiflash_schema_apply_count{type="failed"}[15m]) > 0

  • Description:

    This alert is triggered when at least one schema apply failure is recorded within the last 15 minutes.

  • Solution:

    The error might be caused by an internal logic error. Get support from PingCAP or the community.
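
As a rough illustration of what the `increase(...[15m]) > 0` condition in TiFlash_schema_error checks, the following Python sketch computes the growth of a counter over a window of samples. It is a simplification: Prometheus also extrapolates to the window boundaries, which is omitted here, and the function name and sample values are hypothetical.

```python
def counter_increase(samples):
    """Approximate the growth of a monotonic counter over a window.

    `samples` is a list of raw counter values ordered by time. A drop in
    value is treated as a counter reset (for example, a process restart),
    in which case the counter is assumed to have restarted from zero.
    """
    total = 0.0
    for prev, cur in zip(samples, samples[1:]):
        total += cur - prev if cur >= prev else cur
    return total


# Hypothetical samples of tiflash_schema_apply_count{type="failed"}
# collected within one 15-minute window.
no_failures = [3, 3, 3, 3]
one_failure = [3, 3, 4, 4]

print(counter_increase(no_failures) > 0)  # False: the alert condition is not met
print(counter_increase(one_failure) > 0)  # True: the alert condition is met
```

Because the condition is `> 0`, a single new failure inside the window is enough to satisfy it.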

TiFlash_schema_apply_duration

  • Alert rule:

    histogram_quantile(0.99, sum(rate(tiflash_schema_apply_duration_seconds_bucket[1m])) BY (le, instance)) > 20

  • Description:

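    This alert is triggered when the 99th percentile of the schema apply duration, computed per instance over a 1-minute window, exceeds 20 seconds. In other words, roughly more than 1% of apply operations take longer than 20 seconds.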

  • Solution:

    It might be caused by internal problems in the TiFlash storage engine. Get support from PingCAP or the community.
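
The duration rules in this document all compare a 99th percentile, estimated with `histogram_quantile`, against a threshold. The following Python sketch shows how such an estimate can be derived from cumulative histogram buckets by linear interpolation. It is a simplified model of the Prometheus behavior for classic histograms, not its actual implementation, and the bucket boundaries and counts are hypothetical; the real buckets of `tiflash_schema_apply_duration_seconds` may differ.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    `buckets` is a list of (upper_bound, cumulative_count) pairs sorted by
    upper bound. It must contain at least one finite bucket and end with
    the +Inf bucket.
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    # Find the first bucket whose cumulative count reaches the rank.
    for i, (upper, cumulative) in enumerate(buckets):
        if cumulative >= rank:
            break
    if math.isinf(upper):
        # The quantile falls into the +Inf bucket: report the highest finite bound.
        return buckets[-2][0]
    lower = buckets[i - 1][0] if i > 0 else 0.0
    below = buckets[i - 1][1] if i > 0 else 0.0
    # Interpolate linearly inside the selected bucket.
    return lower + (upper - lower) * (rank - below) / (cumulative - below)


# Hypothetical per-bucket rates for one instance (upper bounds in seconds).
slow = [(1.0, 90.0), (5.0, 95.0), (20.0, 98.0), (60.0, 100.0), (math.inf, 100.0)]
healthy = [(1.0, 90.0), (5.0, 95.0), (20.0, 99.5), (60.0, 100.0), (math.inf, 100.0)]

print(histogram_quantile(0.99, slow) > 20)     # True: 2% of applies exceed 20 s
print(histogram_quantile(0.99, healthy) > 20)  # False: only 0.5% exceed 20 s
```

The estimate is only as precise as the bucket layout, so a reported value near the threshold should be read as approximate.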

TiFlash_raft_read_index_duration

  • Alert rule:

    histogram_quantile(0.99, sum(rate(tiflash_raft_read_index_duration_seconds_bucket[1m])) BY (le, instance)) > 3

  • Description:

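    This alert is triggered when the 99th percentile of the read index duration, computed per instance over a 1-minute window, exceeds 3 seconds. In other words, roughly more than 1% of read index requests take longer than 3 seconds.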

  • Solution:

    Slow read index requests might be caused by frequent retries, which in turn can result from frequent Region splitting or migration in the TiKV cluster. Check the TiKV cluster status to identify the reason for the retries.

TiFlash_raft_wait_index_duration

  • Alert rule:

    histogram_quantile(0.99, sum(rate(tiflash_raft_wait_index_duration_seconds_bucket[1m])) BY (le, instance)) > 2

  • Description:

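    This alert is triggered when the 99th percentile of the waiting time for the Region Raft index in TiFlash, computed per instance over a 1-minute window, exceeds 2 seconds. In other words, roughly more than 1% of waits take longer than 2 seconds.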

  • Solution:

    It might be caused by a communication error between TiKV and the proxy. Get support from PingCAP or the community.
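
For quick reference, the thresholds of the four rules in this document can be collected in a small lookup table. The Python sketch below is only an illustration; the function name and the observed values are hypothetical, and in practice Prometheus evaluates the rules itself.

```python
# Thresholds taken from the alert rules in this document. The three duration
# thresholds are 99th-percentile values in seconds; the schema error
# threshold is a failure count over 15 minutes.
THRESHOLDS = {
    "TiFlash_schema_error": 0,
    "TiFlash_schema_apply_duration": 20,
    "TiFlash_raft_read_index_duration": 3,
    "TiFlash_raft_wait_index_duration": 2,
}


def firing_alerts(observed):
    """Return the names of the rules whose observed value exceeds its threshold.

    Rules missing from `observed` are treated as having a value of 0.
    """
    return [name for name, limit in THRESHOLDS.items()
            if observed.get(name, 0) > limit]


# Hypothetical observations for a single TiFlash instance.
observed = {
    "TiFlash_raft_read_index_duration": 4.2,
    "TiFlash_raft_wait_index_duration": 1.1,
}
print(firing_alerts(observed))
```

With these sample values, only the read index rule exceeds its threshold.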
