Use an HTAP Cluster

HTAP stands for Hybrid Transactional/Analytical Processing. An HTAP cluster in TiDB Cloud consists of TiKV, a row-based storage engine designed for transactional processing, and TiFlash (beta), a columnar storage engine designed for analytical processing. Your application data is first stored in TiKV and then replicated to TiFlash via the Raft consensus algorithm, so data is replicated from the row store to the columnar store in real time.

With TiDB Cloud, you can create an HTAP cluster easily by specifying one or more TiFlash nodes according to your HTAP workload. If you did not specify the TiFlash node count when you created the cluster, or if you want to add more TiFlash nodes, you can change the node count by scaling the cluster.

Note

A Developer Tier cluster has one TiFlash node by default, and you cannot change this number.

TiKV data is not replicated to TiFlash by default. You can select which tables to replicate to TiFlash using the following SQL statement:

ALTER TABLE table_name SET TIFLASH REPLICA 1;

The replica count must be no larger than the number of TiFlash nodes. Setting the replica count to 0 deletes the replica in TiFlash.
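As an illustration, the statements below raise and then remove the replica count for one table. The table name test.t is hypothetical, and the sketch assumes a cluster with at least three TiFlash nodes so that a count of 2 is valid.

```sql
-- Hypothetical table test.t; assumes the cluster has at least three TiFlash nodes.
ALTER TABLE test.t SET TIFLASH REPLICA 2;

-- Setting the count to 0 removes the TiFlash replica of the table.
ALTER TABLE test.t SET TIFLASH REPLICA 0;
```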

To check the replication progress, use the following statement:

SELECT * FROM information_schema.tiflash_replica WHERE TABLE_SCHEMA = '<db_name>' and TABLE_NAME = '<table_name>';
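To make the result easier to read, a narrower query can select only the status columns. This is a minimal sketch that assumes a hypothetical table test.t. In information_schema.tiflash_replica, AVAILABLE becomes 1 once the replica can serve queries, and PROGRESS reports a value between 0 and 1.

```sql
-- Hypothetical schema and table names; substitute your own.
SELECT TABLE_SCHEMA, TABLE_NAME, REPLICA_COUNT, AVAILABLE, PROGRESS
FROM information_schema.tiflash_replica
WHERE TABLE_SCHEMA = 'test' AND TABLE_NAME = 't';
```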

Use TiDB to read TiFlash replicas

After data is replicated to TiFlash, you can read TiFlash replicas in one of the following three ways to accelerate your analytical queries.

Smart selection

For tables with TiFlash replicas, the TiDB optimizer automatically determines whether to use TiFlash replicas based on cost estimation. For example:

explain analyze select count(*) from test.t;
+--------------------------+---------+---------+--------------+---------------+----------------------------------------------------------------------+--------------------------------+-----------+------+
| id                       | estRows | actRows | task         | access object | execution info                                                       | operator info                  | memory    | disk |
+--------------------------+---------+---------+--------------+---------------+----------------------------------------------------------------------+--------------------------------+-----------+------+
| StreamAgg_9              | 1.00    | 1       | root         |               | time:83.8372ms, loops:2                                              | funcs:count(1)->Column#4       | 372 Bytes | N/A  |
| └─TableReader_17         | 1.00    | 1       | root         |               | time:83.7776ms, loops:2, rpc num: 1, rpc time:83.5701ms, proc keys:0 | data:TableFullScan_16          | 152 Bytes | N/A  |
|   └─TableFullScan_16     | 1.00    | 1       | cop[tiflash] | table:t       | time:43ms, loops:1                                                   | keep order:false, stats:pseudo | N/A       | N/A  |
+--------------------------+---------+---------+--------------+---------------+----------------------------------------------------------------------+--------------------------------+-----------+------+

cop[tiflash] means that the task is sent to TiFlash for processing. If your queries do not select a TiFlash replica, try updating the statistics using the analyze table statement, and then check the result using the explain analyze statement.
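The two steps can be sketched as follows, reusing the table test.t from the example above. The stats:pseudo entry in the operator info column of that output suggests the optimizer was working without collected statistics, which is the situation analyze table is meant to address.

```sql
-- Refresh the optimizer statistics for the table.
ANALYZE TABLE test.t;

-- Re-run the query and look for cop[tiflash] in the task column.
EXPLAIN ANALYZE SELECT COUNT(*) FROM test.t;
```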

Engine isolation

Engine isolation restricts all queries to replicas of the specified engines, which you set through the tidb_isolation_read_engines variable. The available engines are "tikv", "tidb" (the internal memory table area of TiDB, which stores some TiDB system tables and cannot be actively used by users), and "tiflash".

set @@session.tidb_isolation_read_engines = "engine list separated by commas";
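As a concrete sketch, the session below is limited to TiFlash replicas and TiDB's internal memory tables, then restored to all three engines. The choice of engines here is only an example; with "tikv" excluded, queries on tables that have no TiFlash replica are expected to fail rather than fall back to TiKV.

```sql
-- Restrict this session to TiFlash replicas plus TiDB's internal memory tables.
SET @@session.tidb_isolation_read_engines = "tiflash,tidb";

-- Confirm the current setting.
SELECT @@session.tidb_isolation_read_engines;

-- Allow all three engines again.
SET @@session.tidb_isolation_read_engines = "tikv,tiflash,tidb";
```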

Manual hint

A manual hint forces TiDB to use the specified replicas for one or more specific tables, provided that engine isolation is satisfied. Here is an example of using the manual hint:

select /*+ read_from_storage(tiflash[table_name]) */ ... from table_name;
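A filled-in version of the hint might look like the following sketch, which assumes a hypothetical table t in the test database. When the table has an alias in the query, the hint should name the alias rather than the table.

```sql
-- Hypothetical database and table; substitute your own.
USE test;

-- Force the scan of t to read from its TiFlash replica.
SELECT /*+ read_from_storage(tiflash[t]) */ COUNT(*) FROM t;

-- With an alias, reference the alias inside the hint.
SELECT /*+ read_from_storage(tiflash[a]) */ COUNT(*) FROM t AS a;
```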

To learn more about TiFlash, refer to the TiFlash documentation.
