TPC-H Benchmark: TiDB Cloud Lake vs. Snowflake
Quick Overview
TPC-H
The TPC-H benchmark is a standard for evaluating decision support systems, focusing on complex queries and data maintenance. This analysis compares TiDB Cloud Lake with Snowflake by running the 22 TPC-H queries against the SF100 dataset (SF1 = 6 million rows), which amounts to 100GB of data and approximately 600 million rows.
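As a quick check of the sizing above, the row count scales linearly with the scale factor. The following is a minimal sketch that uses the rounded figure of 6 million rows per SF1 quoted in this document; the constant and function names are illustrative and not part of any TPC-H tooling.

```python
# Rounded figure quoted in this document: SF1 corresponds to about 6 million rows.
ROWS_PER_SF1 = 6_000_000


def approx_rows(scale_factor: int) -> int:
    """Approximate row count for a TPC-H scale factor, assuming linear scaling."""
    return ROWS_PER_SF1 * scale_factor


print(approx_rows(100))  # 600000000
```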
Snowflake and TiDB Cloud Lake
Snowflake: Snowflake is renowned for its advanced features such as separating storage and compute, scalable computing on demand, data sharing, and cloning capabilities.
TiDB Cloud Lake: TiDB Cloud Lake is a cloud-native data warehouse that offers functionality similar to Snowflake's: it also separates storage from compute and scales compute on demand.
It positions itself as a modern, cost-effective alternative to Snowflake, especially for large-scale analytics.
Performance and Cost Comparison
- Data Loading Costs: TiDB Cloud Lake achieves a 48% cost reduction in data loading compared to Snowflake.
- Query Execution Costs: TiDB Cloud Lake is approximately 35% less expensive than Snowflake for query execution on the cold run, and approximately 27% less expensive on the hot run.
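The percentages above express savings relative to Snowflake's cost. A minimal sketch of that calculation follows; the function name and the cost values are hypothetical placeholders chosen for illustration, not measured results from this benchmark.

```python
def cost_reduction_pct(baseline_cost: float, candidate_cost: float) -> float:
    """Percentage by which candidate_cost is lower than baseline_cost."""
    if baseline_cost <= 0:
        raise ValueError("baseline_cost must be positive")
    return (baseline_cost - candidate_cost) / baseline_cost * 100.0


# Hypothetical example: a baseline of 200 cost units versus 130 for the candidate.
print(cost_reduction_pct(200, 130))  # 35.0
```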
Data Loading Benchmark
Query Benchmark: Cold Run
Query Benchmark: Hot Run
Reproduce the Benchmark
You can reproduce the benchmark by following the steps below.
Benchmark Environment
The benchmark tests both Snowflake and TiDB Cloud Lake under similar conditions:
- The TPC-H SF100 dataset, sourced from Amazon Redshift, was loaded into both TiDB Cloud Lake and Snowflake without any specific tuning.
Benchmark Methodology
The benchmark includes both Cold and Hot runs for query execution:
- Cold Run: The data warehouse was suspended and resumed before executing the queries.
- Hot Run: The data warehouse was not suspended, so the local disk cache was used.
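The two run types above can be sketched as a small timing harness. This is only an illustration under stated assumptions: `execute`, `suspend`, and `resume` are hypothetical callables that you would supply for your own warehouse client, and the sketch assumes the warehouse is suspended and resumed once before the whole cold pass rather than before each query.

```python
import time
from typing import Callable, Dict, List


def _timed(execute: Callable[[str], None], query: str) -> float:
    """Return the wall-clock seconds taken by one query."""
    start = time.perf_counter()
    execute(query)
    return time.perf_counter() - start


def run_benchmark(
    queries: List[str],
    execute: Callable[[str], None],
    suspend: Callable[[], None],
    resume: Callable[[], None],
) -> Dict[str, List[float]]:
    """Time every query once cold and once hot."""
    # Cold run: suspend and resume first so no local cache is warm (assumed once per pass).
    suspend()
    resume()
    cold = [_timed(execute, q) for q in queries]
    # Hot run: the warehouse stays up, so the local disk cache can be reused.
    hot = [_timed(execute, q) for q in queries]
    return {"cold": cold, "hot": hot}


# Usage with stand-in callables that only record the order of operations.
events: List[str] = []
result = run_benchmark(
    ["q1", "q2"],
    execute=events.append,
    suspend=lambda: events.append("suspend"),
    resume=lambda: events.append("resume"),
)
print(events)  # ['suspend', 'resume', 'q1', 'q2', 'q1', 'q2']
```

Passing the warehouse operations in as callables keeps the harness independent of any particular client library.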
Prerequisites
- Have a Snowflake account
- Create a TiDB Cloud Lake account
Data Loading
Snowflake Data Load:
- Log into your Snowflake account.
- Create tables corresponding to the TPC-H schema. SQL Script.
- Use the COPY INTO command to load the data from AWS S3. SQL Script.
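To give a feel for the loading step above, the sketch below generates one Snowflake-style COPY INTO statement per TPC-H table. It is not the linked SQL script: the bucket prefix is a placeholder, and the one-folder-per-table layout and the pipe-delimited CSV file format are assumptions, so adjust them (and add credentials or a storage integration if your bucket requires one) to match your data.

```python
# The eight tables of the TPC-H schema.
TPCH_TABLES = [
    "customer", "lineitem", "nation", "orders",
    "part", "partsupp", "region", "supplier",
]


def copy_into_statement(table: str, s3_prefix: str) -> str:
    """Build a COPY INTO statement for one table (assumed layout: <prefix>/<table>/)."""
    location = f"{s3_prefix.rstrip('/')}/{table}/"
    return (
        f"COPY INTO {table}\n"
        f"FROM '{location}'\n"
        # Assumption: source files are pipe-delimited CSV.
        "FILE_FORMAT = (TYPE = CSV FIELD_DELIMITER = '|');"
    )


# Placeholder prefix; replace with the S3 location of your TPC-H SF100 files.
for table in TPCH_TABLES:
    print(copy_into_statement(table, "s3://example-bucket/tpch/sf100"))
```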
TiDB Cloud Lake Data Load:
- Sign in to your TiDB Cloud Lake account.
- Create the necessary tables as per the TPC-H schema. SQL Script.
- Load the data from AWS S3 using a method similar to the one used for Snowflake. SQL Script.
TPC-H Queries
Snowflake Queries:
- Log into your Snowflake account.
- Run the TPC-H queries. SQL Script.
TiDB Cloud Lake Queries:
- Sign in to your TiDB Cloud Lake account.
- Run the TPC-H queries. SQL Script.


