Important

You are viewing the archived documentation of TiDB, which no longer receives updates. It is recommended that you use the latest LTS version of the TiDB database.

TiSpark Deployment Topology

Warning

TiSpark support in the TiUP cluster component is still experimental. It is NOT recommended for production environments.

This document introduces the TiSpark deployment topology and how to deploy TiSpark based on the minimum cluster topology.

TiSpark is a component built for running Apache Spark on top of TiDB/TiKV to answer complex OLAP queries. It brings the benefits of both the Spark platform and the distributed TiKV cluster to TiDB, making TiDB a one-stop solution for both online transactions and analytics.

For more information about the TiSpark architecture and how to use it, see TiSpark User Guide.

Topology information

Instance	Count	Physical machine configuration	IP	Configuration
TiDB	3	16 VCore, 32 GB * 1	10.0.1.1, 10.0.1.2, 10.0.1.3	Default port; global directory configuration
PD	3	4 VCore, 8 GB * 1	10.0.1.4, 10.0.1.5, 10.0.1.6	Default port; global directory configuration
TiKV	3	16 VCore, 32 GB, 2 TB (NVMe SSD) * 1	10.0.1.7, 10.0.1.8, 10.0.1.9	Default port; global directory configuration
TiSpark	3	8 VCore, 16 GB * 1	10.0.1.21 (master), 10.0.1.22 (worker), 10.0.1.23 (worker)	Default port; global directory configuration
Monitoring & Grafana	1	4 VCore, 8 GB * 1, 500 GB (SSD)	10.0.1.11	Default port; global directory configuration

Topology templates

For detailed descriptions of the configuration items in the TiDB cluster topology file, see Topology Configuration File for Deploying TiDB Using TiUP.
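
As a rough illustration, the topology listed in the Topology information table might be written as a TiUP topology file along the following lines. This is a minimal sketch, not the official template: the values in the global section (ssh_port, deploy_dir, data_dir) are assumed placeholders, and all ports are left at their defaults.

```yaml
# Minimal sketch of a TiSpark topology.
# The global values below are illustrative assumptions, not required settings.
global:
  user: "tidb"
  ssh_port: 22
  deploy_dir: "/tidb-deploy"
  data_dir: "/tidb-data"

pd_servers:
  - host: 10.0.1.4
  - host: 10.0.1.5
  - host: 10.0.1.6

tidb_servers:
  - host: 10.0.1.1
  - host: 10.0.1.2
  - host: 10.0.1.3

tikv_servers:
  - host: 10.0.1.7
  - host: 10.0.1.8
  - host: 10.0.1.9

# TiSpark: one master and two workers, matching the table.
tispark_masters:
  - host: 10.0.1.21

tispark_workers:
  - host: 10.0.1.22
  - host: 10.0.1.23

# Monitoring and Grafana share one machine in this topology.
monitoring_servers:
  - host: 10.0.1.11

grafana_servers:
  - host: 10.0.1.11
```

Because no per-instance ports or directories are set, every instance inherits the default ports and the directories from the global section, which is what the "Default port; global directory configuration" column describes.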

Note

You do not need to manually create the tidb user in the configuration file. The TiUP cluster component automatically creates the tidb user on the target machines. You can customize the user name, or keep it the same as the user on the control machine.
If you configure the deployment directory as a relative path, the cluster is deployed under the home directory of that user.
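
To make the relative-path behavior concrete, here is a hedged fragment of the global section. The directory names are hypothetical, and the resolved location given in the comment assumes that the home directory of the tidb user is /home/tidb.

```yaml
global:
  user: "tidb"
  # Relative path: resolved under the home directory of the user above,
  # for example /home/tidb/tidb-deploy (assuming the home directory is /home/tidb).
  deploy_dir: "tidb-deploy"
  # Absolute path: used as written.
  data_dir: "/tidb-data"
```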

Prerequisites

TiSpark runs on an Apache Spark cluster, so before you start a TiDB cluster that contains TiSpark, make sure that Java Runtime Environment (JRE) 8 is installed on every server where TiSpark is deployed. Otherwise, TiSpark cannot start.

TiUP does not support installing JRE automatically, so you need to install it yourself. For detailed installation instructions, see How to download and install prebuilt OpenJDK packages.

If JRE 8 has already been installed on the deployment server but is not in the path of the system's default package management tool, you can specify the path of the JRE environment to be used by setting the java_home parameter in the topology configuration. This parameter corresponds to the JAVA_HOME system environment variable.
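
A sketch of how java_home might appear in the topology file is shown below, assuming it is set on each TiSpark instance. The JRE path is a hypothetical example; replace it with the actual JRE 8 location on each host.

```yaml
tispark_masters:
  - host: 10.0.1.21
    # Hypothetical JRE 8 location; corresponds to the JAVA_HOME environment variable.
    java_home: "/usr/lib/jvm/java-1.8.0-openjdk"

tispark_workers:
  - host: 10.0.1.22
    java_home: "/usr/lib/jvm/java-1.8.0-openjdk"
  - host: 10.0.1.23
    java_home: "/usr/lib/jvm/java-1.8.0-openjdk"
```

Hosts where JRE 8 is already on the default path would not need this setting.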