Deploy TiDB on Google Cloud GKE

This document describes how to deploy a Google Kubernetes Engine (GKE) cluster and deploy a TiDB cluster on GKE.

To deploy TiDB Operator and the TiDB cluster in a self-managed Kubernetes environment, refer to Deploy TiDB Operator and Deploy TiDB on General Kubernetes.

Prerequisites

Before deploying a TiDB cluster on GKE, make sure the following requirements are satisfied:

Install Helm 3: used for deploying TiDB Operator.
Install gcloud: a command-line tool used for creating and managing Google Cloud services.
Complete the operations in the Before you begin section of GKE Quickstart.
This guide includes the following contents:
- Enable Kubernetes APIs
- Configure enough quota

Recommended instance types and storage

Instance types: to gain better performance, the following is recommended:
- PD nodes: n2-standard-4
- TiDB nodes: n2-standard-16
- TiKV or TiFlash nodes: n2-standard-16
Storage: For TiKV or TiFlash, it is recommended to use pd-ssd disk type.

Configure the Google Cloud service

Configure your Google Cloud project and default region:

gcloud config set core/project <google-cloud-project>
gcloud config set compute/region <google-cloud-region>

Create a GKE cluster and node pool

Create a GKE cluster and a default node pool:
```
gcloud container clusters create tidb --region us-east1 --machine-type n1-standard-4 --num-nodes=1
```
- The command above creates a regional cluster.
- The --num-nodes=1 option indicates that one node is created in each zone. So if there are three zones in the region, there are three nodes in total, which ensures high availability.
- It is recommended to use regional clusters in production environments. For other types of clusters, refer to Types of GKE clusters.
- The command above creates a cluster in the default network. If you want to specify a network, use the --network and --subnetwork options. For more information, refer to Creating a regional cluster.

Create separate node pools for PD, TiKV, and TiDB:

gcloud container node-pools create pd --cluster tidb --machine-type n2-standard-4 --num-nodes=1 \
    --node-labels=dedicated=pd --node-taints=dedicated=pd:NoSchedule
gcloud container node-pools create tikv --cluster tidb --machine-type n2-highmem-8 --num-nodes=1 \
    --node-labels=dedicated=tikv --node-taints=dedicated=tikv:NoSchedule
gcloud container node-pools create tidb --cluster tidb --machine-type n2-standard-8 --num-nodes=1 \
    --node-labels=dedicated=tidb --node-taints=dedicated=tidb:NoSchedule

The process might take a few minutes.
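Once the node pools exist, it can help to confirm that each pool carries the label and taint that the TidbCluster spec later selects on. The following is a minimal sketch, assuming kubectl already points at the new GKE cluster. The function name show_pool_nodes is a hypothetical helper, not part of gcloud or kubectl.

```shell
# Hypothetical helper: list the nodes of one dedicated pool with their taint keys.
# Assumes kubectl is configured for the GKE cluster created above.
show_pool_nodes() {
  # $1 is the value of the "dedicated" label, for example pd, tikv, or tidb.
  kubectl get nodes -l "dedicated=$1" \
    -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints[*].key'
}

# Example invocation:
#   show_pool_nodes tikv
```

In a regional cluster with three zones, each pool created with --num-nodes=1 is expected to list three nodes.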

Configure StorageClass

After the GKE cluster is created, the cluster contains three StorageClasses of different disk types.

standard: pd-standard disk type (default)
standard-rwo: pd-balanced disk type
premium-rwo: pd-ssd disk type (recommended)

To improve I/O write performance, it is recommended to configure nodelalloc and noatime in the mountOptions field of the StorageClass resource. For details, see TiDB Environment and System Configuration Check.

It is recommended to use the default pd-ssd storage class premium-rwo or to set up a customized storage class:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: pd-custom
provisioner: kubernetes.io/gce-pd
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
parameters:
  type: pd-ssd
mountOptions:
  - nodelalloc
  - noatime

Note

Configuring nodelalloc and noatime is not supported for the default disk type pd-standard.
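The customized StorageClass above still has to be created in the cluster before a TidbCluster can reference it. The sketch below assumes the manifest was saved as pd-custom.yaml, which is a hypothetical file name. The helper name apply_storage_class is likewise only illustrative.

```shell
# Hypothetical helper: create the custom StorageClass and confirm it is registered.
# Assumes the manifest shown above was saved locally (default: pd-custom.yaml).
apply_storage_class() {
  kubectl apply -f "${1:-pd-custom.yaml}" && kubectl get storageclass pd-custom
}

# Example invocation:
#   apply_storage_class
```

To use the new class, set the storageClassName field of the relevant components in tidb-cluster.yaml to pd-custom.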

Use local storage

For the production environment, use zonal persistent disks.

If you need to simulate bare-metal performance, some Google Cloud instance types provide additional local store volumes. You can choose such instances for the TiKV node pool to achieve higher IOPS and lower latency.

Note

You cannot dynamically change StorageClass for a running TiDB cluster. For testing purposes, create a new TiDB cluster with the desired StorageClass.

GKE upgrade might cause node reconstruction. In such cases, data in the local storage might be lost. To avoid data loss, you need to back up TiKV data before node reconstruction. It is thus not recommended to use local disks in the production environment.

Create a node pool with local storage for TiKV:

gcloud container node-pools create tikv --cluster tidb --machine-type n2-highmem-8 --num-nodes=1 --local-ssd-count 1 \
  --node-labels dedicated=tikv --node-taints dedicated=tikv:NoSchedule

If the TiKV node pool already exists, you can either delete the old pool and then create a new one, or change the pool name to avoid conflict.

Deploy the local volume provisioner.
You need to use the local-volume-provisioner to discover and manage the local storage. Executing the following command deploys and creates a local-storage storage class:
```
kubectl apply -f https://raw.githubusercontent.com/pingcap/tidb-operator/v1.6.5/manifests/gke/local-ssd-provision/local-ssd-provision.yaml
```
Use the local storage.
After the steps above, the local volume provisioner can discover all the local NVMe SSD disks in the cluster.
Modify tikv.storageClassName in the tidb-cluster.yaml file to local-storage.

Deploy TiDB Operator

To deploy TiDB Operator on GKE, refer to deploy TiDB Operator.

Deploy a TiDB cluster and the monitoring component

This section describes how to deploy a TiDB cluster and its monitoring component on GKE.

Create namespace

To create a namespace to deploy the TiDB cluster, run the following command:

kubectl create namespace tidb-cluster

Note

A namespace is a virtual cluster backed by the same physical cluster. This document takes tidb-cluster as an example. If you want to use another namespace, modify the corresponding -n or --namespace arguments.

Deploy

First, download the sample TidbCluster and TidbMonitor configuration files:

curl -O https://raw.githubusercontent.com/pingcap/tidb-operator/v1.6.5/examples/gcp/tidb-cluster.yaml && \
curl -O https://raw.githubusercontent.com/pingcap/tidb-operator/v1.6.5/examples/gcp/tidb-monitor.yaml && \
curl -O https://raw.githubusercontent.com/pingcap/tidb-operator/v1.6.5/examples/gcp/tidb-dashboard.yaml

Refer to configure the TiDB cluster to further customize and configure the CR before applying.

To deploy the TidbCluster and TidbMonitor CR in the GKE cluster, run the following command:

kubectl create -f tidb-cluster.yaml -n tidb-cluster && \
kubectl create -f tidb-monitor.yaml -n tidb-cluster

After the YAML files above are applied to the Kubernetes cluster, TiDB Operator creates the desired TiDB cluster and its monitoring component according to those files.

Note

If you need to deploy a TiDB cluster on ARM64 machines, refer to Deploy a TiDB Cluster on ARM64 Machines.

View the cluster status

To view the status of the starting TiDB cluster, run the following command:

kubectl get pods -n tidb-cluster

When all the Pods are in the Running or Ready state, the TiDB cluster is successfully started. For example:

NAME                              READY   STATUS    RESTARTS   AGE
tidb-discovery-5cb8474d89-n8cxk   1/1     Running   0          47h
tidb-monitor-6fbcc68669-dsjlc     3/3     Running   0          47h
tidb-pd-0                         1/1     Running   0          47h
tidb-pd-1                         1/1     Running   0          46h
tidb-pd-2                         1/1     Running   0          46h
tidb-tidb-0                       2/2     Running   0          47h
tidb-tidb-1                       2/2     Running   0          46h
tidb-tikv-0                       1/1     Running   0          47h
tidb-tikv-1                       1/1     Running   0          47h
tidb-tikv-2                       1/1     Running   0          47h
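Instead of polling the Pod list by hand, a script can block until the Pods report Ready. The following is a sketch using kubectl wait. The helper name wait_for_tidb_pods and the 600s default timeout are assumptions chosen for illustration.

```shell
# Hypothetical helper: block until every Pod in the namespace is Ready.
# $1 = namespace (default: tidb-cluster), $2 = timeout (default: 600s, an arbitrary choice).
wait_for_tidb_pods() {
  kubectl wait --for=condition=Ready pods --all \
    -n "${1:-tidb-cluster}" --timeout="${2:-600s}"
}

# Example invocation:
#   wait_for_tidb_pods tidb-cluster 900s
```

Note that kubectl wait only considers Pods that already exist when it is invoked. Because TiDB Operator starts the components in stages, the command might need to be repeated until the full Pod list shown above is present.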

Access the TiDB database

After you deploy a TiDB cluster, you can access the TiDB database via a MySQL client.

Prepare a bastion host

The LoadBalancer created for your TiDB cluster is an intranet LoadBalancer. You can create a bastion host in the cluster VPC to access the database.

gcloud compute instances create bastion \
    --machine-type=n1-standard-4 \
    --image-project=centos-cloud \
    --image-family=centos-7 \
    --zone=${your-region}-a

Note

${your-region}-a is the a zone in the region of the cluster, such as us-central1-a. You can also create the bastion host in another zone in the same region.

Install the MySQL client and connect

After the bastion host is created, you can connect to the bastion host via SSH and access the TiDB cluster via the MySQL client.

Connect to the bastion host via SSH:
```
gcloud compute ssh tidb@bastion
```
Install the MySQL client:
```
sudo yum install mysql -y
```

Connect the client to the TiDB cluster:

mysql --comments -h ${tidb-nlb-dnsname} -P 4000 -u root

${tidb-nlb-dnsname} is the LoadBalancer IP of the TiDB service. You can view the IP in the EXTERNAL-IP field of the kubectl get svc basic-tidb -n tidb-cluster execution result.

For example:

$ mysql --comments -h 10.128.15.243 -P 4000 -u root
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MySQL connection id is 7823
Server version: 8.0.11-TiDB-v8.5.5 TiDB Server (Apache License 2.0) Community Edition, MySQL 8.0 compatible

Copyright (c) 2000, 2022, Oracle and/or its affiliates.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MySQL [(none)]> show status;
+--------------------+--------------------------------------+
| Variable_name      | Value                                |
+--------------------+--------------------------------------+
| Ssl_cipher         |                                      |
| Ssl_cipher_list    |                                      |
| Ssl_verify_mode    | 0                                    |
| Ssl_version        |                                      |
| ddl_schema_version | 22                                   |
| server_id          | 717420dc-0eeb-4d4a-951d-0d393aff295a |
+--------------------+--------------------------------------+
6 rows in set (0.01 sec)
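The EXTERNAL-IP value can also be read programmatically instead of being copied from the kubectl get svc table. The sketch below uses a jsonpath query. The helper name lb_ip is hypothetical, and the sketch assumes the Service exposes an IP address rather than a hostname.

```shell
# Hypothetical helper: print the LoadBalancer IP of a Service.
# $1 = Service name, $2 = namespace (default: tidb-cluster).
lb_ip() {
  kubectl get svc "$1" -n "${2:-tidb-cluster}" \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}

# Example invocation, run where kubectl is configured:
#   lb_ip basic-tidb
```

The printed address is the value to pass to mysql -h on the bastion host. The same helper works for other LoadBalancer Services in the namespace.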

Note

By default, TiDB (versions starting from v4.0.2 and released before February 20, 2023) periodically shares usage details with PingCAP to help understand how to improve the product. For details about what is shared and how to disable the sharing, see Telemetry. Starting from February 20, 2023, the telemetry feature is disabled by default in newly released TiDB versions. See TiDB Release Timeline for details.

Access the Grafana monitoring dashboard

Obtain the LoadBalancer IP of Grafana:

kubectl -n tidb-cluster get svc basic-grafana

For example:

$ kubectl -n tidb-cluster get svc basic-grafana
NAME                     TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)               AGE
basic-grafana            LoadBalancer   10.15.255.169   34.123.168.114   3000:30657/TCP        35m

In the output above, the EXTERNAL-IP column is the LoadBalancer IP.

You can access the ${grafana-lb}:3000 address using your web browser to view monitoring metrics. Replace ${grafana-lb} with the LoadBalancer IP.

Note

The default Grafana username and password are both admin.

Access TiDB Dashboard Web UI

Obtain the LoadBalancer IP of TiDB Dashboard by running the following command:

kubectl -n tidb-cluster get svc basic-tidb-dashboard-exposed

The following is an example:

$ kubectl -n tidb-cluster get svc basic-tidb-dashboard-exposed
NAME                           TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)           AGE
basic-tidb-dashboard-exposed   LoadBalancer   10.15.255.169   34.123.168.114   12333:30657/TCP   35m

You can view monitoring metrics of TiDB Dashboard by visiting ${EXTERNAL-IP}:12333 using your web browser.

Upgrade

To upgrade the TiDB cluster, execute the following command:

kubectl patch tc basic -n tidb-cluster --type merge -p '{"spec":{"version":"${version}"}}'

The upgrade process does not finish immediately. You can watch the upgrade progress by executing kubectl get pods -n tidb-cluster --watch.
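While the rolling upgrade runs, listing the container images per Pod shows which Pods have already moved to the new version. This is a sketch only. The helper name show_pod_images is hypothetical.

```shell
# Hypothetical helper: print each Pod with the images of its containers.
# $1 = namespace (default: tidb-cluster).
show_pod_images() {
  kubectl get pods -n "${1:-tidb-cluster}" \
    -o custom-columns='NAME:.metadata.name,IMAGES:.spec.containers[*].image'
}

# Example invocation:
#   show_pod_images
```

The upgrade can be considered rolled out once every component Pod lists an image tag that matches the version set in the patch and all Pods are Running again.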

Scale out

Before scaling out the cluster, you need to scale out the corresponding node pool so that the new instances have enough resources for operation.

This section describes how to scale out the GKE node pool and TiDB components.

Scale out GKE node group

The following example shows how to scale out the tikv node pool of the tidb cluster to 6 nodes:

gcloud container clusters resize tidb --node-pool tikv --num-nodes 2

Note

In the regional cluster, the nodes are created in 3 zones. Therefore, after scaling out, the number of nodes is 2 * 3 = 6.
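Because --num-nodes is a per-zone count in a regional cluster, the value to pass is the desired total divided by the number of zones. The arithmetic can be sketched as follows. The helper name per_zone_nodes is hypothetical, and the zone count must be supplied by the caller.

```shell
# Hypothetical helper: convert a desired total node count into the per-zone
# value expected by --num-nodes in a regional cluster.
# $1 = desired total nodes, $2 = number of zones in the region.
per_zone_nodes() {
  if [ $(( $1 % $2 )) -ne 0 ]; then
    echo "total node count must be a multiple of the zone count" >&2
    return 1
  fi
  echo $(( $1 / $2 ))
}

# Example invocation:
#   gcloud container clusters resize tidb --node-pool tikv \
#     --num-nodes "$(per_zone_nodes 6 3)"
```

For 6 nodes across 3 zones, the helper prints 2, which matches the resize command above.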

Scale out TiDB components

After that, execute kubectl edit tc basic -n tidb-cluster and modify each component's replicas to the desired number of replicas. The scaling-out process is then completed.

For more information on managing node pools, refer to GKE Node pools.
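As a non-interactive alternative to kubectl edit, the replicas field of a component can be changed with a merge patch, in the same style as the upgrade command. The following is a sketch. The helper name scale_component is hypothetical, and it assumes the cluster is named basic in the tidb-cluster namespace, as elsewhere in this document.

```shell
# Hypothetical helper: set the replica count of one TidbCluster component.
# $1 = component key under spec (for example pd, tikv, or tidb), $2 = replica count.
scale_component() {
  if [ -z "$1" ] || [ -z "$2" ]; then
    echo "usage: scale_component <component> <replicas>" >&2
    return 1
  fi
  kubectl patch tc basic -n tidb-cluster --type merge \
    -p "{\"spec\":{\"$1\":{\"replicas\":$2}}}"
}

# Example invocation:
#   scale_component tikv 6
```

The node pool must already have enough nodes for the new replicas. Otherwise the new Pods stay in the Pending state.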

Deploy TiFlash and TiCDC

TiFlash is the columnar storage extension of TiKV.

TiCDC is a tool for replicating the incremental data of TiDB by pulling TiKV change logs.

The two components are not required in the deployment. This section shows a quick start example.

Create new node pools

Create a node pool for TiFlash:

gcloud container node-pools create tiflash --cluster tidb --machine-type n1-highmem-8 --num-nodes=1 \
    --node-labels dedicated=tiflash --node-taints dedicated=tiflash:NoSchedule

Create a node pool for TiCDC:

gcloud container node-pools create ticdc --cluster tidb --machine-type n1-standard-4 --num-nodes=1 \
    --node-labels dedicated=ticdc --node-taints dedicated=ticdc:NoSchedule

Configure and deploy

To deploy TiFlash, configure spec.tiflash in tidb-cluster.yaml. For example:
```
spec:
  ...
  tiflash:
    baseImage: pingcap/tiflash
    maxFailoverCount: 0
    replicas: 1
    storageClaims:
    - resources:
        requests:
          storage: 100Gi
    nodeSelector:
      dedicated: tiflash
    tolerations:
    - effect: NoSchedule
      key: dedicated
      operator: Equal
      value: tiflash
```
To configure other parameters, refer to Configure a TiDB Cluster.

Warning

TiDB Operator automatically mounts PVs in the order of the configuration in the storageClaims list. Therefore, if you need to add disks for TiFlash, make sure that you add the disks only to the end of the original configuration in the list. In addition, you must not alter the order of the original configuration.

To deploy TiCDC, configure spec.ticdc in tidb-cluster.yaml. For example:

spec:
  ...
  ticdc:
    baseImage: pingcap/ticdc
    replicas: 1
    nodeSelector:
      dedicated: ticdc
    tolerations:
    - effect: NoSchedule
      key: dedicated
      operator: Equal
      value: ticdc

Modify replicas according to your needs.

Finally, execute kubectl -n tidb-cluster apply -f tidb-cluster.yaml to update the TiDB cluster configuration.

For detailed CR configuration, refer to API references and Configure a TiDB Cluster.
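After the updated configuration is applied, the new TiFlash and TiCDC Pods can be listed by component. The sketch below assumes the Pods carry the app.kubernetes.io/component label that TiDB Operator sets on the Pods it manages. The helper name list_component_pods is hypothetical.

```shell
# Hypothetical helper: list the Pods of one component and the nodes they run on.
# $1 = component name (for example tiflash or ticdc), $2 = namespace (default: tidb-cluster).
# Assumes Pods are labeled app.kubernetes.io/component=<component>.
list_component_pods() {
  kubectl get pods -n "${2:-tidb-cluster}" \
    -l "app.kubernetes.io/component=$1" -o wide
}

# Example invocation:
#   list_component_pods tiflash
```

The NODE column of the output indicates whether the Pods landed on the dedicated node pools created earlier.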

Configure TiDB monitoring

For more information, see Deploy monitoring and alerts for a TiDB cluster.

Note

TiDB monitoring does not persist data by default. To ensure long-term data availability, it is recommended to persist monitoring data. TiDB monitoring does not include Pod CPU, memory, or disk monitoring, nor does it have an alerting system. For more comprehensive monitoring and alerting, it is recommended to Set kube-prometheus and AlertManager.

Collect logs

System and application logs can be useful for troubleshooting issues and automating operations. By default, TiDB components output logs to the container's stdout and stderr, and log rotation is automatically performed based on the container runtime environment. When a Pod restarts, container logs will be lost. To prevent log loss, it is recommended to Collect logs of TiDB and its related components.
