Deploy TiDB Using TiDB Ansible
For production environments, it is recommended that you deploy TiDB using TiUP. Since v4.0, TiDB Ansible is deprecated and PingCAP no longer provides support for deploying TiDB using it. If you really need to use it for deployment, be aware of the risks. You can import a TiDB cluster deployed by TiDB Ansible into TiUP.
If you only want to try out TiDB and explore new features, refer to Quick Start Guide for the TiDB Database Platform.
This guide describes how to deploy a TiDB cluster using TiDB Ansible. For the production environment, it is recommended to deploy TiDB using TiUP.
Overview
Ansible is an IT automation tool that can configure systems, deploy software, and orchestrate more advanced IT tasks such as continuous deployments or zero downtime rolling updates.
TiDB Ansible is a TiDB cluster deployment tool developed by PingCAP, based on Ansible playbook. TiDB Ansible enables you to quickly deploy a new TiDB cluster which includes PD, TiDB, TiKV, and the cluster monitoring modules.
You can use the TiDB Ansible configuration file to set up the cluster topology and complete all the following operation tasks:
- Initialize operating system parameters
- Deploy the whole TiDB cluster
- Start the TiDB cluster
- Stop the TiDB cluster
- Modify component configuration
- Scale the TiDB cluster
- Upgrade the component version
- Enable the cluster binlog
- Clean up data of the TiDB cluster
- Destroy the TiDB cluster
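Each of these tasks is performed by running an Ansible playbook from the tidb-ansible directory. As a quick orientation, the first four commands below appear in later steps of this guide; the last one is an assumed playbook name for upgrades, so check the playbooks shipped with your tidb-ansible version:
ansible-playbook bootstrap.yml        # initialize operating system parameters
ansible-playbook deploy.yml           # deploy the TiDB cluster software
ansible-playbook start.yml            # start the TiDB cluster
ansible-playbook stop.yml             # stop the TiDB cluster
ansible-playbook rolling_update.yml   # upgrade components (assumed name)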
Prepare
Before you start, make sure you have:
Several target machines that meet the following requirements:
- 4 or more machines. A standard TiDB cluster contains 6 machines; you can use 4 machines for testing. For more details, see Software and Hardware Recommendations.
- x86_64 architecture (AMD64) with CentOS 7.3 (64 bit) or later, or ARM architecture (ARM64) with CentOS 7.6 1810 or later.
- Network between machines.
Note: When you deploy TiDB using TiDB Ansible, use SSD disks for the data directory of TiKV and PD nodes. Otherwise, the deployment cannot pass the check. If you only want to try out TiDB and explore its features, it is recommended to deploy TiDB using Docker Compose on a single machine.
A control machine that meets the following requirements:
Note: The control machine can be one of the target machines.
- CentOS 7.3 (64 bit) or later with Python 2.7 installed
- Access to the Internet
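To confirm that the control machine meets the Python requirement, you can check the interpreter version; any 2.7.x release satisfies it (the exact output below is illustrative):
python --version
Python 2.7.5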
Step 1: Install system dependencies on the control machine
Log in to the control machine using the root user account, and run the corresponding command according to your operating system.
If you use a control machine installed with CentOS 7, run the following command:
yum -y install epel-release git curl sshpass && \
yum -y install python2-pip
If you use a control machine installed with Ubuntu, run the following command:
apt-get -y install git curl sshpass python-pip
Step 2: Create the tidb user on the control machine and generate the SSH key
Make sure you have logged in to the control machine using the root user account, and then run the following command.
Create the tidb user.
useradd -m -d /home/tidb tidb
Set a password for the tidb user account.
passwd tidb
Configure sudo without password for the tidb user account by adding tidb ALL=(ALL) NOPASSWD: ALL to the end of the sudo file:
visudo
tidb ALL=(ALL) NOPASSWD: ALL
Generate the SSH key.
Execute the su command to switch the user from root to tidb.
su - tidb
Create the SSH key for the tidb user account and hit the Enter key when Enter passphrase is prompted. After successful execution, the SSH private key file is /home/tidb/.ssh/id_rsa, and the SSH public key file is /home/tidb/.ssh/id_rsa.pub.
ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/tidb/.ssh/id_rsa):
Created directory '/home/tidb/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/tidb/.ssh/id_rsa.
Your public key has been saved in /home/tidb/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:eIBykszR1KyECA/h0d7PRKz4fhAeli7IrVphhte7/So tidb@172.16.10.49
The key's randomart image is:
+---[RSA 2048]----+
|=+o+.o.          |
|o=o+o.oo         |
| .O.=.=          |
| . B.B +         |
|o B * B S        |
| * + * +         |
|   o + .         |
|  o  E+ .        |
|o   ..+o.        |
+----[SHA256]-----+
Step 3: Download TiDB Ansible to the control machine
Log in to the control machine using the tidb user account and enter the /home/tidb directory. Run the following command to download the TAG version corresponding to TiDB Ansible 4.0 from the TiDB Ansible project. The default folder name is tidb-ansible.
git clone -b $tag https://github.com/pingcap/tidb-ansible.git
- Replace $tag with the value of the chosen TAG version. For example, v4.0.0-beta.2.
- To deploy and upgrade TiDB clusters, use the corresponding version of tidb-ansible. If you only modify the version in the inventory.ini file, errors might occur.
- It is required to download tidb-ansible to the /home/tidb directory using the tidb user account. If you download it to the /root directory, a privilege issue occurs.
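For example, to download the v4.0.0-beta.2 tag mentioned above:
git clone -b v4.0.0-beta.2 https://github.com/pingcap/tidb-ansible.git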
If you have questions regarding which version to use, email info@pingcap.com for more information or file an issue.
Step 4: Install TiDB Ansible and its dependencies on the control machine
Make sure you have logged in to the control machine using the tidb user account.
It is required to use pip to install Ansible and its dependencies; otherwise a compatibility issue occurs. Currently, the release-4.0 branch of TiDB Ansible is compatible with Ansible 2.5 ~ 2.7.11 (2.5 ≤ Ansible ≤ 2.7.11).
Install TiDB Ansible and the dependencies on the control machine:
cd /home/tidb/tidb-ansible && \
sudo pip install -r ./requirements.txt
The version information of Ansible and its dependencies is in the tidb-ansible/requirements.txt file.
View the version of Ansible:
ansible --version
ansible 2.7.11
Step 5: Configure the SSH mutual trust and sudo rules on the control machine
Make sure you have logged in to the control machine using the tidb user account.
Add the IPs of your target machines to the [servers] section of the hosts.ini file.
cd /home/tidb/tidb-ansible && \
vi hosts.ini
[servers]
172.16.10.1
172.16.10.2
172.16.10.3
172.16.10.4
172.16.10.5
172.16.10.6
[all:vars]
username = tidb
ntp_server = pool.ntp.org
Run the following command and input the root user account password of your target machines.
ansible-playbook -i hosts.ini create_users.yml -u root -k
This step creates the tidb user account on the target machines, and configures the sudo rules and the SSH mutual trust between the control machine and the target machines.
To configure the SSH mutual trust and sudo without password manually, see How to manually configure the SSH mutual trust and sudo without password.
Step 6: Install the NTP service on the target machines
If the time and time zone of all your target machines are the same, and the NTP service is on and normally synchronizing time, you can ignore this step. See How to check whether the NTP service is normal.
Make sure you have logged in to the control machine using the tidb user account, and run the following command:
cd /home/tidb/tidb-ansible && \
ansible-playbook -i hosts.ini deploy_ntp.yml -u tidb -b
The NTP service is installed and started using the software repository that comes with the system on the target machines. The default NTP server list in the installation package is used. The related server parameter is in the /etc/ntp.conf configuration file.
To make the NTP service start synchronizing as soon as possible, the system executes the ntpdate command to set the local date and time by polling ntp_server in the hosts.ini file. The default server is pool.ntp.org, and you can also replace it with your NTP server.
Step 7: Configure the CPUfreq governor mode on the target machine
For details about CPUfreq, see the CPUfreq Governor documentation.
Set the CPUfreq governor mode to performance to make full use of CPU performance.
Check the governor modes supported by the system
You can run the cpupower frequency-info --governors command to check the governor modes which the system supports:
cpupower frequency-info --governors
analyzing CPU 0:
available cpufreq governors: performance powersave
Taking the above output as an example, the system supports the performance and powersave modes.
As the following shows, if it returns Not Available, it means that the current system does not support CPUfreq configuration and you can skip this step.
cpupower frequency-info --governors
analyzing CPU 0:
available cpufreq governors: Not Available
Check the current governor mode
You can run the cpupower frequency-info --policy command to check the current CPUfreq governor mode:
cpupower frequency-info --policy
analyzing CPU 0:
current policy: frequency should be within 1.20 GHz and 3.20 GHz.
The governor "powersave" may decide which speed to use
within this range.
As the above output shows, the current mode is powersave in this example.
Change the governor mode
You can use either of the following two methods to change the governor mode. In the above example, the current governor mode is powersave, and the following commands change it to performance.
Use the cpupower frequency-set --governor command to change the current mode:
cpupower frequency-set --governor performance
Run the following command to set the mode on the target machine in batches:
ansible -i hosts.ini all -m shell -a "cpupower frequency-set --governor performance" -u tidb -b
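After changing the mode, you can re-run the policy check shown above to confirm that the governor is now performance:
cpupower frequency-info --policy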
Step 8: Mount the data disk ext4 filesystem with options on the target machines
Log in to the target machines using the root user account.
Format your data disks to the ext4 filesystem and add the nodelalloc and noatime mount options to the filesystem. It is required to add the nodelalloc option, or else the Ansible deployment cannot pass the test. The noatime option is optional.
If your data disks have already been formatted to ext4 and mounted, you can first unmount them by running the umount /dev/nvme0n1p1 command, then follow the steps below starting from editing the /etc/fstab file to add the mount options and mount the disks again.
Take the /dev/nvme0n1 data disk as an example:
View the data disk.
fdisk -l
Disk /dev/nvme0n1: 1000 GB
Create the partition table.
parted -s -a optimal /dev/nvme0n1 mklabel gpt -- mkpart primary ext4 1 -1
Note: Use the lsblk command to view the device number of the partition: for an NVMe disk, the generated device number is usually nvme0n1p1; for a regular disk (for example, /dev/sdb), the generated device number is usually sdb1.
Format the data disk to the ext4 filesystem.
mkfs.ext4 /dev/nvme0n1p1
View the partition UUID of the data disk.
In this example, the UUID of nvme0n1p1 is c51eb23b-195c-4061-92a9-3fad812cc12f.
lsblk -f
NAME        FSTYPE LABEL UUID                                 MOUNTPOINT
sda
├─sda1      ext4         237b634b-a565-477b-8371-6dff0c41f5ab /boot
├─sda2      swap         f414c5c0-f823-4bb1-8fdf-e531173a72ed
└─sda3      ext4         547909c1-398d-4696-94c6-03e43e317b60 /
sr0
nvme0n1
└─nvme0n1p1 ext4         c51eb23b-195c-4061-92a9-3fad812cc12f
Edit the /etc/fstab file and add the mount options.
vi /etc/fstab
UUID=c51eb23b-195c-4061-92a9-3fad812cc12f /data1 ext4 defaults,nodelalloc,noatime 0 2
Mount the data disk.
mkdir /data1 && \
mount -a
Check using the following command.
mount -t ext4
/dev/nvme0n1p1 on /data1 type ext4 (rw,noatime,nodelalloc,data=ordered)
If the filesystem is ext4 and nodelalloc is included in the mount options, you have successfully mounted the data disk ext4 filesystem with options on the target machines.
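To check all target machines from the control machine in one pass, you can reuse the ad-hoc Ansible pattern from Step 7; this is a sketch that assumes the hosts.ini file from Step 5 still lists all target machines:
ansible -i hosts.ini all -m shell -a "mount -t ext4" -u tidb -b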
Step 9: Edit the inventory.ini file to orchestrate the TiDB cluster
Log in to the control machine using the tidb user account, and edit the tidb-ansible/inventory.ini file to orchestrate the TiDB cluster. The standard TiDB cluster contains 6 machines: 2 TiDB instances, 3 PD instances, and 3 TiKV instances.
- Deploy at least 3 instances for TiKV.
- Do not deploy TiKV together with TiDB or PD on the same machine.
- Use the first TiDB machine as the monitoring machine.
It is required to use the internal IP address to deploy. If the SSH port of the target machines is not the default port 22, you need to add the ansible_port variable. For example, TiDB1 ansible_host=172.16.10.1 ansible_port=5555.
You can choose one of the following two types of cluster topology according to your scenario:
The cluster topology of a single TiKV instance on each TiKV node
In most cases, it is recommended to deploy one TiKV instance on each TiKV node for better performance. However, if the CPU and memory of your TiKV machines are much better than required in Hardware and Software Requirements, and you have more than two disks in one node or the capacity of one SSD is larger than 2 TB, you can deploy no more than 2 TiKV instances on a single TiKV node.
The cluster topology of multiple TiKV instances on each TiKV node
Option 1: Use the cluster topology of a single TiKV instance on each TiKV node
Name | Host IP | Services |
---|---|---|
node1 | 172.16.10.1 | PD1, TiDB1 |
node2 | 172.16.10.2 | PD2, TiDB2 |
node3 | 172.16.10.3 | PD3 |
node4 | 172.16.10.4 | TiKV1 |
node5 | 172.16.10.5 | TiKV2 |
node6 | 172.16.10.6 | TiKV3 |
[tidb_servers]
172.16.10.1
172.16.10.2
[pd_servers]
172.16.10.1
172.16.10.2
172.16.10.3
[tikv_servers]
172.16.10.4
172.16.10.5
172.16.10.6
[monitoring_servers]
172.16.10.1
[grafana_servers]
172.16.10.1
[monitored_servers]
172.16.10.1
172.16.10.2
172.16.10.3
172.16.10.4
172.16.10.5
172.16.10.6
Option 2: Use the cluster topology of multiple TiKV instances on each TiKV node
Take two TiKV instances on each TiKV node as an example:
Name | Host IP | Services |
---|---|---|
node1 | 172.16.10.1 | PD1, TiDB1 |
node2 | 172.16.10.2 | PD2, TiDB2 |
node3 | 172.16.10.3 | PD3 |
node4 | 172.16.10.4 | TiKV1-1, TiKV1-2 |
node5 | 172.16.10.5 | TiKV2-1, TiKV2-2 |
node6 | 172.16.10.6 | TiKV3-1, TiKV3-2 |
[tidb_servers]
172.16.10.1
172.16.10.2
[pd_servers]
172.16.10.1
172.16.10.2
172.16.10.3
# Note: To use labels in TiKV, you must also configure location_labels for PD at the same time.
# You must also configure status ports in the multi-instance scenario.
[tikv_servers]
TiKV1-1 ansible_host=172.16.10.4 deploy_dir=/data1/deploy tikv_port=20171 tikv_status_port=20181 labels="host=tikv1"
TiKV1-2 ansible_host=172.16.10.4 deploy_dir=/data2/deploy tikv_port=20172 tikv_status_port=20182 labels="host=tikv1"
TiKV2-1 ansible_host=172.16.10.5 deploy_dir=/data1/deploy tikv_port=20171 tikv_status_port=20181 labels="host=tikv2"
TiKV2-2 ansible_host=172.16.10.5 deploy_dir=/data2/deploy tikv_port=20172 tikv_status_port=20182 labels="host=tikv2"
TiKV3-1 ansible_host=172.16.10.6 deploy_dir=/data1/deploy tikv_port=20171 tikv_status_port=20181 labels="host=tikv3"
TiKV3-2 ansible_host=172.16.10.6 deploy_dir=/data2/deploy tikv_port=20172 tikv_status_port=20182 labels="host=tikv3"
[monitoring_servers]
172.16.10.1
[grafana_servers]
172.16.10.1
[monitored_servers]
172.16.10.1
172.16.10.2
172.16.10.3
172.16.10.4
172.16.10.5
172.16.10.6
......
# Note: For labels in TiKV to work, you must also configure location_labels for PD when deploying the cluster.
[pd_servers:vars]
location_labels = ["host"]
Edit the parameters in the service configuration file:
For the cluster topology of multiple TiKV instances on each TiKV node, you need to edit the capacity parameter under block-cache in tidb-ansible/conf/tikv.yml:
storage:
  block-cache:
    capacity: "1GB"
Note:
- The number of TiKV instances is the number of TiKV processes on each server.
- Recommended configuration: capacity = MEM_TOTAL * 0.5 / the number of TiKV instances
For the cluster topology of multiple TiKV instances on each TiKV node, you need to edit the high-concurrency, normal-concurrency, and low-concurrency parameters in the tidb-ansible/conf/tikv.yml file:
readpool:
  coprocessor:
    # Notice: if CPU_NUM > 8, the default thread pool size for coprocessors
    # will be set to CPU_NUM * 0.8.
    # high-concurrency: 8
    # normal-concurrency: 8
    # low-concurrency: 8
Note: Recommended configuration: the number of TiKV instances * the parameter value = the number of CPU cores * 0.8.
If multiple TiKV instances are deployed on the same physical disk, edit the capacity parameter in conf/tikv.yml:
raftstore:
  capacity: 0
Note: Recommended configuration: capacity = total disk capacity / the number of TiKV instances. For example, capacity: "100GB". A combined worked example of these three settings follows.
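To make the three recommendations above concrete, here is a hypothetical worked example; the machine sizes (64 GB of memory, 32 CPU cores, two TiKV instances sharing one 2 TB data disk per node) are illustrative only and not part of this guide:
# Hypothetical sizing: 64 GB memory, 32 cores, 2 TiKV instances sharing one 2 TB disk
storage:
  block-cache:
    capacity: "16GB"        # 64 GB * 0.5 / 2 instances
readpool:
  coprocessor:
    high-concurrency: 12    # 32 cores * 0.8 / 2 instances (rounded down)
    normal-concurrency: 12
    low-concurrency: 12
raftstore:
  capacity: "1000GB"        # 2 TB of disk / 2 instances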
Step 10: Edit variables in the inventory.ini file
This step describes how to edit the deployment directory variable and other variables in the inventory.ini file.
Configure the deployment directory
Edit the deploy_dir variable to configure the deployment directory.
The global variable is set to /home/tidb/deploy by default, and it applies to all services. If the data disk is mounted on the /data1 directory, you can set it to /data1/deploy. For example:
## Global variables
[all:vars]
deploy_dir = /data1/deploy
To separately set the deployment directory for a service, you can configure the host variable while configuring the service host list in the inventory.ini file. It is required to add an alias in the first column to avoid confusion in scenarios where mixed services are deployed on one machine.
TiKV1-1 ansible_host=172.16.10.4 deploy_dir=/data1/deploy
Edit other variables (Optional)
To enable the following control variables, use the capitalized True. To disable them, use the capitalized False. An illustrative [all:vars] fragment follows the table.
Variable Name | Description |
---|---|
cluster_name | the name of a cluster, adjustable |
cpu_architecture | CPU architecture. amd64 by default, arm64 optional |
tidb_version | the version of TiDB, configured by default in TiDB Ansible branches |
process_supervision | the supervision way of processes, systemd by default, supervise optional |
timezone | the global default time zone configured when a new TiDB cluster bootstrap is initialized; you can edit it later using the global time_zone system variable and the session time_zone system variable as described in Time Zone Support; the default value is Asia/Shanghai and see the list of time zones for more optional values |
enable_firewalld | to enable the firewall, closed by default; to enable it, add the ports in network requirements to the allowlist |
enable_ntpd | to monitor the NTP service of the managed node, True by default; do not close it |
set_hostname | to edit the hostname of the managed node based on the IP, False by default |
enable_binlog | whether to deploy Pump and enable the binlog, False by default, dependent on the Kafka cluster; see the zookeeper_addrs variable |
zookeeper_addrs | the zookeeper address of the binlog Kafka cluster |
deploy_without_tidb | the Key-Value mode, deploy only PD, TiKV and the monitoring service, not TiDB; set the IP of the tidb_servers host group to null in the inventory.ini file |
alertmanager_target | optional: If you have deployed alertmanager separately, you can configure this variable using the alertmanager_host:alertmanager_port format |
grafana_admin_user | the username of Grafana administrator; default admin |
grafana_admin_password | the password of Grafana administrator account; default admin ; used to import Dashboard and create the API key using TiDB Ansible; update this variable if you have modified it through Grafana web |
collect_log_recent_hours | to collect the log of recent hours; default the recent 2 hours |
enable_bandwidth_limit | to set a bandwidth limit when pulling the diagnostic data from the target machines to the control machine; used together with the collect_bandwidth_limit variable |
collect_bandwidth_limit | the limited bandwidth when pulling the diagnostic data from the target machines to the control machine; unit: Kbit/s; default 10000, indicating 10Mb/s; for the cluster topology of multiple TiKV instances on each TiKV node, you need to divide this value by the number of TiKV instances on each TiKV node |
prometheus_storage_retention | the retention time of the monitoring data of Prometheus (30 days by default); this is a new configuration in the group_vars/monitoring_servers.yml file in 2.1.7, 3.0 and the later tidb-ansible versions |
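As a hypothetical illustration (the cluster name is a placeholder; the other values are the defaults listed above), a handful of these variables set in the [all:vars] section of inventory.ini might look like this:
## Global variables
[all:vars]
deploy_dir = /data1/deploy
cluster_name = test-cluster
timezone = Asia/Shanghai
enable_binlog = False
process_supervision = systemd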
Step 11: Deploy the TiDB cluster
When ansible-playbook runs a playbook, the default concurrency is 5. If many target machines are deployed, you can add the -f parameter to specify the concurrency, such as ansible-playbook deploy.yml -f 10.
The following example uses tidb as the user who runs the service.
Edit the tidb-ansible/inventory.ini file to make sure ansible_user = tidb.
## Connection
# ssh via normal user
ansible_user = tidb
Note: Do not configure ansible_user to root, because tidb-ansible limits the user that runs the service to the normal user.
Run the following command and if all servers return tidb, then the SSH mutual trust is successfully configured:
ansible -i inventory.ini all -m shell -a 'whoami'
Run the following command and if all servers return root, then sudo without password of the tidb user is successfully configured:
ansible -i inventory.ini all -m shell -a 'whoami' -b
Run the local_prepare.yml playbook and download the TiDB binary to the control machine.
ansible-playbook local_prepare.yml
Initialize the system environment and modify the kernel parameters.
ansible-playbook bootstrap.yml
Deploy the TiDB cluster software.
ansible-playbook deploy.yml
Note: You can use the Report button on the Grafana Dashboard to generate the PDF file. This function depends on the fontconfig package and English fonts. To use this function, log in to the grafana_servers machine and install it using the following command:
sudo yum install fontconfig open-sans-fonts
Start the TiDB cluster.
ansible-playbook start.yml
Test the TiDB cluster
Because TiDB is compatible with MySQL, you can use the MySQL client to connect to TiDB directly. It is recommended to configure load balancing to provide a uniform SQL interface.
Connect to the TiDB cluster using the MySQL client.
mysql -u root -h 172.16.10.1 -P 4000
Note: The default port of the TiDB service is 4000.
Access the monitoring platform using a web browser.
- Address: http://172.16.10.1:3000
- Default account and password: admin; admin
By default, TiDB periodically shares usage details with PingCAP to help understand how to improve the product. For details about what is shared and how to disable the sharing, see Telemetry.
Deployment FAQs
This section lists the common questions about deploying TiDB using TiDB Ansible.
How to customize the port?
Edit the inventory.ini file and add the following host variable after the IP of the corresponding service (an illustrative example follows the table):
Component | Variable Port | Default Port | Description |
---|---|---|---|
TiDB | tidb_port | 4000 | the communication port for the application and DBA tools |
TiDB | tidb_status_port | 10080 | the communication port to report TiDB status |
TiKV | tikv_port | 20160 | the TiKV communication port |
TiKV | tikv_status_port | 20180 | the communication port to report the TiKV status |
PD | pd_client_port | 2379 | the communication port between TiDB and PD |
PD | pd_peer_port | 2380 | the inter-node communication port within the PD cluster |
Pump | pump_port | 8250 | the pump communication port |
Prometheus | prometheus_port | 9090 | the communication port for the Prometheus service |
Pushgateway | pushgateway_port | 9091 | the aggregation and report port for TiDB, TiKV, and PD monitor |
Node_exporter | node_exporter_port | 9100 | the communication port to report the system information of every TiDB cluster node |
Grafana | grafana_port | 3000 | the port for the external Web monitoring service and client (Browser) access |
Kafka_exporter | kafka_exporter_port | 9308 | the communication port for Kafka_exporter, used to monitor the binlog Kafka cluster |
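For example, to make one TiDB instance listen on port 5000 instead of the default 4000 (a hypothetical value), set the variable after that host's IP in inventory.ini:
[tidb_servers]
172.16.10.1 tidb_port=5000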
How to customize the deployment directory?
Edit the inventory.ini file and add the following host variable after the IP of the corresponding service (an illustrative example follows the table):
Component | Variable Directory | Default Directory | Description |
---|---|---|---|
Global | deploy_dir | /home/tidb/deploy | the deployment directory |
TiDB | tidb_log_dir | {{ deploy_dir }}/log | the TiDB log directory |
TiKV | tikv_log_dir | {{ deploy_dir }}/log | the TiKV log directory |
TiKV | tikv_data_dir | {{ deploy_dir }}/data | the data directory |
TiKV | wal_dir | "" | the rocksdb write-ahead log directory, consistent with the TiKV data directory when the value is null |
TiKV | raftdb_path | "" | the raftdb directory, being tikv_data_dir/raft when the value is null |
PD | pd_log_dir | {{ deploy_dir }}/log | the PD log directory |
PD | pd_data_dir | {{ deploy_dir }}/data.pd | the PD data directory |
Pump | pump_log_dir | {{ deploy_dir }}/log | the Pump log directory |
Pump | pump_data_dir | {{ deploy_dir }}/data.pump | the Pump data directory |
Prometheus | prometheus_log_dir | {{ deploy_dir }}/log | the Prometheus log directory |
Prometheus | prometheus_data_dir | {{ deploy_dir }}/data.metrics | the Prometheus data directory |
Pushgateway | pushgateway_log_dir | {{ deploy_dir }}/log | the pushgateway log directory |
Node_exporter | node_exporter_log_dir | {{ deploy_dir }}/log | the node_exporter log directory |
Grafana | grafana_log_dir | {{ deploy_dir }}/log | the Grafana log directory |
Grafana | grafana_data_dir | {{ deploy_dir }}/data.grafana | the Grafana data directory |
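For example, to place the TiKV data directory of one instance on a dedicated disk (the alias and paths are illustrative), set the variables after that host entry in inventory.ini:
[tikv_servers]
TiKV1-1 ansible_host=172.16.10.4 deploy_dir=/data1/deploy tikv_data_dir=/data1/tikv_data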
How to check whether the NTP service is normal?
Run the following command. If it returns running, then the NTP service is running:
sudo systemctl status ntpd.service
ntpd.service - Network Time Service
Loaded: loaded (/usr/lib/systemd/system/ntpd.service; disabled; vendor preset: disabled)
Active: active (running) since Mon 2017-12-18 13:13:19 CST; 3s ago
Run the ntpstat command. If it returns synchronised to NTP server (synchronizing with the NTP server), then the synchronization process is normal.
ntpstat
synchronised to NTP server (85.199.214.101) at stratum 2
time correct to within 91 ms
polling server every 1024 s
For the Ubuntu system, you need to install the ntpstat package.
The following condition indicates the NTP service is not synchronizing normally:
ntpstat
unsynchronised
The following condition indicates the NTP service is not running normally:
ntpstat
Unable to talk to NTP daemon. Is it running?
To make the NTP service start synchronizing as soon as possible, run the following command. You can replace pool.ntp.org with other NTP servers.
sudo systemctl stop ntpd.service && \
sudo ntpdate pool.ntp.org && \
sudo systemctl start ntpd.service
To install the NTP service manually on the CentOS 7 system, run the following command:
sudo yum install ntp ntpdate && \
sudo systemctl start ntpd.service && \
sudo systemctl enable ntpd.service
How to modify the supervision method of a process from supervise to systemd?
Edit the process_supervision variable in the inventory.ini file:
# process supervision, [systemd, supervise]
process_supervision = systemd
For versions earlier than TiDB 1.0.4, the TiDB Ansible supervision method of a process is supervise by default. The previously installed cluster can remain the same. If you need to change the supervision method to systemd, stop the cluster and run the following command:
ansible-playbook stop.yml && \
ansible-playbook deploy.yml -D && \
ansible-playbook start.yml
How to manually configure the SSH mutual trust and sudo without password?
Log in to each target machine using the root user account, create the tidb user, and set the login password.
useradd tidb && \
passwd tidb
To configure sudo without password, run the following command, and add tidb ALL=(ALL) NOPASSWD: ALL to the end of the file:
visudo
tidb ALL=(ALL) NOPASSWD: ALL
Use the tidb user to log in to the control machine, and run the following command. Replace 172.16.10.61 with the IP of your target machine, and enter the tidb user password of the target machine as prompted. Successful execution indicates that the SSH mutual trust is already created. This applies to other machines as well.
ssh-copy-id -i ~/.ssh/id_rsa.pub 172.16.10.61
Log in to the control machine using the tidb user account, and log in to the IP of the target machine using SSH. If you do not need to enter the password and can successfully log in, then the SSH mutual trust is successfully configured.
ssh 172.16.10.61
[tidb@172.16.10.61 ~]$
After you log in to the target machine using the tidb user, run the following command. If you do not need to enter the password and can switch to the root user, then sudo without password of the tidb user is successfully configured.
sudo -su root
[root@172.16.10.61 tidb]#
Error: You need to install jmespath prior to running json_query filter
See Step 4: Install TiDB Ansible and its dependencies on the control machine and use pip to install TiDB Ansible and the corresponding dependencies on the control machine. The jmespath dependency package is installed by default.
dependent package is installed by default.Run the following command to check whether
jmespath
is successfully installed:pip show jmespath
Enter import jmespath in the Python interactive window of the control machine.
- If no error displays, the dependency is successfully installed.
- If the ImportError: No module named jmespath error displays, the Python jmespath module is not successfully installed.
python
Python 2.7.5 (default, Nov  6 2016, 00:28:07)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-11)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
import jmespath
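If only jmespath is missing, installing that single package with pip is a possible workaround; this is an assumption rather than a step from this guide, and re-running the requirements installation from Step 4 remains the documented fix:
sudo pip install jmespath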
The zk: node does not exist error when starting Pump/Drainer
Check whether the zookeeper_addrs configuration in inventory.ini is the same as the configuration in the Kafka cluster, and whether the namespace is filled in. The description of the namespace configuration is as follows:
# ZooKeeper connection string (see ZooKeeper docs for details).
# ZooKeeper address of the Kafka cluster. Example:
# zookeeper_addrs = "192.168.0.11:2181,192.168.0.12:2181,192.168.0.13:2181"
# You can also append an optional chroot string to the URLs to specify the root directory for all Kafka znodes. Example:
# zookeeper_addrs = "192.168.0.11:2181,192.168.0.12:2181,192.168.0.13:2181/kafka/123"
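For example, turning the commented sample above into an active setting (with the optional /kafka/123 chroot) would look like this:
zookeeper_addrs = "192.168.0.11:2181,192.168.0.12:2181,192.168.0.13:2181/kafka/123"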