Best Practices for Using UUIDs as Primary Keys
UUIDs (Universally Unique Identifiers) are a popular alternative to auto-incrementing integers for primary keys in distributed databases. This document outlines the benefits of using UUIDs in TiDB, and offers best practices for storing and indexing them efficiently.
Overview of UUIDs
When used as a primary key, a UUID offers the following advantages compared with an AUTO_INCREMENT
integer:
- UUIDs can be generated on multiple systems without risking conflicts. In some cases, this means that the number of network trips to TiDB can be reduced, leading to improved performance.
- UUIDs are supported by most programming languages and database systems.
- When used as a part of a URL, a UUID is not vulnerable to enumeration attacks. In comparison, with an
AUTO_INCREMENT
number, it is possible to guess the invoice IDs or user IDs.
Best practices
This section describes best practices for storing and indexing UUIDs in TiDB.
Store as binary
The textual UUID format looks like this: ab06f63e-8fe7-11ec-a514-5405db7aad56
, which is a string of 36 characters. By using UUID_TO_BIN()
, the textual format can be converted into a binary format of 16 bytes. This allows you to store the text in a BINARY(16)
column. When retrieving the UUID, you can use the BIN_TO_UUID()
function to get back to the textual format.
UUID format binary order and clustered primary keys
The UUID_TO_BIN()
function can be used with one argument, the UUID or with two arguments where the second argument is a swap_flag
.
It is recommended to not set the swap_flag
with TiDB to avoid hotspots.
You can also explicitly set the CLUSTERED
option for UUID based primary keys to avoid hotspots.
To demonstrate the effect of the swap_flag
, here are two tables with an identical structure. The difference is that the data inserted into uuid_demo_1
uses UUID_TO_BIN(?, 0)
and uuid_demo_2
uses UUID_TO_BIN(?, 1)
.
In the screenshot of the Key Visualizer below, you can see that writes are concentrated in a single region of the uuid_demo_2
table that has the order of the fields swapped in the binary format.
CREATE TABLE `uuid_demo_1` (
`uuid` varbinary(16) NOT NULL,
`c1` varchar(255) NOT NULL,
PRIMARY KEY (`uuid`) CLUSTERED
)
CREATE TABLE `uuid_demo_2` (
`uuid` varbinary(16) NOT NULL,
`c1` varchar(255) NOT NULL,
PRIMARY KEY (`uuid`) CLUSTERED
)
MySQL compatibility
UUIDs can be used in MySQL as well. The BIN_TO_UUID()
and UUID_TO_BIN()
functions were introduced in MySQL 8.0. The UUID()
function is available in earlier MySQL versions as well.