Best Practices for Using UUIDs as Primary Keys

ran-huang
dveeden

UUIDs (Universally Unique Identifiers) are a popular alternative to auto-incrementing integers for primary keys in distributed databases. This document outlines the benefits of using UUIDs in TiDB, and offers best practices for storing and indexing them efficiently.

Overview of UUIDs

When used as a primary key, a UUID offers the following advantages compared with an AUTO_INCREMENT integer:

  • UUIDs can be generated on multiple systems without risking conflicts. In some cases, this means that the number of network trips to TiDB can be reduced, leading to improved performance.
  • UUIDs are supported by most programming languages and database systems.
  • When used as a part of a URL, a UUID is not vulnerable to enumeration attacks. In comparison, with an AUTO_INCREMENT number, it is possible to guess the invoice IDs or user IDs.

Best practices

This section describes best practices for storing and indexing UUIDs in TiDB.

Store as binary

The textual UUID format looks like this: ab06f63e-8fe7-11ec-a514-5405db7aad56, which is a string of 36 characters. By using UUID_TO_BIN(), the textual format can be converted into a binary format of 16 bytes. This allows you to store the text in a BINARY(16) column. When retrieving the UUID, you can use the BIN_TO_UUID() function to get back to the textual format.

UUID format binary order and clustered primary keys

The UUID_TO_BIN() function can be used with one argument, the UUID or with two arguments where the second argument is a swap_flag.

It is recommended to not set the swap_flag with TiDB to avoid hotspots.

You can also explicitly set the CLUSTERED option for UUID based primary keys to avoid hotspots.

To demonstrate the effect of the swap_flag, here are two tables with an identical structure. The difference is that the data inserted into uuid_demo_1 uses UUID_TO_BIN(?, 0) and uuid_demo_2 uses UUID_TO_BIN(?, 1).

In the screenshot of the Key Visualizer below, you can see that writes are concentrated in a single region of the uuid_demo_2 table that has the order of the fields swapped in the binary format.

Key Visualizer

CREATE TABLE `uuid_demo_1` ( `uuid` varbinary(16) NOT NULL, `c1` varchar(255) NOT NULL, PRIMARY KEY (`uuid`) CLUSTERED )
CREATE TABLE `uuid_demo_2` ( `uuid` varbinary(16) NOT NULL, `c1` varchar(255) NOT NULL, PRIMARY KEY (`uuid`) CLUSTERED )

MySQL compatibility

UUIDs can be used in MySQL as well. The BIN_TO_UUID() and UUID_TO_BIN() functions were introduced in MySQL 8.0. The UUID() function is available in earlier MySQL versions as well.

Was this page helpful?