Export Data from TiDB Cloud Serverless
TiDB Cloud Serverless Export (Beta) is a service that enables you to export data from a TiDB Cloud Serverless cluster to a local file or an external storage service. You can use the exported data for backup, migration, data analysis, or other purposes.
While you can also export data using tools such as mysqldump and TiDB Dumpling, TiDB Cloud Serverless Export offers a more convenient and efficient way to export data from a TiDB Cloud Serverless cluster. It brings the following benefits:
- Convenience: the export service provides a simple way to export data from a TiDB Cloud Serverless cluster, eliminating the need for additional tools or resources.
- Isolation: the export service uses separate computing resources, ensuring isolation from the resources used by your online services.
- Consistency: the export service ensures the consistency of the exported data without acquiring locks, so your online services are not affected.
Export locations
You can export data to the following locations:
- A local file
- An external storage, including:
  - Amazon S3
  - Google Cloud Storage
  - Azure Blob Storage
A local file
To export data from a TiDB Cloud Serverless cluster to a local file, you need to export data using the TiDB Cloud console or using the TiDB Cloud CLI, and then download the exported data using the TiDB Cloud CLI.
Exporting data to a local file has the following limitations:
- Downloading exported data using the TiDB Cloud console is not supported.
- Exported data is saved in the stashing area of TiDB Cloud and expires after two days. Download the exported data before it expires.
- If the storage space of the stashing area is full, you cannot export data to a local file.
Amazon S3
To export data to Amazon S3, you need to provide the following information:
- URI: `s3://<bucket-name>/<folder-path>/`
- One of the following access credentials:
  - An access key: make sure the access key has the `s3:PutObject` and `s3:ListBucket` permissions.
  - A role ARN: make sure the role ARN (Amazon Resource Name) has the `s3:PutObject` and `s3:ListBucket` permissions.
For more information, see Configure External Storage Access for TiDB Cloud Serverless.
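The two permissions listed above can be written down as an IAM policy document. The following Python sketch builds one for a given bucket and folder. The helper name `build_export_policy`, the bucket `my-bucket`, and the folder `exports` are hypothetical, and the linked guide remains the authority on the exact policy that TiDB Cloud expects.

```python
import json


def build_export_policy(bucket: str, folder: str) -> dict:
    """Return an IAM policy granting s3:PutObject and s3:ListBucket (illustrative only)."""
    prefix = folder.strip("/")
    # s3:PutObject applies to objects, so its resource is an object ARN pattern.
    if prefix:
        object_arn = f"arn:aws:s3:::{bucket}/{prefix}/*"
    else:
        object_arn = f"arn:aws:s3:::{bucket}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": "s3:PutObject", "Resource": object_arn},
            # s3:ListBucket applies to the bucket itself, not to the objects in it.
            {
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
            },
        ],
    }


# Hypothetical bucket and folder names, used only for illustration.
policy = build_export_policy("my-bucket", "exports")
print(json.dumps(policy, indent=2))
```

Scoping `s3:PutObject` to the export folder rather than to the whole bucket is a design choice in this sketch, not a requirement stated in this document.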
Google Cloud Storage
To export data to Google Cloud Storage, you need to provide the following information:
- URI: `gs://<bucket-name>/<folder-path>/`
- Access credential: a base64 encoded service account key for your bucket. Make sure the service account key has the `storage.objects.create` permission.
For more information, see Configure External Storage Access for TiDB Cloud Serverless.
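Because the credential must be supplied as a base64 encoded string, it can help to see the encoding step in isolation. The following Python sketch assumes standard base64 over the UTF-8 text of the key file. The helper name `encode_service_account_key` and the placeholder key content are hypothetical.

```python
import base64
import json


def encode_service_account_key(key_json: str) -> str:
    """Base64-encode the text of a service account key file (standard alphabet assumed)."""
    return base64.b64encode(key_json.encode("utf-8")).decode("ascii")


# Placeholder content only; a real key file downloaded from Google Cloud has more fields.
fake_key = json.dumps({"type": "service_account", "project_id": "example-project"})
encoded = encode_service_account_key(fake_key)
print(encoded)
```

In practice you would read the key file from disk and pass its contents to the helper. The resulting single-line string is what the access credential field expects, assuming the service uses standard base64.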
Azure Blob Storage
To export data to Azure Blob Storage, you need to provide the following information:
- URI: `azure://<account-name>.blob.core.windows.net/<container-name>/<folder-path>/` or `https://<account-name>.blob.core.windows.net/<container-name>/<folder-path>/`
- Access credential: a shared access signature (SAS) token for your Azure Blob Storage container. Make sure the SAS token has the `Read` and `Write` permissions on the `Container` and `Object` resources.
For more information, see Configure External Storage Access for TiDB Cloud Serverless.
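Since two URI schemes are accepted, a small pre-flight check can catch a malformed URI before you create an export task. The following Python sketch splits a URI into account, container, and folder. The helper name `parse_azure_export_uri` and the sample account and container names are hypothetical, and the check reflects only the URI shape described above, not the validation that TiDB Cloud itself performs.

```python
from urllib.parse import urlparse

BLOB_HOST_SUFFIX = ".blob.core.windows.net"


def parse_azure_export_uri(uri: str) -> dict:
    """Split an Azure Blob Storage export URI into account, container, and folder."""
    parts = urlparse(uri)
    if parts.scheme not in ("azure", "https"):
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    host = parts.netloc
    if not host.endswith(BLOB_HOST_SUFFIX) or host == BLOB_HOST_SUFFIX:
        raise ValueError(f"unexpected host: {host!r}")
    account = host[: -len(BLOB_HOST_SUFFIX)]
    # The first path segment is the container; the rest is the folder path.
    container, _, folder = parts.path.lstrip("/").partition("/")
    if not container:
        raise ValueError("missing container name")
    return {"account": account, "container": container, "folder": folder}


# Hypothetical account and container names, used only for illustration.
info = parse_azure_export_uri("azure://myaccount.blob.core.windows.net/backups/tidb/2024/")
print(info)
```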
Export options
Data filtering
- The TiDB Cloud console supports exporting selected databases and tables.
- The TiDB Cloud CLI supports exporting data with SQL statements and table filters.
Data formats
You can export data in the following formats:
- `SQL`: export data in SQL format.
- `CSV`: export data in CSV format. You can specify the following options:
  - `delimiter`: specify the delimiter used in the exported data. The default delimiter is `"`.
  - `separator`: specify the character used to separate fields in the exported data. The default separator is `,`.
  - `header`: specify whether to include a header row in the exported data. The default value is `true`.
  - `null-value`: specify the string that represents a NULL value in the exported data. The default value is `\N`.
- `Parquet`: export data in Parquet format.
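To make the CSV options concrete, the following Python sketch reads CSV text with the default settings listed above and maps the NULL marker to `None`. It assumes that the export `delimiter` is the quoting character (its default is `"`) and that `separator` is the field separator, which is the reverse of how Python's `csv` module names them. The helper name `read_export_csv` and the sample text are hypothetical.

```python
import csv
import io


def read_export_csv(text, separator=",", delimiter='"', header=True, null_value=r"\N"):
    """Parse exported CSV text into a list of dicts, turning the NULL marker into None."""
    # Python's csv module calls the field separator "delimiter" and the quote "quotechar".
    reader = csv.reader(io.StringIO(text), delimiter=separator, quotechar=delimiter)
    rows = list(reader)
    if not rows:
        return []
    if header:
        names, body = rows[0], rows[1:]
    else:
        names, body = [f"col{i}" for i in range(len(rows[0]))], rows
    return [
        {name: (None if value == null_value else value) for name, value in zip(names, row)}
        for row in body
    ]


# Made-up sample; the exact quoting of a real export may differ.
sample = '"id","name"\n1,"alice"\n2,\\N\n'
rows = read_export_csv(sample)
print(rows)
```

One limitation of this sketch: after parsing, a quoted string whose content happens to be `\N` cannot be told apart from the NULL marker.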
The schema and data are exported according to the following naming conventions:
Item | Not compressed | Compressed |
---|---|---|
Database schema | {database}-schema-create.sql | {database}-schema-create.sql.{compression-type} |
Table schema | {database}.{table}-schema.sql | {database}.{table}-schema.sql.{compression-type} |
Data | {database}.{table}.{0001}.{csv|parquet|sql} | {database}.{table}.{0001}.{csv|sql}.{compression-type} (CSV and SQL) or {database}.{table}.{0001}.{compression-type}.parquet (Parquet) |
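The naming conventions in the table can be sketched as small Python helpers, which is handy when a script needs to predict the file names to look for after an export. The function names are hypothetical. The sketch assumes that `{0001}` is a four-digit zero-padded sequence number and takes the `{compression-type}` suffix as a parameter, because this document does not spell out the exact suffix strings.

```python
from typing import Optional


def _with_suffix(name: str, compression: Optional[str]) -> str:
    """Append the compression suffix when one is given."""
    return f"{name}.{compression}" if compression else name


def database_schema_file(database: str, compression: Optional[str] = None) -> str:
    return _with_suffix(f"{database}-schema-create.sql", compression)


def table_schema_file(database: str, table: str, compression: Optional[str] = None) -> str:
    return _with_suffix(f"{database}.{table}-schema.sql", compression)


def data_file(database: str, table: str, index: int, fmt: str,
              compression: Optional[str] = None) -> str:
    """Build a data file name; fmt is "csv", "sql", or "parquet"."""
    base = f"{database}.{table}.{index:04d}"  # assumes a 4-digit, zero-padded index
    if compression and fmt == "parquet":
        # For Parquet, the compression type precedes the .parquet extension.
        return f"{base}.{compression}.parquet"
    return _with_suffix(f"{base}.{fmt}", compression)


print(data_file("test", "t", 1, "csv"))
print(data_file("test", "t", 1, "parquet", "zstd"))
```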
Data compression
You can compress the exported CSV and SQL data using the following algorithms:
- `gzip` (default): compress the exported data with `gzip`.
- `snappy`: compress the exported data with `snappy`.
- `zstd`: compress the exported data with `zstd`.
- `none`: do not compress the exported data.
You can compress the exported Parquet data using the following algorithms:
- `zstd` (default): compress the Parquet file with `zstd`.
- `gzip`: compress the Parquet file with `gzip`.
- `snappy`: compress the Parquet file with `snappy`.
- `none`: do not compress the Parquet file.
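Exported CSV and SQL files that use the default `gzip` compression can be read back with the Python standard library alone, as the following sketch shows. The helper name `read_gzip_export` and the stand-in file `sample.csv.gz` are hypothetical. Reading `snappy` or `zstd` output would typically require a third-party package and is not covered here.

```python
import gzip
import os
import tempfile


def read_gzip_export(path: str, encoding: str = "utf-8") -> str:
    """Return the decompressed text of a gzip-compressed export file."""
    with gzip.open(path, "rt", encoding=encoding) as handle:
        return handle.read()


# Create a stand-in file so the sketch runs without a real export.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.csv.gz")
    with gzip.open(sample_path, "wt", encoding="utf-8") as handle:
        handle.write("id,name\n1,alice\n")
    content = read_gzip_export(sample_path)

print(content)
```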
Data conversion
When exporting data to the Parquet format, the data conversion between TiDB Cloud Serverless and Parquet is as follows:
TiDB Cloud Serverless Type | Parquet primitive type | Parquet logical type |
---|---|---|
VARCHAR | BYTE_ARRAY | String(UTF8) |
TIME | BYTE_ARRAY | String(UTF8) |
TINYTEXT | BYTE_ARRAY | String(UTF8) |
MEDIUMTEXT | BYTE_ARRAY | String(UTF8) |
TEXT | BYTE_ARRAY | String(UTF8) |
LONGTEXT | BYTE_ARRAY | String(UTF8) |
SET | BYTE_ARRAY | String(UTF8) |
JSON | BYTE_ARRAY | String(UTF8) |
DATE | BYTE_ARRAY | String(UTF8) |
CHAR | BYTE_ARRAY | String(UTF8) |
VECTOR | BYTE_ARRAY | String(UTF8) |
DECIMAL(1<=p<=9) | INT32 | DECIMAL(p,s) |
DECIMAL(10<=p<=18) | INT64 | DECIMAL(p,s) |
DECIMAL(p>=19) | BYTE_ARRAY | String(UTF8) |
ENUM | BYTE_ARRAY | String(UTF8) |
TIMESTAMP | INT64 | TIMESTAMP(unit=MICROS,isAdjustedToUTC=false) |
DATETIME | INT64 | TIMESTAMP(unit=MICROS,isAdjustedToUTC=false) |
YEAR | INT32 | / |
TINYINT | INT32 | / |
UNSIGNED TINYINT | INT32 | / |
SMALLINT | INT32 | / |
UNSIGNED SMALLINT | INT32 | / |
MEDIUMINT | INT32 | / |
UNSIGNED MEDIUMINT | INT32 | / |
INT | INT32 | / |
UNSIGNED INT | FIXED_LEN_BYTE_ARRAY(9) | DECIMAL(20,0) |
BIGINT | FIXED_LEN_BYTE_ARRAY(9) | DECIMAL(20,0) |
UNSIGNED BIGINT | BYTE_ARRAY | String(UTF8) |
FLOAT | FLOAT | / |
DOUBLE | DOUBLE | / |
BLOB | BYTE_ARRAY | / |
TINYBLOB | BYTE_ARRAY | / |
MEDIUMBLOB | BYTE_ARRAY | / |
LONGBLOB | BYTE_ARRAY | / |
BINARY | BYTE_ARRAY | / |
VARBINARY | BYTE_ARRAY | / |
BIT | BYTE_ARRAY | / |
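The DECIMAL and timestamp rows of the table can be sketched in Python. The first helper restates the precision thresholds from the table. The other two follow the general Parquet conventions, in which a DECIMAL is stored as an unscaled integer and a `TIMESTAMP(unit=MICROS)` value counts microseconds from the Unix epoch. All function names are hypothetical, and the sketch does not read Parquet files itself.

```python
from datetime import datetime, timedelta
from decimal import Decimal


def parquet_type_for_decimal(precision: int, scale: int) -> tuple:
    """Return (primitive type, logical type) for DECIMAL(precision, scale) per the table."""
    if precision < 1:
        raise ValueError("precision must be at least 1")
    if precision <= 9:
        return ("INT32", f"DECIMAL({precision},{scale})")
    if precision <= 18:
        return ("INT64", f"DECIMAL({precision},{scale})")
    # Wider decimals are exported as UTF-8 strings.
    return ("BYTE_ARRAY", "String(UTF8)")


def decode_decimal(unscaled: int, scale: int) -> Decimal:
    """Turn an unscaled INT32/INT64 value into the decimal it represents."""
    return Decimal(unscaled).scaleb(-scale)


def decode_timestamp_micros(value: int) -> datetime:
    """Turn microseconds since the epoch into a naive datetime (isAdjustedToUTC=false)."""
    return datetime(1970, 1, 1) + timedelta(microseconds=value)


print(parquet_type_for_decimal(12, 2))
print(decode_decimal(12345, 2))
print(decode_timestamp_micros(1_000_000))
```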
Examples
Export data to a local file
Console
1. Log in to the TiDB Cloud console and navigate to the Clusters page of your project.
2. Click the name of your target cluster to go to its overview page, and then click Import in the left navigation pane.
3. On the Import page, click Export Data to in the upper-right corner, and then choose Local File from the drop-down list. Fill in the following parameters:
   - Task Name: enter a name for the export task. The default value is `SNAPSHOT_{snapshot_time}`.
   - Exported Data: choose the databases and tables you want to export.
   - Data Format: choose SQL, CSV, or Parquet.
   - Compression: choose Gzip, Snappy, Zstd, or None.
4. Click Export.
5. After the export task is successful, you can copy the download command displayed in the export task detail, and then download the exported data by running the command in the TiDB Cloud CLI.
CLI
1. Create an export task:
   `ticloud serverless export create -c <cluster-id>`
   You will get an export ID from the output.
2. After the export task is successful, download the exported data to your local file:
   `ticloud serverless export download -c <cluster-id> -e <export-id>`
   For more information about the download command, see ticloud serverless export download.
Export data to Amazon S3
Console
1. Log in to the TiDB Cloud console and navigate to the Clusters page of your project.
2. Click the name of your target cluster to go to its overview page, and then click Import in the left navigation pane.
3. On the Import page, click Export Data to in the upper-right corner, and then choose Amazon S3 from the drop-down list. Fill in the following parameters:
   - Task Name: enter a name for the export task. The default value is `SNAPSHOT_{snapshot_time}`.
   - Exported Data: choose the databases and tables you want to export.
   - Data Format: choose SQL, CSV, or Parquet.
   - Compression: choose Gzip, Snappy, Zstd, or None.
   - Folder URI: enter the URI of the Amazon S3 with the `s3://<bucket-name>/<folder-path>/` format.
   - Bucket Access: choose one of the following access credentials and then fill in the credential information:
     - AWS Role ARN: enter the role ARN that has the permission to access the bucket. It is recommended to create the role ARN with AWS CloudFormation. For more information, see Configure External Storage Access for TiDB Cloud Serverless.
     - AWS Access Key: enter the access key ID and access key secret that have the permission to access the bucket.
4. Click Export.
CLI
`ticloud serverless export create -c <cluster-id> --target-type S3 --s3.uri <uri> --s3.access-key-id <access-key-id> --s3.secret-access-key <secret-access-key> --filter "database.table"`
`ticloud serverless export create -c <cluster-id> --target-type S3 --s3.uri <uri> --s3.role-arn <role-arn> --filter "database.table"`
- `s3.uri`: the Amazon S3 URI with the `s3://<bucket-name>/<folder-path>/` format.
- `s3.access-key-id`: the access key ID of the user who has the permission to access the bucket.
- `s3.secret-access-key`: the access key secret of the user who has the permission to access the bucket.
- `s3.role-arn`: the role ARN that has the permission to access the bucket.
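When the export is launched from a script, the two credential variants can be assembled programmatically. The following Python sketch builds the argument list for either variant, using only the flags shown in the commands above. The helper name `build_s3_export_command` and all sample values are hypothetical, and the sketch only constructs the command without running it.

```python
from typing import List, Optional


def build_s3_export_command(
    cluster_id: str,
    uri: str,
    table_filter: str,
    role_arn: Optional[str] = None,
    access_key_id: Optional[str] = None,
    secret_access_key: Optional[str] = None,
) -> List[str]:
    """Return the argv for an S3 export using either a role ARN or an access key."""
    cmd = ["ticloud", "serverless", "export", "create", "-c", cluster_id,
           "--target-type", "S3", "--s3.uri", uri]
    if role_arn is not None:
        if access_key_id or secret_access_key:
            raise ValueError("provide a role ARN or an access key, not both")
        cmd += ["--s3.role-arn", role_arn]
    elif access_key_id and secret_access_key:
        cmd += ["--s3.access-key-id", access_key_id,
                "--s3.secret-access-key", secret_access_key]
    else:
        raise ValueError("a role ARN or a complete access key is required")
    cmd += ["--filter", table_filter]
    return cmd


# Hypothetical values, used only for illustration.
command = build_s3_export_command(
    "1234567890", "s3://my-bucket/exports/", "database.table",
    role_arn="arn:aws:iam::123456789012:role/export-role",
)
print(" ".join(command))
```

Passing the resulting list to `subprocess.run(command, check=True)` would execute it without shell quoting issues, assuming `ticloud` is installed and authenticated.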
Export data to Google Cloud Storage
Console
1. Log in to the TiDB Cloud console and navigate to the Clusters page of your project.
2. Click the name of your target cluster to go to its overview page, and then click Import in the left navigation pane.
3. On the Import page, click Export Data to in the upper-right corner, and then choose Google Cloud Storage from the drop-down list. Fill in the following parameters:
   - Task Name: enter a name for the export task. The default value is `SNAPSHOT_{snapshot_time}`.
   - Exported Data: choose the databases and tables you want to export.
   - Data Format: choose SQL, CSV, or Parquet.
   - Compression: choose Gzip, Snappy, Zstd, or None.
   - Folder URI: enter the URI of the Google Cloud Storage with the `gs://<bucket-name>/<folder-path>/` format.
   - Bucket Access: upload the Google Cloud credentials file that has permission to access the bucket.
4. Click Export.
CLI
`ticloud serverless export create -c <cluster-id> --target-type GCS --gcs.uri <uri> --gcs.service-account-key <service-account-key> --filter "database.table"`
- `gcs.uri`: the URI of the Google Cloud Storage bucket in the `gs://<bucket-name>/<folder-path>/` format.
- `gcs.service-account-key`: the base64 encoded service account key.
Export data to Azure Blob Storage
Console
1. Log in to the TiDB Cloud console and navigate to the Clusters page of your project.
2. Click the name of your target cluster to go to its overview page, and then click Import in the left navigation pane.
3. On the Import page, click Export Data to in the upper-right corner, and then choose Azure Blob Storage from the drop-down list. Fill in the following parameters:
   - Task Name: enter a name for the export task. The default value is `SNAPSHOT_{snapshot_time}`.
   - Exported Data: choose the databases and tables you want to export.
   - Data Format: choose SQL, CSV, or Parquet.
   - Compression: choose Gzip, Snappy, Zstd, or None.
   - Folder URI: enter the URI of Azure Blob Storage with the `azure://<account-name>.blob.core.windows.net/<container-name>/<folder-path>/` format.
   - SAS Token: enter the SAS token that has the permission to access the container. It is recommended to create a SAS token with the Azure ARM template. For more information, see Configure External Storage Access for TiDB Cloud Serverless.
4. Click Export.
CLI
`ticloud serverless export create -c <cluster-id> --target-type AZURE_BLOB --azblob.uri <uri> --azblob.sas-token <sas-token> --filter "database.table"`
- `azblob.uri`: the URI of the Azure Blob Storage in the `(azure|https)://<account-name>.blob.core.windows.net/<container-name>/<folder-path>/` format.
- `azblob.sas-token`: the account SAS token of the Azure Blob Storage.
Cancel an export task
To cancel an ongoing export task, take the following steps:
Console
1. Log in to the TiDB Cloud console and navigate to the Clusters page of your project.
2. Click the name of your target cluster to go to its overview page, and then click Import in the left navigation pane.
3. On the Import page, click Export to view the export task list.
4. Choose the export task you want to cancel, and then click Action.
5. Choose Cancel in the drop-down list. Note that you can only cancel an export task that is in the Running status.
CLI
`ticloud serverless export cancel -c <cluster-id> -e <export-id>`
Pricing
The export service is free during the beta period. You only pay for the Request Units (RUs) consumed by export tasks that succeed or are canceled. Failed export tasks are not charged.