# Import Data from Amazon S3 into TiDB Cloud Premium
This document describes how to import CSV files from Amazon Simple Storage Service (Amazon S3) into TiDB Cloud Premium instances. The steps reflect the current private preview user interface and serve as a starting point for the upcoming public preview launch.
## Limitations
- To ensure data consistency, TiDB Cloud Premium allows importing CSV files into empty tables only. If the target table already contains data, import into a staging table first and then copy the rows using an `INSERT ... SELECT` statement, as shown in the sketch after this list.
- During the private preview, the user interface supports Amazon S3 as the only storage provider. Support for additional providers will be added in future releases.
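For example, the staging-table workaround might look like the following, where `orders` and `orders_staging` are hypothetical table names:

```sql
-- Create an empty staging table with the same schema as the target.
CREATE TABLE orders_staging LIKE orders;

-- ... run the import job with `orders_staging` as the destination ...

-- Copy the imported rows into the existing table, then clean up.
INSERT INTO orders SELECT * FROM orders_staging;
DROP TABLE orders_staging;
```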
- Each import job maps a single source pattern to one destination table.
## Step 1. Prepare the CSV files
- If a CSV file is larger than 256 MiB, consider splitting it into smaller files of around 256 MiB each so that TiDB Cloud Premium can process them in parallel.
- Name your CSV files according to the Dumpling naming conventions, as in the example layout after this list:
    - Full-table files: use the `${db_name}.${table_name}.csv` format.
    - Sharded files: append numeric suffixes, such as `${db_name}.${table_name}.000001.csv`.
    - Compressed files: use the `${db_name}.${table_name}.${suffix}.csv.${compress}` format.
- Optional schema files (`${db_name}-schema-create.sql` and `${db_name}.${table_name}-schema.sql`) help TiDB Cloud Premium create databases and tables automatically.
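For instance, a source folder prepared for a hypothetical `mydb.orders` table could contain the following files (the bucket and file names are illustrative):

```
s3://my-bucket/import/
├── mydb-schema-create.sql      # creates the `mydb` database
├── mydb.orders-schema.sql      # creates the `orders` table
├── mydb.orders.000001.csv      # first shard of the table data
└── mydb.orders.000002.csv      # second shard of the table data
```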
## Step 2. Create target schemas (optional)
If you want TiDB Cloud Premium to create the databases and tables automatically, place the schema files generated by Dumpling in the same S3 directory. Otherwise, create the databases and tables manually in TiDB Cloud Premium before running the import.
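As a sketch, these schema files contain plain DDL statements. For a hypothetical `mydb.orders` table, they might look like this (the columns are illustrative):

```sql
-- mydb-schema-create.sql
CREATE DATABASE IF NOT EXISTS `mydb`;

-- mydb.orders-schema.sql
CREATE TABLE IF NOT EXISTS `mydb`.`orders` (
    `id` BIGINT NOT NULL PRIMARY KEY,
    `amount` DECIMAL(10, 2),
    `created_at` DATETIME
);
```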
## Step 3. Configure access to Amazon S3
To allow TiDB Cloud Premium to read your bucket, use either of the following methods:
- Provide an AWS Role ARN that trusts TiDB Cloud and grants the `s3:GetObject` and `s3:ListBucket` permissions on the relevant paths.
- Provide an AWS access key (access key ID and secret access key) with equivalent permissions.
The wizard includes a helper link labeled Click here to create a new one with AWS CloudFormation. Follow this link if you need TiDB Cloud Premium to pre-fill a CloudFormation stack that creates the role for you.
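Whichever method you use, the underlying IAM policy must allow both actions on your bucket. A minimal policy sketch, assuming a hypothetical bucket named `my-bucket` with data under the `import/` prefix:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-bucket/import/*"
    },
    {
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::my-bucket"
    }
  ]
}
```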
## Step 4. Import CSV files from Amazon S3
1. In the TiDB Cloud console, navigate to the TiDB Instances page, and then click the name of your TiDB instance.
2. In the left navigation pane, click Data > Import, and choose Import data from Cloud Storage.
3. In the Source Connection dialog:
    - Set Storage Provider to Amazon S3.
    - Enter the Source Files URI for a single file (`s3://bucket/path/file.csv`) or for a folder (`s3://bucket/path/`).
    - Choose AWS Role ARN or AWS Access Key and provide the credentials.
    - Click Test Bucket Access to validate connectivity.
4. Click Next and provide the TiDB SQL username and password for the import job. Optionally, test the connection.
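    If you prefer a dedicated account for import jobs over your primary user, a minimal sketch (the username, password, and database name are placeholders; adjust the privileges to your setup):

    ```sql
    -- Hypothetical dedicated import user scoped to the target database.
    CREATE USER 'import_user'@'%' IDENTIFIED BY 'your_password';
    GRANT CREATE, INSERT, SELECT ON mydb.* TO 'import_user'@'%';
    ```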
5. Review the automatically generated source-to-target mapping. Disable automatic mapping if you need to define custom patterns and destination tables.
6. Click Next to run the pre-check. Resolve any warnings about missing files or incompatible schemas.
7. Click Start Import to launch the job group.
8. Monitor the job statuses until they show Completed, then verify the imported data in TiDB Cloud.
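After the jobs complete, you can spot-check the result with a quick query; the table name below is a placeholder:

```sql
-- Compare the row count against what you expect from the source CSV files.
SELECT COUNT(*) FROM mydb.orders;
```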
## Troubleshooting
- If the pre-check reports zero files, verify the S3 path and IAM permissions.
- If jobs remain in Preparing, ensure that the destination tables are empty and the required schema files exist.
- Use the Cancel action to stop a job group if you need to adjust mappings or credentials.
## Next steps
- See Import Data into TiDB Cloud Premium using the MySQL Command-Line Client for scripted imports.
- See Troubleshoot Access Denied Errors during Data Import from Amazon S3 for IAM-related problems.