📣
TiDB Cloud Premium is now in public preview. Unlimited growth, instant elasticity, advanced security for enterprise workloads. Try it out →

Loading NDJSON into TiDB Cloud Lake



What is NDJSON?

NDJSON is built on top of JSON, and it is a strict subset of JSON. Each line must contain a separate, self-contained valid JSON object.

The following example shows a NDJSON file with two JSON objects:

{"title":"Title_0","author":"Author_0"} {"title":"Title_1","author":"Author_1"}

Loading NDJSON File

The common syntax for loading NDJSON file is as follows:

COPY INTO [<database>.]<table_name> FROM { userStage | internalStage | externalStage | externalLocation } [ PATTERN = '<regex_pattern>' ] [ FILE_FORMAT = ( TYPE = NDJSON, COMPRESSION = AUTO ) ]

Tutorial: Loading Data from NDJSON Files

Step 1. Create an Internal Stage

Create an internal stage to store the NDJSON files.

CREATE STAGE my_ndjson_stage;

Step 2. Create NDJSON files

Generate a NDJSON file using these SQL statements:

COPY INTO @my_ndjson_stage FROM ( SELECT 'Title_' || CAST(number AS VARCHAR) AS title, 'Author_' || CAST(number AS VARCHAR) AS author FROM numbers(100000) ) FILE_FORMAT = (TYPE = NDJSON) ;

Verify the creation of the NDJSON file:

LIST @my_ndjson_stage;

Result:

┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐ │ name │ size │ md5 │ last_modified │ creator │ ├────────────────────────────────────────────────────────────────┼─────────┼────────────────────────────────────┼───────────────────────────────┼──────────────────┤ │ data_b3d94fad-3052-42e4-b090-26409e88c7b9_0000_00000000.ndjson │ 4777780 │ "d1cc98fefc3e3aa0649cade880d754aa" │ 2023-12-26 12:15:59.000 +0000 │ NULL │ └──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

Step 3: Create Target Table

CREATE TABLE books ( title VARCHAR, author VARCHAR );

Step 4. Copying Directly from NDJSON

To directly copy data into your table from NDJSON files, use the following SQL command:

COPY INTO books FROM @my_ndjson_stage PATTERN = '.*[.]ndjson' FILE_FORMAT = ( TYPE = NDJSON );

Result:

┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐ │ File │ Rows_loaded │ Errors_seen │ First_error │ First_error_line │ ├────────────────────────────────────────────────────────────────┼─────────────┼─────────────┼──────────────────┼──────────────────┤ │ data_b3d94fad-3052-42e4-b090-26409e88c7b9_0000_00000000.ndjson │ 100000 │ 0 │ NULL │ NULL │ └──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

Step 4 (Option). Using SELECT to Copy Data

For more control, like transforming data while copying, use the SELECT statement. Learn more at SELECT from NDJSON.

COPY INTO books(title, author) FROM ( SELECT $1:title, $1:author FROM @my_ndjson_stage ) PATTERN = '.*[.]ndjson' FILE_FORMAT = ( TYPE = NDJSON );

Was this page helpful?