
JARO_WINKLER



Calculates the Jaro-Winkler similarity between two strings (often loosely called the Jaro-Winkler distance). It is commonly used to measure how alike two strings are, with values ranging from 0.0 (completely dissimilar) to 1.0 (identical strings).

Syntax

JARO_WINKLER(<string1>, <string2>)

Return Type

The JARO_WINKLER function returns a FLOAT64 value representing the similarity between the two input strings. The return value follows these rules:

  • Similarity Range: The result ranges from 0.0 (completely dissimilar) to 1.0 (identical).

    SELECT JARO_WINKLER('datalake', 'Datalake') AS similarity;

    ┌────────────────────┐
    │     similarity     │
    ├────────────────────┤
    │ 0.9166666666666666 │
    └────────────────────┘

    SELECT JARO_WINKLER('datalake', 'database') AS similarity;

    ┌────────────┐
    │ similarity │
    ├────────────┤
    │        0.9 │
    └────────────┘
  • NULL Handling: If either string1 or string2 is NULL, the result is NULL.

    SELECT JARO_WINKLER('datalake', NULL) AS similarity;

    ┌────────────┐
    │ similarity │
    ├────────────┤
    │ NULL       │
    └────────────┘
  • Empty Strings:

    • Comparing two empty strings returns 1.0.

      SELECT JARO_WINKLER('', '') AS similarity;

      ┌────────────┐
      │ similarity │
      ├────────────┤
      │          1 │
      └────────────┘
    • Comparing an empty string with a non-empty string returns 0.0.

      SELECT JARO_WINKLER('datalake', '') AS similarity;

      ┌────────────┐
      │ similarity │
      ├────────────┤
      │          0 │
      └────────────┘
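
To show where values such as 0.9166666666666666 and 0.9 come from, here is a minimal Python sketch of the textbook Jaro-Winkler similarity. It is not the engine's implementation. The helper names `jaro` and `jaro_winkler`, the scaling factor of 0.1, and the 4-character prefix cap are assumptions based on the conventional definition, which this page does not confirm. Returning `None` for a `None` input simply mirrors the NULL rule above.

```python
from typing import Optional


def jaro(s1: str, s2: str) -> float:
    """Textbook Jaro similarity in [0.0, 1.0] (illustrative helper)."""
    if s1 == s2:
        return 1.0  # identical strings, including two empty strings
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0  # empty versus non-empty

    # Two equal characters count as a match only if their positions
    # are at most `window` apart.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Transpositions: matched characters that appear in a different order,
    # counted in pairs.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2

    return (matches / len1
            + matches / len2
            + (matches - transpositions) / matches) / 3.0


def jaro_winkler(s1: Optional[str], s2: Optional[str],
                 scaling: float = 0.1, max_prefix: int = 4) -> Optional[float]:
    """Jaro similarity boosted by the length of the common prefix.

    `scaling` and `max_prefix` are the conventional defaults (assumed here).
    """
    if s1 is None or s2 is None:
        return None  # mirrors the NULL handling rule
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return base + prefix * scaling * (1.0 - base)


if __name__ == "__main__":
    for pair in [("datalake", "Datalake"), ("datalake", "database"),
                 ("", ""), ("datalake", "")]:
        print(pair, jaro_winkler(*pair))
```

Under these assumptions the sketch agrees with the examples above. 'datalake' and 'Datalake' share no prefix because the comparison is case-sensitive, so the result is the plain Jaro value of about 0.9167. 'datalake' and 'database' share the four-character prefix 'data', which lifts a Jaro value of about 0.8333 to about 0.9.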
