BigQuery - Generate Unique Values

event 2021-03-13 visibility 1,920 comment 0 insights
more_vert
insights Stats
Raymond Raymond Google Cloud Platform

Data analytics with Google Cloud Platform (GCP). For BigQuery SQL mentioned in the articles, they are all using standard SQL dialect unless specified differently. 

In SQL Server or other database systems, IDENTITY, GUID or other autoincrement features are presenting to support generating a unique value for each row. In BigQuery, the support is very limited. We can use UUID or numbering functions to implement similar ones.

GENERATE_UUID

This function returns a random UUID as STRING. It is similar as function NEWID in SQL Server. The returned string is in the form xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.

Example:

SELECT
  GENERATE_UUID() AS ID;

Sample output (in JSON format):

[
  {
    "ID": "d51e9e92-561e-4746-9b41-ed6c6df22b7d"
  }
]
warning When considering about performance, UUID is not the best option to be used as join keys for JOIN operations. 

ROW_NUMBER function

We can also use numbering functions like ROW_NUMBER to generate a sequential unique numbers when saving data to BigQuery tables.

Example:

SELECT
  *,
  ROW_NUMBER() OVER (ORDER BY CustomerID) AS ROW_NUM
FROM
  `test.dim_customer`;

Sample output (in JSON format):

[
  {
    "CustomerID": "77878d7f-4e4f-4633-b7c9-796229e06cd3",
    "CustomerNumber": "002",
    "Name": "Jason",
    "IsCurrent": true,
    "StartDate": "2021-02-01",
    "EndDate": "2021-03-12",
    "ROW_NUM": "1"
  },
  {
    "CustomerID": "cb68b654-8f29-4302-81a0-8929154a241c",
    "CustomerNumber": "003",
    "Name": "Celia",
    "IsCurrent": true,
    "StartDate": "2021-03-13",
    "EndDate": "9999-12-31",
    "ROW_NUM": "2"
  },
  {
    "CustomerID": "e42a8b6e-e558-43fe-b5c2-378e681652bf",
    "CustomerNumber": "001",
    "Name": "Raymond",
    "IsCurrent": false,
    "StartDate": "2021-01-01",
    "EndDate": "2021-03-12",
    "ROW_NUM": "3"
  },
  {
    "CustomerID": "e42a8b6e-e558-43fe-b5c2-378e681652bf",
    "CustomerNumber": "001",
    "Name": "Ray",
    "IsCurrent": true,
    "StartDate": "2021-03-13",
    "EndDate": "9999-12-31",
    "ROW_NUM": "4"
  }
]

References

UUID functions in Standard SQL  |  BigQuery  |  Google Cloud

Numbering functions in Standard SQL  |  BigQuery  |  Google Cloud

More from Kontext
comment Comments
No comments yet.

Please log in or register to comment.

account_circle Log in person_add Register

Log in with external accounts