How to use BigQuery to Extract JSON Array

2023/08/052 min read
bookmark this
Responsive image

You can interact with JSON data with BigQuery, this blog show show to parse the JSON value and load fields with BigQuery.

Sample JSON

This will be the JSON file using at the blog.

{
   "id":"1",
   "first_name":"John",
   "last_name":"Doe",
   "dob":"1968-01-22",
   "addresses":[
      {
         "status":"current",
         "address":"123 First Avenue",
         "city":"Seattle",
         "state":"WA",
         "zip":"11111",
         "numberOfYears":"1"
      },
      {
         "status":"previous",
         "address":"456 Main Street",
         "city":"Portland",
         "state":"OR",
         "zip":"22222",
         "numberOfYears":"5"
      }
   ]
}

Parse the single fields

At very first, assume has the variable contains the JSON string.

declare jsonstring default '{"id":"1","first_name":"John","last_name":"Doe","dob":"1968-01-22","addresses":[{"status":"current","address":"123 First Avenue","city":"Seattle","state":"WA","zip":"11111","numberOfYears":"1"},{"status":"previous","address":"456 Main Street","city":"Portland","state":"OR","zip":"22222","numberOfYears":"5"}]}';

To load just single fields from the JSON, we can do something as below, with $ sign.

select
JSON_VALUE(jsonstring, '$.id') as id,
JSON_VALUE(jsonstring, '$.first_name') as first_name,
JSON_VALUE(jsonstring, '$.last_name') as last_name,
id | first_name | last_name
1 | John | Doe

Include the JSON Array into the SQL

For the above SQL, if we try this with array, you can see the result has the whole JSON file.

select
JSON_VALUE(jsonstring, '$.id') as id,
JSON_VALUE(jsonstring, '$.first_name') as first_name,
JSON_VALUE(jsonstring, '$.last_name') as last_name,
json_extract_array(jsonstring, '$.addresses') as addresses

result with json array

Display Scalar value from JSON

above script is able to show the JSON, but what if we only need the city?

select
JSON_VALUE(jsonstring, '$.id') as id,
JSON_VALUE(jsonstring, '$.first_name') as first_name,
JSON_VALUE(jsonstring, '$.last_name') as last_name,
array_agg(y)
from
  unnest(
  array(
      select
        json_extract_scalar(x, '$.city')
      from unnest(json_extract_array(jsonstring, "$.addresses")) x
    )
  ) y

result with json and city field

Keep the filter fields at each row

move of the case, above result meet the result, but what if we want to show id, first_name and last_name at each row?

select
JSON_VALUE(jsonstring, '$.id') as id,
JSON_VALUE(jsonstring, '$.first_name') as first_name,
JSON_VALUE(jsonstring, '$.last_name') as last_name,
y
from
  unnest(
    array(
    select
    json_extract_scalar(x, '$.city')
    from unnest(json_extract_array(jsonstring, "$.addresses")) x
    )
  ) y

result with json and city field at each row

Conclusion

Above are list of example how to use interact with JSON and Big Query.