Skip to content

Master DateTime Functions in BigQuery

Mastering datetime processing in analysis is imperative for extracting meaningful insights from temporal data. It enables analysts to conduct temporal analysis, integrate datasets based on time intervals, and engineer features essential for predictive modeling. Additionally, datetime processing is vital for data preprocessing tasks such as cleaning, aligning timestamps, and ensuring data consistency.

It facilitates time-based queries, enabling analysts to filter and query data effectively, while also supporting the creation of informative visualizations for communicating insights. Ultimately, proficiency in datetime processing empowers analysts to make informed decisions based on historical trends and future projections, driving better outcomes across various domains through accurate predictions and deeper understanding of temporal relationships within the data.

I’d like to use the “google public dataset-london bike” dataset to show the examples.

london bike dataset
london bike dataset

Extract Date and Time


In analysis, direct utilization of the end_date and start_date is not employed. Rather, it is imperative to extract specific components of the date and time from them, such as the day, hour, and year, for effective analysis.

Extract Date: year, month and day

SELECT end_date,
  EXTRACT(YEAR FROM end_date) AS year,
  EXTRACT(MONTH FROM end_date) AS month,
  EXTRACT(DAY FROM end_date) AS day
FROM
  `bigquery-public-data.london_bicycles.cycle_hire`
  WHERE
  end_date IS NOT NULL
LIMIT 5;
Rowend_dateyearmonthday
12016-08-31 20:49:00 UTC2016831
22016-08-31 15:19:00 UTC2016831
32016-08-31 22:12:00 UTC2016831
42016-09-04 17:10:00 UTC201694
52016-09-02 14:19:00 UTC201692

Extract Time: hour, minute and second

SELECT end_date,
  EXTRACT(hour FROM end_date) AS hour,
  EXTRACT(minute FROM end_date) AS minute,
  EXTRACT(second FROM end_date) AS second
FROM
  `bigquery-public-data.london_bicycles.cycle_hire`
  WHERE
  end_date IS NOT NULL
LIMIT 5;
Rowend_datehourminutesecond
12016-08-31 20:49:00 UTC20490
22016-08-31 15:19:00 UTC15190
32016-08-31 22:12:00 UTC22120
42016-09-04 17:10:00 UTC17100
52016-09-02 14:19:00 UTC14190

Extract day_of_week, week_of_year

SELECT end_date,
  EXTRACT(DAYOFWEEK FROM end_date) AS day_of_week,
  EXTRACT(WEEK FROM end_date) AS week_of_year
FROM
  `bigquery-public-data.london_bicycles.cycle_hire`
  WHERE
  end_date IS NOT NULL
LIMIT 5;
Rowend_dateday_of_weekweek_of_year
12016-08-31 20:49:00 UTC435
22016-08-31 15:19:00 UTC435
32016-08-31 22:12:00 UTC435
42016-09-04 17:10:00 UTC136
52016-09-02 14:19:00 UTC635

Convert String to Datetime Format


To convert a string to a datetime format in BigQuery, you can use the FORMAT_TIMEPARSE_TIMESTAMP or CAST function, depending on the format of your string.

If you are not familiar with the datetime format element, the page is helpful.

FORMAT_TIME

SELECT FORMAT_TIME("%R", TIME "13:15:00") as formatted_time;
/*----------------*
 | formatted_time |
 +----------------+
 | 13:15          |
 *----------------*/

SELECT FORMAT_DATE("%b-%d-%Y", DATE "2024-01-25") AS formatted;
/*-------------*
 | formatted   |
 +-------------+
 | Jan-25-2024 |
 *-------------*/

SELECT FORMAT_DATETIME("%c", DATETIME "2024-01-25 15:30:00") AS formatted_datetime;

/*--------------------------*
 | formatted_datetime       |
 +--------------------------+
 | Thu Jan 25 15:30:00 2024 |
 *--------------------------*/

PARSE_TIME

If the string is in a standard format such as ‘YYYY-MM-DD HH:MM:SS’, we can use PARSE_TIMESTAMP;

If the string is NOT in a standard format, we can use CAST;

  • If we don’t specify a time component when using PARSE_TIMESTAMP in BigQuery, the default time will be set to ’00:00:00′ (midnight) for the resulting timestamp.
  • In Unix-like systems, including BigQuery, the epoch is conventionally set at midnight (00:00:00) on January 1, 1970, UTC, commonly known as the “Unix epoch.” In instances where a date isn’t specified in the input string, it defaults to interpreting the provided time as elapsed since the Unix epoch (January 1, 1970).
SELECT  PARSE_TIMESTAMP('%Y-%m-%d %H:%M:%S', '2024-02-25 10:30:00') AS parsed_datetime;
/*--------------------------*
 | parsed_datetime          |
 +--------------------------+
 | 2024-02-25 10:30:00 UTC  |
 *--------------------------*/

SELECT PARSE_TIMESTAMP('%Y-%m-%d' , '2024-02-25') AS parsed_datetime;
/*--------------------------*
 | parsed_datetime          |
 +--------------------------+
 | 2024-02-25 00:00:00 UTC  |
 *--------------------------*/

SELECT PARSE_TIMESTAMP('%H:%M:%S', '10:30:00') AS parsed_datetime;
/*--------------------------*
 | parsed_datetime          |
 +--------------------------+
 | 1970-01-01 10:30:00 UTC  |
 *--------------------------*/

CAST

CAST() is just like the PARSE_TIME()

SELECT CAST('2024-01-28' AS TIMESTAMP) AS date_time;
/*--------------------------*
 | date_time                |
 +--------------------------+
 | 2024-01-28 00:00:00 UTC  |
 *--------------------------*/

SELECT CAST('2024-01-28 12:00:05' AS TIMESTAMP) AS date_time;
/*--------------------------*
 | date_time                |
 +--------------------------+
 | 2024-01-28 12:00:05 UTC  |
 *--------------------------*/

TIME

The other way to build date time element is to use TIME() function. We can construct a datetime element by assigning value to each component.

1. TIME(hour, minute, second)
2. TIME(timestamp, [time_zone])
3. TIME(datetime)
SELECT
  TIME(09, 45, 00) as time_hms,   # assign component with int value
  TIME(TIMESTAMP "2020-11-11 10:11:00+08", "Europe/Berlin") as time_tstz  # construct with timestamp and timezone
  TIME(DATETIME "2022-1-15 10:11:00.000000") AS time_dt;                  # construct with datetime

/*----------+-----------+------------*
 | time_hms | time_tstz ||time_dt    |
 +----------+-----------+------------+
 | 09:45:00 | 03:11:00  |10:11:00    |
 *----------+-----------+------------*/

There are more other datetime related functions in BigQuery, please read the post