Mastering datetime processing in analysis is imperative for extracting meaningful insights from temporal data. It enables analysts to conduct temporal analysis, integrate datasets based on time intervals, and engineer features essential for predictive modeling. Additionally, datetime processing is vital for data preprocessing tasks such as cleaning, aligning timestamps, and ensuring data consistency.
It facilitates time-based queries, enabling analysts to filter and query data effectively, while also supporting the creation of informative visualizations for communicating insights. Ultimately, proficiency in datetime processing empowers analysts to make informed decisions based on historical trends and future projections, driving better outcomes across various domains through accurate predictions and deeper understanding of temporal relationships within the data.
I’d like to use the “google public dataset-london bike” dataset to show the examples.
Extract Date and Time
In analysis, direct utilization of the end_date and start_date is not employed. Rather, it is imperative to extract specific components of the date and time from them, such as the day, hour, and year, for effective analysis.
Extract Date: year, month and day
SELECT end_date, EXTRACT(YEAR FROM end_date) AS year, EXTRACT(MONTH FROM end_date) AS month, EXTRACT(DAY FROM end_date) AS day FROM `bigquery-public-data.london_bicycles.cycle_hire` WHERE end_date IS NOT NULL LIMIT 5;
Row | end_date | year | month | day |
---|---|---|---|---|
1 | 2016-08-31 20:49:00 UTC | 2016 | 8 | 31 |
2 | 2016-08-31 15:19:00 UTC | 2016 | 8 | 31 |
3 | 2016-08-31 22:12:00 UTC | 2016 | 8 | 31 |
4 | 2016-09-04 17:10:00 UTC | 2016 | 9 | 4 |
5 | 2016-09-02 14:19:00 UTC | 2016 | 9 | 2 |
Extract Time: hour, minute and second
SELECT end_date, EXTRACT(hour FROM end_date) AS hour, EXTRACT(minute FROM end_date) AS minute, EXTRACT(second FROM end_date) AS second FROM `bigquery-public-data.london_bicycles.cycle_hire` WHERE end_date IS NOT NULL LIMIT 5;
Row | end_date | hour | minute | second |
---|---|---|---|---|
1 | 2016-08-31 20:49:00 UTC | 20 | 49 | 0 |
2 | 2016-08-31 15:19:00 UTC | 15 | 19 | 0 |
3 | 2016-08-31 22:12:00 UTC | 22 | 12 | 0 |
4 | 2016-09-04 17:10:00 UTC | 17 | 10 | 0 |
5 | 2016-09-02 14:19:00 UTC | 14 | 19 | 0 |
Extract day_of_week, week_of_year
SELECT end_date, EXTRACT(DAYOFWEEK FROM end_date) AS day_of_week, EXTRACT(WEEK FROM end_date) AS week_of_year FROM `bigquery-public-data.london_bicycles.cycle_hire` WHERE end_date IS NOT NULL LIMIT 5;
Row | end_date | day_of_week | week_of_year |
---|---|---|---|
1 | 2016-08-31 20:49:00 UTC | 4 | 35 |
2 | 2016-08-31 15:19:00 UTC | 4 | 35 |
3 | 2016-08-31 22:12:00 UTC | 4 | 35 |
4 | 2016-09-04 17:10:00 UTC | 1 | 36 |
5 | 2016-09-02 14:19:00 UTC | 6 | 35 |
Convert String to Datetime Format
To convert a string to a datetime format in BigQuery, you can use the FORMAT_TIME
,PARSE_TIMESTAMP
or CAST
function, depending on the format of your string.
If you are not familiar with the datetime format element, the page is helpful.
FORMAT_TIME
SELECT FORMAT_TIME("%R", TIME "13:15:00") as formatted_time; /*----------------* | formatted_time | +----------------+ | 13:15 | *----------------*/ SELECT FORMAT_DATE("%b-%d-%Y", DATE "2024-01-25") AS formatted; /*-------------* | formatted | +-------------+ | Jan-25-2024 | *-------------*/ SELECT FORMAT_DATETIME("%c", DATETIME "2024-01-25 15:30:00") AS formatted_datetime; /*--------------------------* | formatted_datetime | +--------------------------+ | Thu Jan 25 15:30:00 2024 | *--------------------------*/
PARSE_TIME
If the string is in a standard format such as ‘YYYY-MM-DD HH:MM:SS’, we can use PARSE_TIMESTAMP
;
If the string is NOT in a standard format, we can use CAST
;
- If we don’t specify a time component when using
PARSE_TIMESTAMP
in BigQuery, the default time will be set to ’00:00:00′ (midnight) for the resulting timestamp.
- In Unix-like systems, including BigQuery, the epoch is conventionally set at midnight (00:00:00) on January 1, 1970, UTC, commonly known as the “Unix epoch.” In instances where a date isn’t specified in the input string, it defaults to interpreting the provided time as elapsed since the Unix epoch (January 1, 1970).
SELECT PARSE_TIMESTAMP('%Y-%m-%d %H:%M:%S', '2024-02-25 10:30:00') AS parsed_datetime; /*--------------------------* | parsed_datetime | +--------------------------+ | 2024-02-25 10:30:00 UTC | *--------------------------*/ SELECT PARSE_TIMESTAMP('%Y-%m-%d' , '2024-02-25') AS parsed_datetime; /*--------------------------* | parsed_datetime | +--------------------------+ | 2024-02-25 00:00:00 UTC | *--------------------------*/ SELECT PARSE_TIMESTAMP('%H:%M:%S', '10:30:00') AS parsed_datetime; /*--------------------------* | parsed_datetime | +--------------------------+ | 1970-01-01 10:30:00 UTC | *--------------------------*/
CAST
CAST()
is just like the PARSE_TIME()
SELECT CAST('2024-01-28' AS TIMESTAMP) AS date_time; /*--------------------------* | date_time | +--------------------------+ | 2024-01-28 00:00:00 UTC | *--------------------------*/ SELECT CAST('2024-01-28 12:00:05' AS TIMESTAMP) AS date_time; /*--------------------------* | date_time | +--------------------------+ | 2024-01-28 12:00:05 UTC | *--------------------------*/
TIME
The other way to build date time element is to use TIME() function. We can construct a datetime element by assigning value to each component.
1. TIME(hour, minute, second) 2. TIME(timestamp, [time_zone]) 3. TIME(datetime)
SELECT TIME(09, 45, 00) as time_hms, # assign component with int value TIME(TIMESTAMP "2020-11-11 10:11:00+08", "Europe/Berlin") as time_tstz # construct with timestamp and timezone TIME(DATETIME "2022-1-15 10:11:00.000000") AS time_dt; # construct with datetime /*----------+-----------+------------* | time_hms | time_tstz ||time_dt | +----------+-----------+------------+ | 09:45:00 | 03:11:00 |10:11:00 | *----------+-----------+------------*/
There are more other datetime related functions in BigQuery, please read the post