Skip to content

How to compare two tables in BigQuery?

While we are doing the data health check before starting any analysis, it is very often to verify whether the two tables are identical. If not, we need to find out the difference and perhaps explain why later.

So, when we compare two tables, what are we exactly looking for?

  1. The records in table1  and not in table2
  2. The records in table2  and not in table1 

This sounds very simple, right? If the records both exist in table1 and table2, then they are identical.

Code

Please notice, table1 and table2 must with the same width and columns in the same order and type.

(
  SELECT * FROM table1
  EXCEPT DISTINCT 
  SELECT * from table2
)

UNION ALL

(
  SELECT * FROM table2
  EXCEPT DISTINCT
  SELECT * from table1
)