Skip to content

Transform Data with DML in BigQuery

Learning Data Manipulation Language (DML) in SQL is crucial for anyone working with databases, as it provides the ability to interact with and modify data effectively. DML commands such as INSERT, UPDATE, DELETE, and MERGE allow users to add new records, update existing ones, or remove unwanted data from tables.

Understanding DML is essential for maintaining data integrity and ensuring the accuracy and consistency of information within the database. Furthermore, DML proficiency is fundamental for data transformation and data cleansing tasks, enabling users to manipulate data to meet specific requirements or standards.

CREATE table

The CREATE OR REPLACE TABLE statement, though typically associated with DDL, merits inclusion here due to its close association with DML operations.

It guides the action: If the table tutorial.personnel does not exist, then create a new one; If the table exists, then update it. Then we need to declare the table’s Schema, which contains column name and data type.

CREATE OR REPLACE TABLE tutorial.personnel (
  name STRING,
  age INT64,
  salary FLOAT64,
  is_Male BOOLEAN
);
Field nameTypeMode
nameSTRINGNULLABLE
ageINTEGERNULLABLE
salaryFLOATNULLABLE
is_MaleBOOLEANNULLABLE

INSERT data

INSERT INTO tutorial.personnel(name, age, salary, is_male) VALUES
  ('John', 30, 50000.00, True),
  ('Alice', 25, 60000.00,False),
  ('Bob', 35, NULL,True),
  ('Emma', NULL, 70000.00,True),
  (NULL, 28, 55000.00,True),
  ('Michael', 40, NULL,False),
  ('Sara', 33, 65000.00,False),
  (NULL, NULL, NULL,NULL),
  ('David', 45, 75000.00,True),
  ('Sophia', 27, 58000.00,False);

#This statement added 10 rows to personnel.
Rownameagesalaryis_Male
1nullnullnullnull
2Sara3365000.0false
3Michael40nullfalse
4Sophia2758000.0false
5Alice2560000.0false
6John3050000.0true
7null2855000.0true
8Bob35nulltrue
9David4575000.0true
10Emmanull70000.0true

You also can combine the WITH clause with INSERT SELECT. For example

# create a new table called fruit
CREATE OR REPLACE TABLE tutorial.personnel (
  fruit_name STRING,
  quantity INT64
);

# insert data with WITH clause
INSERT tutorial.fruit (fruit_name, quantity)
WITH fruit_shop AS (
  SELECT ARRAY<STRUCT<fruit_name string, quantity integer>>
      [('#1 Apple', 500),
       ('#2 Ananas', 300),
       ('#3 Grape', 1000),
       ('#4 Banana', 800)
       ] col
)
SELECT fruit_name, quantity FROM fruit_shop, UNNEST(fruit_shop.col)
Rowfruit_namequantity
1#3 Grape1000
2#1 Apple500
3#4 Banana800
4#2 Ananas300

UPDATE table

We can use UPDATE to update data in a row, in column or the column itself based on specific conditions. That’s why UPDATE always combines with SET and WHERE.

Update data in column

Let’s review the personnel table. In the 3rd row, Michael is identified as a man, yet the ‘is_male’ column is marked as false. Conversely, in the 10th row, Emma is recognized as a woman, yet the ‘is_male’ column is set to true. We must correct these discrepancies.

UPDATE instruction-415216.tutorial.personnel    
SET is_male = True
WHERE name = 'Michael';

UPDATE instruction-415216.tutorial.personnel
SET is_male = FALSE
WHERE name = 'Emma';

# verify the result
SELECT name, is_male
FROM `instruction-415216.tutorial.personnel`
WHERE name IN ('Michael','Emma');
Rownameis_male
1Emmafalse
2Michaeltrue

Update data in row

We noticed that the first row has no values. We can update information in this row.

UPDATE `instruction-415216.tutorial.personnel`
SET name = 'Ivy', age = 25, salary = 60000, is_Male = False   
WHERE name is NULL and age is NULL;      # specify the conditions

# check the result 
SELECT * FROM `instruction-415216.tutorial.personnel`
WHERE name = 'Ivy'
Rownameagesalaryis_Male
1Ivy2560000.0false

MERGE table

MERGE statement is a DML statement that can combine INSERTUPDATE, and DELETE operations into a single statement and perform the operations atomically.

MERGE tutorial.personnel AS target
USING (
  SELECT 
    name,
    age,
    IFNULL(salary, 10000.00) AS salary,        # Use default salary of 10000 if salary is NULL
    is_male
  FROM tutorial.personnel
) AS source
ON target.name = source.name AND target.age IS NOT NULL
WHEN MATCHED THEN
  UPDATE SET
    salary = source.salary;
Rownameagesalaryis_Male
6Michael4010000.0true
7Bob3510000.0true

DELETE data from table

You can delete rows based on specific conditions with DELETE clause. You need to declare the condition, too.
1. For example, I want to delete rows that salary is under 20000.

DELETE `instruction-415216.tutorial.personnel`
WHERE salary < 20000
Rownameagesalaryis_Male
1Emmanull70000.0false
2Alice2560000.0false
3Ivy2560000.0false
4Sophia2758000.0false
5Sara3365000.0false
6null2855000.0true
7John3050000.0true
8David4575000.0true

2. What if I want to remove all rows containing missing value( null)?

DELETE `instruction-415216.tutorial.personnel`
WHERE name IS NULL OR salary IS NULL OR age IS NULL
Rownameagesalaryis_Male
1Ivy2560000.0false
2Alice2560000.0false
3Sophia2758000.0false
4Sara3365000.0false
5John3050000.0true
6David4575000.0true

3. Remove all rows

If you want to remove all rows in a table, you can use TRUNCATE TABLE .

The TRUNCATE TABLE statement removes all rows from a table but leaves the table metadata intact, including the table schema, description, and labels.

TRUNCATE TABLE  `instruction-415216.tutorial.personnel`
  • If you check the preview, it shows that there is no data to display.
  • If you check the Schema, it still remains.