Tips and Tricks #218: Implement Idempotent ETL with Merge Statements

Use MERGE or INSERT ... ON CONFLICT (upsert) to build rerunnable pipelines: applying the same load twice produces the same result instead of duplicate rows.

Code Snippet

-- PostgreSQL: INSERT ON CONFLICT (upsert)
INSERT INTO dim_customer (customer_id, name, email, updated_at)
SELECT customer_id, name, email, NOW()
FROM staging_customers
ON CONFLICT (customer_id) 
DO UPDATE SET
    name = EXCLUDED.name,
    email = EXCLUDED.email,
    updated_at = NOW()
WHERE dim_customer.name IS DISTINCT FROM EXCLUDED.name   -- NULL-safe change check
   OR dim_customer.email IS DISTINCT FROM EXCLUDED.email;

-- SQL Server / Snowflake: MERGE
MERGE INTO dim_customer AS target
USING staging_customers AS source
ON target.customer_id = source.customer_id
-- Note: != misses changes to or from NULL; on nullable columns consider a
-- NULL-safe comparison (IS DISTINCT FROM on Snowflake and SQL Server 2022+).
WHEN MATCHED AND (target.name != source.name OR target.email != source.email) THEN
    UPDATE SET name = source.name, email = source.email, updated_at = GETDATE()
WHEN NOT MATCHED THEN
    INSERT (customer_id, name, email, updated_at)
    VALUES (source.customer_id, source.name, source.email, GETDATE());
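
If staging_customers can hold more than one row per customer_id (common when the staging table accumulates a CDC feed), deduplicate the source first: PostgreSQL raises an error if ON CONFLICT would update the same target row twice, and MERGE raises a similar error when the source matches the same target row more than once. A minimal sketch, assuming a hypothetical load_ts ingestion-timestamp column for picking the latest row:

-- PostgreSQL: keep only the newest staging row per customer_id, then upsert.
-- load_ts is a hypothetical ingestion timestamp; use whatever ordering column
-- your staging table actually provides.
WITH deduped AS (
    SELECT customer_id, name, email,
           ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY load_ts DESC) AS rn
    FROM staging_customers
)
INSERT INTO dim_customer (customer_id, name, email, updated_at)
SELECT customer_id, name, email, NOW()
FROM deduped
WHERE rn = 1
ON CONFLICT (customer_id)
DO UPDATE SET
    name = EXCLUDED.name,
    email = EXCLUDED.email,
    updated_at = NOW();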

Why This Helps

  • Safe to rerun: the same load applied twice produces no duplicate rows
  • A single atomic statement instead of a delete-and-reload
  • Updates rows (and updated_at) only when column values actually change

How to Test

  • Run the pipeline twice against the same source data and verify no duplicate rows appear
  • Check row counts and updated_at timestamps after the second run (see the checks sketched below)
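
A couple of quick verification queries, sketched for PostgreSQL against the tables from the snippet above:

-- 1. The business key should still be unique after a rerun.
SELECT customer_id, COUNT(*) AS copies
FROM dim_customer
GROUP BY customer_id
HAVING COUNT(*) > 1;                                  -- expect zero rows

-- 2. A rerun with unchanged source data should not touch updated_at
--    (the 5-minute window is an arbitrary example; adjust to your schedule).
SELECT COUNT(*) AS recently_touched
FROM dim_customer
WHERE updated_at >= NOW() - INTERVAL '5 minutes';     -- expect 0 on a no-change rerun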

When to Use

Dimension table loads, CDC processing, and any ETL step that may be rerun.
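
For CDC feeds that also carry deletes, the MERGE form can fold the delete into the same statement. A sketch for SQL Server / Snowflake, assuming a hypothetical is_deleted flag on the staging table:

-- is_deleted is a hypothetical soft-delete flag carried by the CDC feed.
MERGE INTO dim_customer AS target
USING staging_customers AS source
ON target.customer_id = source.customer_id
WHEN MATCHED AND source.is_deleted = 1 THEN
    DELETE
WHEN MATCHED AND (target.name != source.name OR target.email != source.email) THEN
    UPDATE SET name = source.name, email = source.email, updated_at = GETDATE()
WHEN NOT MATCHED AND source.is_deleted = 0 THEN
    INSERT (customer_id, name, email, updated_at)
    VALUES (source.customer_id, source.name, source.email, GETDATE());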

Performance/Security Notes

Always include updated_at so you can audit when a row last changed. ON CONFLICT requires a unique constraint or index on customer_id, and the MERGE join benefits from the same index. If you need history rather than overwrite-in-place, consider SCD Type 2 (sketched below).
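
If you do need that historical trail, a common SCD Type 2 pattern is to close the current row and insert a new version instead of updating in place. A rough two-statement sketch for PostgreSQL, assuming hypothetical valid_from, valid_to, and is_current columns on dim_customer:

-- 1. Expire the current version of any customer whose attributes changed.
--    valid_from / valid_to / is_current are hypothetical SCD2 columns.
UPDATE dim_customer d
SET valid_to = NOW(),
    is_current = FALSE
FROM staging_customers s
WHERE d.customer_id = s.customer_id
  AND d.is_current
  AND (d.name IS DISTINCT FROM s.name OR d.email IS DISTINCT FROM s.email);

-- 2. Insert a new current version for changed and brand-new customers.
--    Rerunning is safe: customers that already have a current row are skipped.
INSERT INTO dim_customer (customer_id, name, email, valid_from, valid_to, is_current)
SELECT s.customer_id, s.name, s.email, NOW(), NULL, TRUE
FROM staging_customers s
LEFT JOIN dim_customer d
       ON d.customer_id = s.customer_id
      AND d.is_current
WHERE d.customer_id IS NULL;

Run both statements in one transaction so a failure between them cannot leave a customer without a current row.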

Try this tip in your next project and share your results in the comments!

