-
Notifications
You must be signed in to change notification settings - Fork 44
Open
Description
Snapshot with hard_deletes = 'new_record' should create a new record with flag dbt_is_deleted='True' when a record is deleted in the source. The new record is created, unfortunately, on each subsequent snapshot run it is created again and again.
In our daily jobs, all deleted records are created as new records every day and it is bloating the snapshot table.
How to reproduce
{% snapshot test_history %}
{{
config(
target_schema='dbo',
strategy='check',
check_cols = 'all',
unique_key='txt',
hard_deletes = 'new_record'
)
}}
SELECT 'A' as txt
UNION ALL
SELECT 'B' as txt
{% endsnapshot %}
Now run snapshot.
Remove the record B
{% snapshot test_history %}
{{
config(
target_schema='dbo',
strategy='check',
check_cols = 'all',
unique_key='txt',
hard_deletes = 'new_record'
)
}}
SELECT 'A' as txt
{% endsnapshot %}
Now run the snapshot twice and check result. Each time you run it, a record is created.
##solution
I think in helpers.sql macro fabric__snapshot_staging_table , the deleted records should be excluded in snapshotted_data CTE e.g. there should be
{%- if strategy.hard_deletes == 'new_record' %}
where {{ columns.dbt_is_deleted }} = 'False'
{% endif -%}
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
No labels