Skip to content

Snapshot timestamp columns misaligned #221

@dberryunum

Description

@dberryunum

Describe the bug

In macro dbt/include/teradata/macros/materializations/snapshot/helpers.sql, there is a column alignment issue that could result in incorrect snapshot data.

This is the SQL that pulls snapshot updates (similar also for inserts):

 updates_source_data as (

        select
            snapshot_query.*,
            {{ unique_key_fields(strategy.unique_key) }},
            {{ strategy.updated_at }} as {{ columns.dbt_updated_at }},
            {{ strategy.updated_at }} as {{ columns.dbt_valid_from }},
            {{ strategy.updated_at }} as {{ columns.dbt_valid_to }}

        from snapshot_query

Note the order of the dbt_updated_at and dbt_updated_valid_from columns.

This is the SQL that pulls in deletes:

deletes as (

        select
            'delete' as dbt_change_type,
            source_data.*,
            {{ snapshot_get_time() }} as {{ columns.dbt_valid_from }},
            {{ snapshot_get_time() }} as {{ columns.dbt_updated_at }},
            {{ snapshot_get_time() }} as {{ columns.dbt_valid_to }},
            snapshotted_data.{{ columns.dbt_scd_id }}
            {%- if strategy.hard_deletes == 'new_record' -%}
            , snapshotted_data.{{ columns.dbt_is_deleted }}
          {%- endif %}
        from snapshotted_data
        left join deletes_source_data as source_data
            on {{ unique_key_join_on(strategy.unique_key, "snapshotted_data", "source_data") }}
            where {{ unique_key_is_null(strategy.unique_key, "source_data") }}
    )

Notice that here the order of the dbt_updated_at and dbt_updated_valid_from is swapped.

Ultimately, when these datasets get combined together here, the columns are misaligned, causing a silent logical error:

select * from insertions
    union all
    select * from updates
    {%- if strategy.hard_deletes == 'invalidate' or strategy.hard_deletes == 'new_record' %}
    union all
    select * from deletes
    {%- endif %}
    {%- if strategy.hard_deletes == 'new_record' %}
    union all
    select * from deletion_records
    {%- endif %}

Steps To Reproduce

Create a snapshot with the following config option:

hard_deletes: new_record

Then run a build or snapshot command.

If you trace through the compiled SQL generated in the log, you will see that columns are out of order.

Expected behavior

All columns in the datasets that are being combined with the union all in helpers.sql should be aligned, such that all every source column value is populated accurately in the target.

Screenshots and log output

There is no log message related to this as it is a silent error, but if you trace the compiled SQL in the log you can see the issue.

The output of dbt --version:

Core:
  - installed: 1.9.1 
  - latest:    1.10.7 - Update available!

  Your version of dbt-core is out of date!
  You can find instructions for upgrading here:
  https://docs.getdbt.com/docs/installation

Plugins:
  - snowflake: 1.9.2 - Update available!
  - teradata:  1.9.2 - Update available!

The operating system you're using:

Windows 11

The output of python --version:

Python 3.12.6

Additional context

N/A

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions