
Spark Streaming - Delta Live Tables - can't update


Objective

I plan to use Delta Live Tables (DLT) to deliver near real-time reporting in Power BI.

Current Setup

  1. I load the Bronze Delta tables every minute using Fivetran.

  2. These Bronze tables serve as the source for creating the Silver Delta tables.

  3. Incoming data from Fivetran contains inserts, updates, and logical deletes:

     • Deletes are not physical; instead, a boolean column (True/False) indicates whether a row is deleted.

     • The most frequent updates occur in this deletion-flag column.

Issue

When running my DLT pipeline, I receive the following error:

Detected an update in the source table at version 317. This is currently not supported.

Questions & Possible Solutions

Is there a workaround for this issue?
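
For context on why this happens: a streaming read of a Delta table assumes the source is append-only, so any commit that rewrites existing rows (such as the Fivetran updates to the deletion flag) aborts the stream. One documented workaround on newer Databricks runtimes is the skipChangeCommits read option, though note it silently skips those update commits rather than propagating them, so it would not fit a use case where the updates themselves matter. A minimal sketch, reusing the table name from the answer below:

df = (
    spark.readStream
    .option("skipChangeCommits", "true")  # ignore commits that update/delete rows
    .table("lakehouse_poc.yms_oracle_tms.activity")
)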

Can I use a MERGE INTO statement instead of APPLY CHANGES?
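
MERGE INTO cannot replace APPLY CHANGES inside a DLT flow, but outside DLT the same upsert can be expressed with the Delta Lake Python API in a foreachBatch sink. A minimal sketch, not the author's setup; the target table name and checkpoint path are illustrative assumptions:

from delta.tables import DeltaTable

def upsert_batch(microbatch_df, batch_id):
    # Drop CDF pre-images so only the final state of each row is merged.
    src = microbatch_df.filter("_change_type != 'update_preimage'")
    (DeltaTable.forName(spark, "silver.tms_activity").alias("t")  # illustrative target
     .merge(src.alias("s"), "t.activity_seq = s.activity_seq")
     .whenMatchedUpdateAll()
     .whenNotMatchedInsertAll()
     .execute())

(spark.readStream
 .option("readChangeFeed", "true")
 .table("lakehouse_poc.yms_oracle_tms.activity")
 .writeStream
 .foreachBatch(upsert_batch)
 .option("checkpointLocation", "/tmp/checkpoints/tms_activity")  # illustrative path
 .start())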

Alternative Approach:

Instead of using the Bronze Delta table directly, I could have Fivetran load files into a storage account.

Then, I could use Auto Loader to read those files and create the DLT Bronze tables on top of them.

Would this approach help in resolving the issue?
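
For illustration, a minimal sketch of what that Auto Loader Bronze table could look like in DLT; the landing path and file format below are assumptions, not from the original post:

import dlt

@dlt.table
def tms_activity_bronze():
    return (
        spark.readStream
        .format("cloudFiles")                    # Auto Loader source
        .option("cloudFiles.format", "parquet")  # assumption: Fivetran lands Parquet
        .load("abfss://landing@<storage-account>.dfs.core.windows.net/tms/activity/")  # illustrative path
    )

Because landed files are append-only, the downstream stream never sees update commits, which would avoid the error; the updates then get reconciled in Silver.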


asked Feb 6 at 9:14 by play_something_good
  • Any workarounds you tried that ran into blockers? Please add those details. – JayashankarGS Commented Feb 7 at 9:23
  • Also, where exactly are you getting the error: in the Silver table or the Bronze? – JayashankarGS Commented Feb 7 at 9:26
  • I added the answer that works for me. – play_something_good Commented Feb 7 at 9:28

1 Answer

import dlt

# Bronze view: stream the change data feed (CDF) of the Fivetran-managed
# Delta table, renaming the CDF metadata columns to avoid clashes downstream.
@dlt.view
def vw_tms_activity_bronze():
    return (
        spark.readStream
        .option("readChangeFeed", "true")
        .table("lakehouse_poc.yms_oracle_tms.activity")
        .withColumnRenamed("_change_type", "_change_type_bronze")
        .withColumnRenamed("_commit_version", "_commit_version_bronze")
        .withColumnRenamed("_commit_timestamp", "_commit_timestamp_bronze")
    )

# Silver table: declare the streaming target, then let APPLY CHANGES upsert
# rows by key, ordered by the Fivetran sync timestamp (SCD type 1).
dlt.create_streaming_table("tg_tms_activity_silver")

dlt.apply_changes(
    target="tg_tms_activity_silver",
    source="vw_tms_activity_bronze",
    keys=["activity_seq"],
    sequence_by="_fivetran_synced",
    stored_as_scd_type=1,
)

The above worked for me: read the change data feed (CDF) from the Bronze table, then apply those changes to the Silver streaming table with apply_changes.
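
One prerequisite worth noting: reading with readChangeFeed requires the change data feed to be enabled on the source table. A minimal sketch of the documented Delta table property, run once against the Bronze source:

spark.sql("""
    ALTER TABLE lakehouse_poc.yms_oracle_tms.activity
    SET TBLPROPERTIES (delta.enableChangeDataFeed = true)
""")

And since the source rows carry a logical-delete flag, apply_changes can also turn those into real deletes on the Silver table via its apply_as_deletes parameter, e.g. apply_as_deletes = expr("is_deleted = true") (the column name here is illustrative, not from the original post).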
