
hive - Spark overwrite table, getting data loss when terminated at insertion stage


Objective:

We need to read a table in a Spark application, transform the data, and rewrite the same table.

Scenario:

I am trying to overwrite an external, non-partitioned table with Spark.

Since reading and writing the same data in one pass is not possible, we are using checkpointing to break the dependency.
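As a rough sketch of the checkpoint approach described above (the table name, column, and checkpoint directory are hypothetical; Spark 2.x+ with Hive support is assumed):

```scala
import org.apache.spark.sql.SparkSession

object OverwriteSameTable {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("overwrite-same-table")
      .enableHiveSupport()
      .getOrCreate()

    // checkpoint() needs a reliable directory, typically on HDFS
    spark.sparkContext.setCheckpointDir("/tmp/spark-checkpoints")

    val source      = spark.table("db.events")          // hypothetical table
    val transformed = source.filter("amount > 0")       // hypothetical transform

    // Eagerly materialize the result and truncate the lineage, so the
    // write below no longer reads from the files it is about to replace.
    val materialized = transformed.checkpoint()

    materialized.write
      .mode("overwrite")
      .insertInto("db.events")
  }
}
```

Note that checkpointing only removes the read/write conflict; it does not make the overwrite itself crash-safe, which is the problem observed below.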

We have observed that if the application is terminated during the Spark insertion job, the data in the original table is already deleted before the revised data is inserted. A failure in the middle of the job therefore loses the entire table.

Our understanding is that Spark first deletes the existing data and only then writes the modified data being inserted.

Is there a workaround to prevent this data loss, or what is the best approach to read, transform, and write the same table with Spark?
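One commonly suggested mitigation, not taken from the question itself, is to stage the transformed data in a separate table so the original data survives a mid-job failure. A minimal sketch, with all table names hypothetical:

```scala
// Assumes the same SparkSession as above.
val transformed = spark.table("db.events").filter("amount > 0")

// Step 1: persist the transformed data to a staging table. If the job
// dies here, db.events is untouched.
transformed.write
  .mode("overwrite")
  .saveAsTable("db.events_staging")

// Step 2: replace the original from the staging copy. If this step dies
// midway, the data is still recoverable from db.events_staging.
spark.sql("INSERT OVERWRITE TABLE db.events SELECT * FROM db.events_staging")
```

The point of the design is that the destructive overwrite only runs after a complete, durable copy of the new data exists, so no single failure can destroy both copies.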
