I'm using deltalake version 0.17.0. Here are the steps we perform:
- Read the DeltaTable from the existing S3 location: dt = DeltaTable("s3://mylocation/")
- Convert it to a pyarrow table: arrow_table = dt.to_pyarrow_table()
- Filter the arrow table and select specific columns of interest
- Convert the arrow table to a pandas DataFrame: df = arrow_table.to_pandas()
- Write the pandas DataFrame back to an existing, new Delta table. The table is empty at this point and has a schema defined with non-nullable fields.
- write_deltalake("s3://test_sample_process/", df, mode="overwrite"); also tried it with schema_mode="overwrite". (A consolidated sketch of these steps follows below.)
The error we get is:
raise ValueError(
ValueError: Schema of data does not match table schema
Data schema:
namespace: string
ki_record_name: string
wk_center: string
kt_config: string
kt_parameters: string
mi_updated_at: timestamp[us, tz=UTC]
mi_updated_by: string
Table Schema:
namespace: string
ki_record_name: string
wk_center: string not null
kt_config: string
kt_parameters: string
mi_updated_at: timestamp[us, tz=UTC] not null
-- field metadata --
comment: '"The time this record was updated"'
mi_updated_by: string not null
-- field metadata --
comment: '"The process that updated this record"'
We verified that the data frame we are trying to write does NOT contain any null values. It has only 2 rows, so a visual inspection was easy. We also posted the same question on the Delta table GitHub repository but did not receive any helpful suggestions. The Delta table uses the pyarrow engine by default in the current version, and the recommendation was to migrate off it. We could try that, but this should work in the current version, which still supports the pyarrow engine.

The same code works when we drop the schema; in that case, the Delta table creates a schema with all nullable fields. I want to enforce/use non-nullable fields, and I am not able to understand why this is failing.
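To make the intent concrete, this is roughly how I would expect to carry the target table's own schema (non-nullable flags and field metadata) into the write, by rebuilding the Arrow table against that exact schema instead of letting pandas infer one. This is a hedged sketch of what we could try, not something we have confirmed resolves the error:

```python
import pyarrow as pa
from deltalake import DeltaTable, write_deltalake

# Schema of the existing (empty) target table, including its
# non-nullable fields and field metadata
target = DeltaTable("s3://test_sample_process/")
target_schema = target.schema().to_pyarrow()

# df is the pandas DataFrame produced in the steps above.
# Rebuild the Arrow table from it against the table's exact schema;
# pyarrow raises here if a non-nullable column actually contains nulls.
data = pa.Table.from_pandas(df, schema=target_schema, preserve_index=False)

write_deltalake("s3://test_sample_process/", data, mode="overwrite")
```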