I'm trying to use Flink CDC to capture data changes from MySQL and update a Hudi table in S3. My PyFlink job looks like this:
from pyflink.common import Configuration
from pyflink.datastream import StreamExecutionEnvironment
from pyflink.table import EnvironmentSettings, StreamTableEnvironment

config = Configuration()  # stands in for the job configuration built earlier
env = StreamExecutionEnvironment.get_execution_environment(config)
env.set_parallelism(1)
settings = EnvironmentSettings.new_instance().in_streaming_mode().build()
t_env = StreamTableEnvironment.create(env, environment_settings=settings)
t_env.execute_sql(f"""
CREATE TABLE source (
...
) WITH (
'connector' = 'mysql-cdc',
...
)
""")
t_env.execute_sql(f"""
CREATE TABLE target (
...
) WITH (
'connector' = 'hudi',
'path' = 's3a://xxx/xx/xx',
'table.type' = 'COPY_ON_WRITE',
'write.precombine.field' = 'id',
'write.operation' = 'upsert',
'hoodie.datasource.write.recordkey.field' = 'id',
'hoodie.datasource.write.partitionpath.field' = '',
'write.tasks' = '1',
'compaction.tasks' = '1',
'compaction.async.enabled' = 'true',
'hoodie.table.version' = 'SIX',
'hoodie.write.table.version' = '6'
)
""")
t_env.execute_sql(f"""
INSERT INTO target
SELECT * FROM source
""")
But when I submit this job to Flink, it returns an error:
org.apache.hudi.exception.HoodieLockException: Unsupported scheme :s3a, since this fs can not support atomic creation
What does this error mean? Does it mean S3 is not supported as the sink in this situation, i.e. that Flink cannot upsert data into S3?
Comments:

s3: instead of s3a:? – John Rotenstein

I tried s3:// instead of s3a://, but this raises another error saying that s3:// is an unknown scheme. – Rinze

1 Answer

Resolved by setting 'hoodie.write.lock.provider' = 'org.apache.hudi.client.transaction.lock.InProcessLockProvider'.
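For illustration only, here is a minimal sketch of how that option could be added to the target table DDL; the column list and S3 path below are placeholders, not the asker's real values. The error appears to come from Hudi's filesystem-based lock provider, which relies on atomic file creation that s3a does not guarantee, so switching to the in-process lock provider sidesteps that check.

t_env.execute_sql("""
CREATE TABLE target (
    id BIGINT,
    name STRING,
    PRIMARY KEY (id) NOT ENFORCED
) WITH (
    'connector' = 'hudi',
    'path' = 's3a://bucket/prefix/table',   -- placeholder path
    'table.type' = 'COPY_ON_WRITE',
    'write.precombine.field' = 'id',
    'write.operation' = 'upsert',
    'hoodie.datasource.write.recordkey.field' = 'id',
    -- keep locking inside the Flink JVM instead of on the filesystem
    'hoodie.write.lock.provider' = 'org.apache.hudi.client.transaction.lock.InProcessLockProvider'
)
""")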