
pandas - Pyspark - Can not merge type <class 'pyspark.sql.types.StringType'> and <class …


I'm converting a pandas DataFrame to a Spark DataFrame, but the conversion fails with:

Can not merge type <class 'pyspark.sql.types.StringType'> and <class 'pyspark.sql.types.DoubleType'>

I can let Spark infer the schema and then convert the types, but one of my columns is an array type and I don't want that column to be inferred. Is there a way to convert one particular column (Id) to double and leave the other columns untouched?

 |-- Id: string (nullable = true)
 |-- Field: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- key: string (nullable = true)
 |    |    |-- value: string (nullable = true)

asked Mar 9 at 8:22, edited Mar 9 at 8:36 by Jim Macaulay

2 Answers

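Defining the Field column explicitly as ArrayType(MapType(StringType(), StringType())) resolved the issue:

from pyspark.sql.types import ArrayType, MapType, StringType, StructField, StructType

# Id stays a string; Field is an array of string-to-string maps
schema = StructType([
    StructField('Id', StringType(), True),
    StructField('Field', ArrayType(MapType(StringType(), StringType())), True)])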

"Is there a way to infer particular column (Id) alone to double and remain other columns untouched."

To get Id as a double, declare it with DoubleType in the schema, as below. The rest of the schema stays the same as what you already have.

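from pyspark.sql.types import ArrayType, DoubleType, MapType, StringType, StructField, StructType

# Only Id changes to double; Field keeps its explicit array-of-maps type
schema = StructType([
    StructField('Id', DoubleType(), True),
    StructField('Field', ArrayType(MapType(StringType(), StringType())), True)])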
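Note that DoubleType expects the Id values to be floats already. If the pandas column holds a mix of strings and floats (which is an assumption about the data here, though it would match the "Can not merge type" message), one option is to coerce that single column on the pandas side before converting. A rough sketch with made-up sample rows:

```python
import pandas as pd

# Made-up sample data: Id mixes strings, floats and missing values.
pdf = pd.DataFrame({
    "Id": ["1.5", 2.0, None],
    "Field": [[{"key": "a", "value": "b"}], [], None],
})

# Coerce only the Id column to float64; values that cannot be parsed become NaN.
pdf["Id"] = pd.to_numeric(pdf["Id"], errors="coerce")

# The other columns are left untouched.
print(pdf.dtypes)
```

After this step, every value in Id is a float, so the column lines up with the DoubleType field in the schema above.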