dataframe - Extract Column values from data frame and pass into SQL pyspark where Clause - Stack Overflow

I am trying to extract data from a backend into a DataFrame, retrieve the values of one column (for example, the "ID" column) as a plain list, and pass that list of ID values into a SQL query for another data extraction. I tried the line below, and it returns a list of Row objects like the one pasted here:

row_list = df.select('Column_header').collect()
Response:

    [Row(Column_header='Value1'), Row(Column_header='Value2')........]

What I would like to extract is like this:

[val1,val2,val3.....]

I tried using the RDD API with map and flatMap, but I still get syntax errors even when using what I believe is the correct format. Any help would be appreciated.

asked Mar 17 at 9:09 by Kirthi Shree; edited Mar 17 at 9:36 by Yash Mehta

1 Answer

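There's an extra step missing. collect() returns a list of Row objects, so you still need to pull the column value out of each Row: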

row_list = df.select('Column_header').collect()
result = [row['Column_header'] for row in row_list]
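
The question also asks how to pass the extracted values into the WHERE clause of a second query. One way to do that is sketched below, assuming the values are strings or numbers. The helper names `to_sql_literal` and `build_in_query` and the table name `other_table` are hypothetical, and `result` here is a hard-coded stand-in for the list built above so that the snippet runs without a Spark session.

```python
def to_sql_literal(value):
    """Render a Python value as a SQL literal.

    Numbers are emitted as-is; everything else is treated as a string,
    wrapped in single quotes, with embedded single quotes doubled.
    """
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"


def build_in_query(table, column, values):
    """Build a SELECT with a WHERE ... IN (...) filter (hypothetical helper)."""
    if not values:
        # An empty IN () list is not valid SQL, so return a query matching nothing.
        return f"SELECT * FROM {table} WHERE 1 = 0"
    literals = ", ".join(to_sql_literal(v) for v in values)
    return f"SELECT * FROM {table} WHERE {column} IN ({literals})"


# Stand-in for: result = [row['Column_header'] for row in row_list]
result = ["Value1", "Value2", "O'Brien"]

query = build_in_query("other_table", "ID", result)
print(query)

# With a live SparkSession bound to `spark` (an assumption), the query
# could then be executed with:
# df2 = spark.sql(query)
```

If the second data set is already a DataFrame, filtering with `df2.filter(df2["ID"].isin(result))` avoids building a SQL string at all. For very long ID lists, a join against the original DataFrame may be a better fit than an IN list.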