python - Filtering polars dataframe by row with boolean mask - Stack Overflow

Good Morning,

I'm trying to filter a polars dataframe by using a boolean mask for the rows, which is generated from a condition on a specific column using:

df = df[df['col'] == cond]

And it's giving me an error, because that syntax is treated as a column selection:

expected xx values when selecting columns by boolean mask, got yy

Where xx is the total number of columns and yy is the count of True values in the mask.

According to the Polars API, that syntax should filter the rows (the same way pandas works), but it's instead being applied to the columns.

Is there any way to change this behaviour?

PS: Please don't advise using .filter or .sql instead, that's not what I'm asking here.

Thanks in advance!

Asked Mar 17 at 16:26 by Ghost
  • Where does the Polars API say this should apply to rows? – jqurious Commented Mar 17 at 16:31
  • rhosignal/posts/polars-boolean-filter And I've found a couple more examples in the last few days using the same filter syntax, not sure if they're outdated. – Ghost Commented Mar 17 at 17:39
  • Yeah, it only applied to rows in the early versions of Polars. github/pola-rs/polars/pull/4342 – jqurious Commented Mar 17 at 17:46
  • Ah damn.. so, no way to bool-mask the rows? I know I can bypass it by just "converting" the mask to row numbers, but is there any way to do it directly? – Ghost Commented Mar 17 at 18:51
  • 2 @BallpointBen it's not pandas – Dean MacGregor Commented Mar 17 at 23:46

1 Answer

Use .filter() instead.

# You can use it with masks just fine:
bool_mask = [True, False, ...]
df.filter(bool_mask)

# Although ideally you should use it with expressions
df.filter(pl.col('xyz') == ...)

# Even if you need to call a custom function
df.filter(pl.col('xyz').map_elements(foo))