Good Morning,
I'm trying to filter a polars dataframe by using a boolean mask for the rows, which is generated from a condition on a specific column using:
df = df[df['col'] == cond]
And it's giving me an error because that syntax is treated as a column selection:
expected xx values when selecting columns by boolean mask, got yy
Where xx is the total number of columns and yy is the count of True values in the mask.
According to the polars API, that syntax should filter the rows (the same way pandas works), but it's instead trying to apply the mask to the columns.
Is there any way to change this behaviour?
PS: Please don't advise using .filter or .sql instead, that's not what I'm asking here.
Thanks in advance!
asked Mar 17 at 16:26 by Ghost
- Where does the Polars API say this should apply to rows? – jqurious Commented Mar 17 at 16:31
- rhosignal/posts/polars-boolean-filter. I've also found a couple more examples in the last few days using the same filter syntax; not sure if they're outdated. – Ghost Commented Mar 17 at 17:39
- Yeah, it only applied to rows in the early versions of Polars. github/pola-rs/polars/pull/4342 – jqurious Commented Mar 17 at 17:46
- Ah damn... so, no way to bool-mask the rows? I know I can bypass it by just "converting" the mask to row numbers, but is there any way to do it directly? – Ghost Commented Mar 17 at 18:51
- @BallpointBen it's not pandas – Dean MacGregor Commented Mar 17 at 23:46
1 Answer
Use .filter() instead.
# You can use it with masks just fine:
bool_mask = [True, False, ...]
df.filter(bool_mask)
# Although ideally you should use it with expressions
df.filter(pl.col('xyz') == ...)
# Even if you need to call a custom function
df.filter(pl.col('xyz').map_elements(foo))