numpy - Python environment error, "TypeError: Choicelist and default value do not have a common dtype"

I have two python environments - one online through a class, and the other on my own computer.

The following code works on the online environment, but gives an error on my local environment. Does anyone know what this error means, and have any suggestions for fixing my environment, or fixing my code? The online environment has a habit of losing my work, so I'd like to get this working on my own machine.

This is the code

    custom_categories = ['cat_a', 'cat_b', 'cat_c', 'Other']
    custom_categories_filter = [
        (df['column_name'].str.contains('(A)', regex = False)),
        (df['column_name'].str.contains('(B)', regex = False)),
        (df['column_name'].str.contains('(C)', regex = False)),
        (df['column_name'].str.contains('(A)', regex = False) == False) 
        & (df['column_name'].str.contains('(B)', regex = False) == False) 
        & (df['column_name'].str.contains('(C)', regex = False) == False)
    ]
    df["custom_category"] = numpy.select(custom_categories_filter, custom_categories)

It's intended to look through a column of a pandas data frame, search for certain terms in brackets, then put a value based on that term into a new column.

This is the error:

    TypeError: Choicelist and default value do not have a common dtype:
    The DType <class 'numpy.dtypes._PyLongDType'> could not be promoted by <class 'numpy.dtypes.StrDType'>.
    This means that no common DType exists for the given inputs.
    For example they cannot be stored in a single array unless the dtype is `object`.
    The full list of DTypes is: (<class 'numpy.dtypes.StrDType'>, <class 'numpy.dtypes.StrDType'>, <class 'numpy.dtypes.StrDType'>, <class 'numpy.dtypes.StrDType'>, <class 'numpy.dtypes.StrDType'>, <class 'numpy.dtypes._PyLongDType'>)

Online environment is Python 3.11.11, local is 3.10.8 - could it be the python version I'm using?

asked Mar 14 at 14:46 by Sophia

1 Answer

The signature of np.select is as follows:

    numpy.select(condlist, choicelist, default=0)

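That is, default is the integer 0 unless you pass something else. Your choicelist (custom_categories) contains strings, and in numpy >= 2.0 the library enforces stricter dtype rules following NEP 50, so the integer default and the string choices no longer have a common dtype. The likely difference between your two environments is therefore the numpy version rather than the Python version.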
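Because the behaviour depends on the numpy release, a quick check like the one below, run in both environments, can confirm which side of the change each one is on. This is only a diagnostic sketch; the variable names are illustrative and not part of any API.

```python
import sys
import numpy as np

# Report both versions so the two environments can be compared side by side.
python_version = sys.version.split()[0]
numpy_version = np.__version__
print("Python:", python_version)
print("NumPy: ", numpy_version)

# numpy >= 2.0 is where this answer says the default=0 / string mix is rejected.
numpy_major = int(numpy_version.split(".")[0])
strict_select = numpy_major >= 2
print("Expect the TypeError with default=0:", strict_select)
```

If the online environment reports a 1.x release and the local one reports 2.x, that would be consistent with the explanation here.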

To fix this, you can set default=''. However, it's better to pass default='Other' and drop the fourth condition, since default already supplies the value for rows that match none of the conditions:

    import pandas as pd
    import numpy as np

    df = pd.DataFrame(data=['A(A)', 'A(B)', 'A(C)', 'A(D)'],
                      columns=['column_name'])

    custom_categories = ['cat_a', 'cat_b', 'cat_c']

    custom_categories_filter = [
        df['column_name'].str.contains('(A)', regex=False),
        df['column_name'].str.contains('(B)', regex=False),
        df['column_name'].str.contains('(C)', regex=False)
    ]

    df['custom_category'] = np.select(condlist=custom_categories_filter,
                                      choicelist=custom_categories,
                                      default='Other')

Output:

      column_name custom_category
    0        A(A)           cat_a
    1        A(B)           cat_b
    2        A(C)           cat_c
    3        A(D)           Other
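For comparison, the default='' option mentioned earlier in this answer can be sketched with plain numpy arrays, without pandas. The array `values` and the helper `contains` are made up for this illustration; the only point is that a string default shares a dtype with string choices.

```python
import numpy as np

# Hypothetical sample data standing in for df['column_name'].
values = np.array(['A(A)', 'A(B)', 'A(C)', 'A(D)'])

def contains(term):
    """Boolean mask: True where the literal term occurs in the value."""
    return np.array([term in value for value in values])

conditions = [contains('(A)'), contains('(B)'), contains('(C)')]
choices = ['cat_a', 'cat_b', 'cat_c']

# A string default can be promoted together with the string choices.
result = np.select(conditions, choices, default='')
print(result.tolist())  # ['cat_a', 'cat_b', 'cat_c', '']
```

The unmatched row ends up as an empty string, which is why 'Other' is usually the more readable default.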

Example previous behaviour (numpy <= 1.26):

    df['custom_category'] = np.select(condlist=custom_categories_filter,
                                      choicelist=custom_categories)

Output:

      column_name custom_category
    0        A(A)           cat_a
    1        A(B)           cat_b
    2        A(C)           cat_c
    3        A(D)               0 # '0' cast as string!

In numpy >= 2.0 such coercion of default=0 to a string is no longer allowed.
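To see which behaviour a given installation has without touching your real data, a small probe along these lines may help. It is a sketch under the assumption that the only two outcomes are the ones described in this answer (silent coercion or a TypeError); the names `conditions`, `choices` and `outcome` are illustrative.

```python
import numpy as np

conditions = [np.array([True, False])]
choices = ['cat_a']

try:
    # No default given, so the integer 0 is used for the unmatched row.
    probe = np.select(conditions, choices)
    outcome = "coerced: " + repr(probe.tolist())
except TypeError as exc:
    # Stricter promotion: int default and str choices share no dtype.
    outcome = "TypeError: " + str(exc).splitlines()[0]

print(np.__version__, "->", outcome)
```

Whichever branch runs, passing an explicit string default avoids relying on the version-specific behaviour.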

For specifics on the change, compare 1.26 and 2.0 implementations of np.select.
