I am trying to get the indices of the missing dates by comparing an index with gaps against a complete list of dates, as follows:
a = pd.DatetimeIndex(["2000", "2001", "2002", "2003",
"2004", "2005", "2009", "2010"])
b = pd.DatetimeIndex(["2000", "2001", "2002", "2003",
"2004", "2005", "2006", "2007",
"2008", "2009", "2010"])
a.reindex(b)
I got the following:
(DatetimeIndex(['2000-01-01', '2001-01-01', '2002-01-01', '2003-01-01',
'2004-01-01', '2005-01-01', '2006-01-01', '2007-01-01',
'2008-01-01', '2009-01-01', '2010-01-01'],
dtype='datetime64[ns]', freq=None),
array([ 0, 1, 2, 3, 4, 5, -1, -1, -1, 6, 7]))
I tried to replace all missing values, which are marked as -1, with NaN by using a.reindex(b, fill_value=np.nan)
but I got the following error: TypeError: Index.reindex() got an unexpected keyword argument 'fill_value'
According to the pandas documentation, fill_value is among the parameters of reindex. Any ideas?
Comment: It is because you do not have a DataFrame, but an Index object. Here are the relevant docs: pandas.pydata.org/docs/reference/api/pandas.Index.reindex.html. – Dr. V, Jan 18 at 18:23
2 Answers
First of all, you have to do:
newIndex, indexer = a.reindex(b)
Index.reindex returns two things: the new index and an indexer array. You only need the indexer.
So now you can get what you want:
indexerWithNan = np.where(indexer == -1, np.nan, indexer)
Which is:
[ 0. 1. 2. 3. 4. 5. nan nan nan 6. 7.]
Why was your initial code wrong? The reindex() method of pandas.Index objects does not support the fill_value parameter, unlike the reindex() method of pandas.Series or pandas.DataFrame.
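To make the difference concrete, here is a minimal sketch, assuming the a and b from the question. The names positions, with_nan and with_fill are made up for illustration. It wraps the positions of a in a Series so that Series.reindex can be used, which fills gaps with NaN by default and also accepts fill_value:

```python
import numpy as np
import pandas as pd

a = pd.DatetimeIndex(["2000", "2001", "2002", "2003",
                      "2004", "2005", "2009", "2010"])
b = pd.DatetimeIndex(["2000", "2001", "2002", "2003",
                      "2004", "2005", "2006", "2007",
                      "2008", "2009", "2010"])

# Index.reindex returns a (new_index, indexer) tuple and has no fill_value.
new_index, indexer = a.reindex(b)
try:
    a.reindex(b, fill_value=np.nan)
except TypeError as exc:
    print("Index.reindex rejected fill_value:", exc)

# Hypothetical helper Series: the values are the positions within a.
positions = pd.Series(np.arange(len(a)), index=a)

# Series.reindex inserts NaN for labels of b that are absent from a ...
with_nan = positions.reindex(b)

# ... and it accepts fill_value if a different marker is wanted.
with_fill = positions.reindex(b, fill_value=-1)

print(with_nan.to_numpy())
print(with_fill.to_numpy())
```

Note that with_nan is a float Series, because NaN cannot be stored in an integer array.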
If you just want to know which values of b are missing from a, use Index.isin:
~b.isin(a)
Output:
array([False, False, False, False, False, False, True, True, True,
False, False])
If you want the missing values:
b[~b.isin(a)]
Output:
DatetimeIndex(['2006-01-01', '2007-01-01', '2008-01-01'], dtype='datetime64[ns]', freq=None)
and for the indices:
np.where(~b.isin(a))[0]
Output:
array([6, 7, 8])
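Putting the pieces together, the following is a self-contained sketch of the isin approach, again assuming the a and b from the question. The variable names are illustrative only. As a cross-check it also uses Index.difference and Index.get_indexer, which are not part of the answer above but should give the same result:

```python
import numpy as np
import pandas as pd

a = pd.DatetimeIndex(["2000", "2001", "2002", "2003",
                      "2004", "2005", "2009", "2010"])
b = pd.DatetimeIndex(["2000", "2001", "2002", "2003",
                      "2004", "2005", "2006", "2007",
                      "2008", "2009", "2010"])

# Boolean mask over b: True where the date does not occur in a.
mask = ~b.isin(a)

missing_dates = b[mask]                   # the missing dates themselves
missing_positions = np.flatnonzero(mask)  # same as np.where(mask)[0]

# Cross-check 1: set difference of the two indexes.
diff_dates = b.difference(a)

# Cross-check 2: get_indexer marks labels of b not found in a with -1.
indexer = a.get_indexer(b)
positions_from_indexer = np.flatnonzero(indexer == -1)

print(missing_dates)
print(missing_positions)
```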