I am trying to use Prophet to forecast Lululemon's stock prices. However, I am encountering the following error when fitting the model:
```
TypeError                                 Traceback (most recent call last)
Cell In[3], line 17
     15 # Fit the data to a Prophet model
     16 model = Prophet()
---> 17 model.fit(lululemon_data)
     19 # Create a dataframe to hold predictions for the next 5 years
     20 future = model.make_future_dataframe(periods=5*365)

TypeError: arg must be a list, tuple, 1-d array, or Series
```
Here is my code:
```python
import pandas as pd
import yfinance as yf
import matplotlib.pyplot as plt
from prophet import Prophet

# Download Lululemon stock data
ticker = 'LULU'
lululemon_data = yf.download(ticker, start='2007-07-27')  # Lululemon IPO date

# Prepare the data for Prophet
lululemon_data.reset_index(inplace=True)
lululemon_data = lululemon_data[['Date', 'Close']]
lululemon_data.rename(columns={'Date': 'ds', 'Close': 'y'}, inplace=True)

# Fit the data to a Prophet model
model = Prophet()
model.fit(lululemon_data)

# Create a dataframe to hold predictions for the next 5 years
future = model.make_future_dataframe(periods=5*365)

# Make predictions
forecast = model.predict(future)

# Plot the forecast
fig = model.plot(forecast)
plt.title('Lululemon Stock Price Forecast for Next 5 Years')
plt.xlabel('Date')
plt.ylabel('Closing Price (USD)')
plt.show()
```
I suspect the issue lies in the structure of the lululemon_data DataFrame because the code works correctly when I use the example data provided by Prophet:
```python
df = pd.read_csv('https://raw.githubusercontent.com/facebook/prophet/main/examples/example_wp_log_peyton_manning.csv')
df.head()
```
I've tried to ensure the column names are correctly renamed to ds and y.
asked Jan 19 at 16:49 by Lorena Schafer, edited Jan 19 at 16:52

1 Answer
Non-expert opinion here, after fiddling around with your code and ChatGPT.
For your particular setup, the two data frames differ in their .columns:
```
>>> df.columns
Index(['ds', 'y'], dtype='object')
>>> ld.columns  # lululemon_data
MultiIndex([('ds',     ''),
            ( 'y', 'LULU')],
           names=['Price', 'Ticker'])
```
I am not sure of the exact details, but yfinance probably uses the second column level (Ticker) to differentiate between multiple tickers, and the resulting MultiIndex columns are what confuse Prophet.
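To make the difference concrete, here is a minimal sketch that needs neither yfinance nor Prophet. The hand-built frame is a hypothetical stand-in for the renamed `lululemon_data`, and the assumption (not verified against Prophet's source here) is that Prophet passes `df['y']` to `pd.to_numeric` during fitting, which would explain the exact error text.

```python
import pandas as pd

# Hypothetical stand-in for the renamed yfinance frame from the question:
# two column levels, named 'Price' and 'Ticker'.
columns = pd.MultiIndex.from_tuples(
    [("ds", ""), ("y", "LULU")], names=["Price", "Ticker"]
)
ld = pd.DataFrame(
    [[pd.Timestamp("2024-01-02"), 500.0], [pd.Timestamp("2024-01-03"), 505.0]],
    columns=columns,
)

# With MultiIndex columns, selecting 'y' leaves the 'Ticker' level behind,
# so the result is a one-column DataFrame rather than a Series.
y = ld["y"]
y_type = type(y).__name__

# Assumption: Prophet converts df['y'] with pd.to_numeric, which rejects
# two-dimensional input.
try:
    pd.to_numeric(y)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(y_type)
print(error_message)
```

If the assumption holds, the message printed here should match the one in the traceback from the question.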
It is possible to disable the multi-level index for yfinance.download. From the documentation for yfinance.download:

    multi_level_index: bool
        Optional. Always return a MultiIndex DataFrame? Default is True

I toggled multi_level_index to False, which seems to fix your code.

```python
lululemon_data = yf.download(ticker, start='2007-07-27', multi_level_index=False)  # Lululemon IPO date
```
For other cases, where the underlying library cannot be told to return a frame with flat columns, maybe this thread on flattening a MultiIndex can help.
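As a rough sketch of that fallback, the helper below keeps only the first column level. `flatten_columns` is a hypothetical name, not part of pandas, yfinance, or Prophet, and the frame is again a hand-built stand-in for a single-ticker download.

```python
import pandas as pd


def flatten_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy whose columns keep only the first level of a MultiIndex."""
    out = df.copy()
    if isinstance(out.columns, pd.MultiIndex):
        out.columns = out.columns.get_level_values(0)
    return out


# Hypothetical stand-in for the renamed single-ticker frame.
columns = pd.MultiIndex.from_tuples(
    [("ds", ""), ("y", "LULU")], names=["Price", "Ticker"]
)
ld = pd.DataFrame(
    [[pd.Timestamp("2024-01-02"), 500.0], [pd.Timestamp("2024-01-03"), 505.0]],
    columns=columns,
)

flat = flatten_columns(ld)
flat_columns = list(flat.columns)

# 'y' is now a plain Series, so numeric conversion no longer raises.
y_numeric = pd.to_numeric(flat["y"])
print(flat_columns, type(flat["y"]).__name__)
```

This is only safe for a single ticker. With several tickers the first level repeats (for example, one 'Close' per ticker), so dropping the second level would leave duplicate column names.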