I'm training an LSTM for time series prediction, where the data comes from sensors at irregular intervals. I'm using the last 5 minutes of data to predict the next value, but some sequences are longer than others.
My input array's shape is (611, 1200, 15), i.e. (samples, timesteps, features). The second dimension is not complete for every sample, so I padded the missing timesteps with np.nan. For instance, sample (1, :, :) has 1000 real timesteps followed by 200 rows of np.nan.
While training, the loss equals nan.
What am I doing wrong? How can I train the model?
Here's my attempt to train the LSTM:
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Masking, LSTM, Dense
from tensorflow.keras.callbacks import EarlyStopping

def lstmFit(y, X, n_hidden=1, n_neurons=30, learning_rate=1e-2):
    lstm = Sequential()
    lstm.add(Masking(mask_value=np.nan, input_shape=(None, X.shape[2])))
    for layer in range(n_hidden):
        lstm.add(LSTM(n_neurons,
                      activation="tanh",
                      recurrent_activation="sigmoid",
                      return_sequences=True))
    lstm.add(Dense(1))
    lstm.compile(loss="mse", optimizer="adam")
    early_stopping = EarlyStopping(monitor='loss', patience=10, verbose=1, restore_best_weights=True)
    lstm.fit(X, y.reshape(-1), epochs=100, callbacks=[early_stopping])
    y_train_fit = lstm.predict(X)
    return lstm, y_train_fit
The model's summary:
lstm.summary()
Model: "sequential_9"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
masking_7 (Masking)          (None, None, 15)          0
lstm_6 (LSTM)                (None, None, 30)          5520
dense_10 (Dense)             (None, None, 1)           31
=================================================================
Total params: 5551 (21.68 KB)
Trainable params: 5551 (21.68 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
And the first few epochs of training:
Epoch 1/100
18/18 [==============================] - 20s 335ms/step - loss: nan
Epoch 2/100
18/18 [==============================] - 6s 335ms/step - loss: nan
Epoch 3/100
18/18 [==============================] - 7s 365ms/step - loss: nan
1 Answer
Note that Masking(mask_value=np.nan) can never match anything, because np.nan != np.nan; the NaNs therefore flow straight into the LSTM and turn every activation, and the loss, into nan. Assuming the NaNs are at the end of the sequences, you can try:
Replacing all nan values in the input with 0 (or another value, if that makes more sense for your data) and setting mask_value=0 to match.
Truncating all sequences to the minimum sequence length, so no padding is needed.
Padding each sequence by repeating its first or last data point until every sequence reaches the maximum length.
Pick whichever makes the most sense in your case. If you don't know which is better, try them all and compare the results; a sketch of the first option is shown below.
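For example, here is a minimal sketch of the first option, assuming the NaN padding sits at the end of each sequence and that no real timestep in your data is all zeros (Keras masks a timestep only when all of its features equal mask_value):

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Masking, LSTM, Dense

# Replace the NaN padding with 0 so the Masking layer can match it;
# np.nan == np.nan is False, so mask_value=np.nan never masks anything.
X_clean = np.nan_to_num(X, nan=0.0)

model = Sequential()
model.add(Masking(mask_value=0.0, input_shape=(None, X_clean.shape[2])))
# return_sequences is left at its default (False) so the model emits one
# prediction per sequence, matching a target vector y of shape (samples,).
model.add(LSTM(30, activation="tanh", recurrent_activation="sigmoid"))
model.add(Dense(1))
model.compile(loss="mse", optimizer="adam")
model.fit(X_clean, y, epochs=100)

This sketch also drops return_sequences=True from the last LSTM layer: with it, your model predicts one value per timestep (the (None, None, 1) output shape in your summary), which doesn't match a single next value per sequence.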