LSTM for binary classification
Hi all,
I am using an LSTM to perform binary classification on time-series data (normal vs. abnormal). I trained two models: one with a single LSTM layer and another with multiple layers. However, both models consistently predict only one class (normal), even when I test on the training data itself. My dataset is nearly balanced, and during training the loss decreases and the accuracy increases each epoch (although the loss becomes NaN in the final epoch, as the log below shows). Could you please help me understand why this is happening and how to resolve it?
The model summary and training log:
Model: "model_63"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_71 (InputLayer) [(None, 2048, 1)] 0
lstm_133 (LSTM) (None, 128) 66560
dense_67 (Dense) (None, 2) 258
=================================================================
Total params: 66,818
Trainable params: 66,818
Non-trainable params: 0
_________________________________________________________________
Epoch 1/5
93/93 [==============================] - 347s 4s/step - loss: 0.7004 - accuracy: 0.1984
Epoch 2/5
93/93 [==============================] - 333s 4s/step - loss: 0.6918 - accuracy: 0.7068
Epoch 3/5
93/93 [==============================] - 265s 3s/step - loss: 0.6827 - accuracy: 0.9560
Epoch 4/5
93/93 [==============================] - 171s 2s/step - loss: 0.6721 - accuracy: 0.9701
Epoch 5/5
93/93 [==============================] - 257s 3s/step - loss: nan - accuracy: 0.9701
Single-layer model (code below): the model predicts every sample in X_test_seq_in as normal.
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, LSTM, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.utils import to_categorical

# define the model: one LSTM layer followed by a 2-unit softmax classifier
visible = Input(shape=(n_in_x_train, 1))
layer = LSTM(128, activation='relu')(visible)
output = Dense(2, activation='softmax')(layer)
model = Model(inputs=visible, outputs=output)

optimizer = tf.keras.optimizers.Adam(learning_rate=0.00001)
model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()

# fit the model on one-hot encoded labels
y_train = to_categorical(y_train, num_classes=2)
model.fit(X_train_seq_in, y_train, epochs=2, verbose=1)

# predict, then convert the softmax outputs back to class indices
y_pred = model.predict(X_test_seq_in, verbose=0)
predicted_labels = np.argmax(y_pred, axis=1)
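To make the symptom concrete, here is what the final argmax step does on a toy softmax output (the probability values are made up for illustration; no model is needed). If the model collapses to one class, every row has a higher probability in column 0, so argmax returns all zeros, i.e. everything is labeled "normal":

```python
import numpy as np

# Toy softmax outputs for 4 samples: columns are P(normal), P(abnormal)
y_pred = np.array([
    [0.9, 0.1],
    [0.2, 0.8],
    [0.6, 0.4],
    [0.1, 0.9],
])

# argmax over the class axis recovers integer labels (0 = normal, 1 = abnormal)
predicted_labels = np.argmax(y_pred, axis=1)
print(predicted_labels)  # [0 1 0 1]

# A collapsed model looks like this instead: column 0 always wins
y_pred_collapsed = np.array([[0.51, 0.49], [0.55, 0.45], [0.52, 0.48]])
print(np.argmax(y_pred_collapsed, axis=1))  # [0 0 0]
```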