Keras Tuner reports that the best val_loss it found is 9920.19, but when I train a model with the best hyperparameters I get nowhere near that value. It also appears to suggest that the best number of epochs is 17 ('tuner/initial_epoch': 17 in the output below), yet when I retrain with those hyperparameters, early stopping is triggered at epoch 7. I previously retrained using get_best_hyperparameters but had the same issue, so I tried get_best_models instead and am seeing similar problems. For context, I am trying to predict car prices from both numerical and categorical features.
The code is as follows:
import tensorflow as tf

# hp: the HyperParameters object the tuner passes in
def model_builder(hp):
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.Input(shape=(train_final.shape[1],)))
    model.add(tf.keras.layers.Flatten())
    # activation choices
    hp_activation = hp.Choice('activation', values=['relu', 'tanh'])
    # node-count choices (1, 101, 201, 301, 401)
    hp_layer_1 = hp.Int('layer_1', min_value=1, max_value=500, step=100)
    hp_layer_2 = hp.Int('layer_2', min_value=1, max_value=500, step=100)
    # learning-rate choices
    hp_learning_rate = hp.Choice('learning_rate', values=[1e-2, 1e-3, 1e-4])
    # first hidden layer after the flatten layer
    model.add(tf.keras.layers.Dense(units=hp_layer_1, activation=hp_activation))
    # second hidden layer
    model.add(tf.keras.layers.Dense(units=hp_layer_2, activation=hp_activation))
    # single linear output for the regression target
    model.add(tf.keras.layers.Dense(1, activation='linear'))
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=hp_learning_rate),
                  loss=tf.keras.losses.MeanSquaredError(),
                  metrics=['mean_absolute_error'])
    return model
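As far as I understand, the builder can be sanity-checked on its own by calling it with a fresh HyperParameters object, which fills in each hyperparameter's default (the first Choice value and the minimum Int value). This is just a sketch of that check, not part of my actual run:

import keras_tuner as kt

# build once with default hyperparameter values to confirm the builder runs
check_model = model_builder(kt.HyperParameters())
check_model.summary()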
import keras_tuner as kt

tuner = kt.Hyperband(model_builder,
                     objective='val_loss',
                     max_epochs=50,
                     factor=3,
                     directory='dir',
                     project_name='x',
                     overwrite=True)  # overwrite old tuning experiments

stop_early = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3)
tuner.search(train_final, y_train, epochs=50, validation_split=0.2, callbacks=[stop_early])
Output:
Trial 90 Complete [00h 00m 37s]
val_loss: 169995.390625
Best val_loss So Far: 9920.19140625
Total elapsed time: 00h 12m 08s
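In case it helps, Keras Tuner can also print an overview of the best trials with their hyperparameters and scores; I am including this only as a pointer, the numbers above come straight from the search log:

# print the top trials with their hyperparameters and scores
tuner.results_summary(num_trials=3)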
Further code:
best_hp = tuner.get_best_hyperparameters(num_trials=1)[0]
best_hp.values
Output:
{'activation': 'relu',
'layer_1': 301,
'layer_2': 401,
'learning_rate': 0.001,
'tuner/epochs': 50,
'tuner/initial_epoch': 17,
'tuner/bracket': 2,
'tuner/round': 2,
'tuner/trial_id': '0068'}
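What I had tried first (the get_best_hyperparameters route mentioned above) was rebuilding and retraining a fresh model from these values, roughly like this:

# rebuild a fresh, untrained model from the best hyperparameters and retrain it
model = tuner.hypermodel.build(best_hp)
history = model.fit(train_final, y_train, epochs=50,
                    validation_split=0.2, callbacks=[stop_early])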
Code I am using to run the best model:
# obtaining the best model found during the search
best_model = tuner.get_best_models(num_models=1)[0]
best_model.summary()
Output summary:
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type) ┃ Output Shape ┃ Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ flatten (Flatten) │ (None, 26) │ 0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense (Dense) │ (None, 301) │ 8,127 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_1 (Dense) │ (None, 401) │ 121,102 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_2 (Dense) │ (None, 1) │ 402 │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘
Total params: 388,895 (1.48 MB)
Trainable params: 129,631 (506.37 KB)
Non-trainable params: 0 (0.00 B)
Optimizer params: 259,264 (1012.75 KB)
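One thing I am unsure about: since calling fit on this loaded model keeps training its weights, should I instead be evaluating it directly? Something like the sketch below, which manually reproduces the last-20% split that validation_split uses (assuming train_final and y_train are NumPy arrays; a DataFrame would need .iloc):

# score the loaded weights without any further training;
# validation_split=0.2 holds out the last 20% of the samples
n_val = int(len(train_final) * 0.2)
x_val, y_val = train_final[-n_val:], y_train[-n_val:]
val_loss, val_mae = best_model.evaluate(x_val, y_val)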
Then fitting this model and obtaining its history:
history = best_model.fit(train_final, y_train, epochs = 50, validation_split = 0.2, callbacks=[stop_early])
The epochs and val_loss achieved were as follows:
Epoch 1/50
1000/1000 ━━━━━━━━━━━━━━━━━━━━ 2s 1ms/step - loss: 13584.2109 - mean_absolute_error: 66.0554 - val_loss: 10938.3945 - val_mean_absolute_error: 65.2311
Epoch 2/50
1000/1000 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step - loss: 15795.5977 - mean_absolute_error: 70.9460 - val_loss: 11011.2217 - val_mean_absolute_error: 64.9488
Epoch 3/50
1000/1000 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step - loss: 14230.2363 - mean_absolute_error: 65.1442 - val_loss: 13075.5859 - val_mean_absolute_error: 68.2900
Epoch 4/50
1000/1000 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step - loss: 12711.3496 - mean_absolute_error: 64.8533 - val_loss: 8809.2930 - val_mean_absolute_error: 52.3436
Epoch 5/50
1000/1000 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step - loss: 34036.5078 - mean_absolute_error: 81.4700 - val_loss: 11076.9854 - val_mean_absolute_error: 56.1762
Epoch 6/50
1000/1000 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step - loss: 9363.7236 - mean_absolute_error: 57.5769 - val_loss: 12070.8613 - val_mean_absolute_error: 74.5882
Epoch 7/50
1000/1000 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step - loss: 10689.0205 - mean_absolute_error: 60.8380 - val_loss: 24639.5000 - val_mean_absolute_error: 86.1563
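For completeness, if the right approach is to retrain for a fixed number of epochs, my understanding of the pattern from the Keras Tuner tutorial is the sketch below (pick the epoch with the lowest val_loss from one retraining run, then retrain from scratch for exactly that many epochs); I have not actually run this:

# find the epoch with the lowest validation loss in the run above
val_loss_per_epoch = history.history['val_loss']
best_epoch = val_loss_per_epoch.index(min(val_loss_per_epoch)) + 1
# retrain a fresh model for exactly that many epochs
final_model = tuner.hypermodel.build(best_hp)
final_model.fit(train_final, y_train, epochs=best_epoch, validation_split=0.2)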
Prior to all of this, the numerical and categorical features were preprocessed and concatenated into train_final. This is my first time using Keras Tuner, so any advice would be appreciated. Is it normal not to obtain the same results when rerunning the model with the best hyperparameters?