I am trying to tune a scikit-learn Random Forest Classifier with the appropriate Keras Tuner interface, but passing validation_split=0.3 doesn't seem to work. Here is my code:
    def build_model(hp):
        model = RandomForestClassifier(
            n_estimators=hp.Choice("n_estimators", [50, 100, 150]),
            random_state=42,
            min_samples_split=hp.Choice("min_samples_split", [2, 5]),
            min_samples_leaf=hp.Choice("min_samples_leaf", [2, 5])
        )
        return model

    tuner = keras_tuner.tuners.SklearnTuner(
        oracle=keras_tuner.oracles.BayesianOptimizationOracle(
            objective=keras_tuner.Objective('score', 'max'),
            max_trials=10),
        hypermodel=build_model,
        scoring=metrics.make_scorer(metrics.accuracy_score),
        cv=model_selection.StratifiedKFold(5),
        directory='.',
        project_name='random_forest_search'
    )

    tuner.search(X_train, y_train, validation_split=0.3)
The exact error is: TypeError: SklearnTuner.search() got an unexpected keyword argument 'validation_split'. So how do I split my data into training and validation sets?
1 Answer
SklearnTuner is designed to use cross-validation (CV) for model selection rather than a single train/validation split, which is why its search() method does not accept a validation_split argument. Since you've already provided the cv argument, the tuner splits the data itself: with StratifiedKFold(5), each trial is fit on four folds and scored on the remaining one, and the fold scores are averaged. Remove the validation_split argument and your code should work as expected.
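If you would still prefer a single 70/30 hold-out split over 5-fold CV, one option is to express it through the cv argument. The sketch below is not from the original post: it assumes that SklearnTuner only needs a splitter exposing split(X, y, groups), so that scikit-learn's StratifiedShuffleSplit with n_splits=1 can stand in for validation_split=0.3. The names holdout_cv, make_tuner, X_demo and y_demo are hypothetical, and keras_tuner is imported lazily so the splitter part can be checked on its own.

```python
import numpy as np
from sklearn import metrics, model_selection
from sklearn.ensemble import RandomForestClassifier

# A single stratified 70/30 split instead of 5 folds (hypothetical name).
holdout_cv = model_selection.StratifiedShuffleSplit(
    n_splits=1, test_size=0.3, random_state=42
)


def build_model(hp):
    # Same search space as in the question.
    return RandomForestClassifier(
        n_estimators=hp.Choice("n_estimators", [50, 100, 150]),
        random_state=42,
        min_samples_split=hp.Choice("min_samples_split", [2, 5]),
        min_samples_leaf=hp.Choice("min_samples_leaf", [2, 5]),
    )


def make_tuner(cv):
    # Hypothetical helper; imports keras_tuner only when a tuner is built.
    import keras_tuner

    return keras_tuner.tuners.SklearnTuner(
        oracle=keras_tuner.oracles.BayesianOptimizationOracle(
            objective=keras_tuner.Objective("score", "max"),
            max_trials=10,
        ),
        hypermodel=build_model,
        scoring=metrics.make_scorer(metrics.accuracy_score),
        cv=cv,  # assumption: any splitter with .split(X, y, groups) works here
        directory=".",
        project_name="random_forest_search",
    )


# Toy data standing in for X_train / y_train, to show what the splitter yields.
X_demo = np.arange(200).reshape(100, 2)
y_demo = np.array([0, 1] * 50)
train_idx, val_idx = next(holdout_cv.split(X_demo, y_demo))
print(len(train_idx), len(val_idx))
```

With real data you would then call make_tuner(holdout_cv).search(X_train, y_train), with no validation_split argument. Keep in mind that a single split gives a noisier score estimate than 5-fold CV, so the choice is a trade-off between run time and reliability.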