machine learning - GaussianProcessRegressor object in scikit-learn: select fixed hyperparameters, cannot reproduce optimised ker

I am trying to understand the GaussianProcessRegressor object in scikit-learn, alas unsuccessfully.

Consider the example in the documentation, "Example with noisy targets", which I copy below for convenience (with the minor change of using a ConstantKernel instead of multiplying by a constant in the kernel definition):

import numpy as np

X = np.linspace(start=0, stop=10, num=1_000).reshape(-1, 1)
y = np.squeeze(X * np.sin(X))

import matplotlib.pyplot as plt

plt.plot(X, y, label=r"$f(x) = x \sin(x)$", linestyle="dotted")
plt.legend()
plt.xlabel("$x$")
plt.ylabel("$f(x)$")
_ = plt.title("True generative process")
rng = np.random.RandomState(1)

training_indices = rng.choice(np.arange(y.size), size=6, replace=False)
X_train, y_train = X[training_indices], y[training_indices]


noise_std = 0.75
y_train_noisy = y_train + rng.normal(loc=0.0, scale=noise_std, size=y_train.shape)
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel, ConstantKernel

# kernel = 1 * RBF(length_scale=1.0, length_scale_bounds=(1e-2, 1e2)) + WhiteKernel()
kernel = ConstantKernel(constant_value=1) * RBF(length_scale=1.0, length_scale_bounds=(1e-2, 1e2))
gaussian_process = GaussianProcessRegressor(kernel=kernel, alpha=noise_std**2, n_restarts_optimizer=9)

gaussian_process.fit(X_train, y_train_noisy)
gaussian_process.kernel_
mean_prediction, std_prediction = gaussian_process.predict(X, return_std=True)
plt.plot(X, y, label=r"$f(x) = x \sin(x)$", linestyle="dotted")
plt.errorbar(
    X_train,
    y_train_noisy,
    noise_std,
    linestyle="None",
    color="tab:blue",
    marker=".",
    markersize=10,
    label="Observations",
)
plt.plot(X, mean_prediction, label="Mean prediction")
plt.fill_between(
    X.ravel(),
    mean_prediction - 1.96 * std_prediction,
    mean_prediction + 1.96 * std_prediction,
    color="tab:orange",
    alpha=0.5,
    label=r"95% confidence interval",
)
plt.legend()
plt.xlabel("$x$")
plt.ylabel("$f(x)$")
_ = plt.title("Gaussian process regression on a noisy dataset")

I get these results

Now, I would like to get the same results by using a kernel with 'fixed' parameters (for some other purpose that is irrelevant to the question). So, I get the optimised hyperparameters of the kernel above,

gaussian_process.kernel_.get_params()

outputting

{'k1': 4.28**2,
 'k2': RBF(length_scale=1.1),
 'k1__constant_value': 18.30421069841903,
 'k1__constant_value_bounds': (1e-05, 100000.0),
 'k2__length_scale': 1.1043558649730463,
 'k2__length_scale_bounds': (0.01, 100.0)}
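
As a side note, the same values can be read directly from the fitted kernel's attributes rather than copied by hand, which avoids rounding. The sketch below is only an illustration under assumptions: it uses a small synthetic data set, and the names `X_demo`, `y_demo`, `gp` and `kernel_fixed_demo` are hypothetical, not part of the code above.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

# Small synthetic data set (an assumption made for this sketch only).
rng = np.random.RandomState(1)
X_demo = rng.uniform(0, 10, size=(6, 1))
y_demo = np.squeeze(X_demo * np.sin(X_demo)) + rng.normal(0.0, 0.75, size=6)

kernel = ConstantKernel(constant_value=1.0) * RBF(
    length_scale=1.0, length_scale_bounds=(1e-2, 1e2)
)
gp = GaussianProcessRegressor(
    kernel=kernel, alpha=0.75**2, n_restarts_optimizer=9, random_state=0
)
gp.fit(X_demo, y_demo)

# For a product kernel, k1 and k2 are the two factors.
constant_value = gp.kernel_.k1.constant_value
length_scale = gp.kernel_.k2.length_scale

# kernel_.theta stores the log of the free (non-fixed) hyperparameters.
print(np.exp(gp.kernel_.theta), constant_value, length_scale)

# Rebuild the kernel with the full-precision values and fixed bounds.
kernel_fixed_demo = ConstantKernel(
    constant_value, constant_value_bounds="fixed"
) * RBF(length_scale, length_scale_bounds="fixed")

# With every hyperparameter fixed, nothing is left to optimise.
print(kernel_fixed_demo.theta.size)
```

Another option that should have the same effect is to pass the fitted `kernel_` to a new `GaussianProcessRegressor` with `optimizer=None`, which keeps the kernel's parameters as given.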

So, I modify the previous kernel and Gaussian process definitions:

kernel_fixed = ConstantKernel(constant_value=18.30, constant_value_bounds='fixed') * RBF(length_scale=1.1043, length_scale_bounds='fixed')
gaussian_process_fixed = GaussianProcessRegressor(kernel=kernel_fixed, alpha=noise_std**2, n_restarts_optimizer=9)
gaussian_process_fixed.fit(X, y)
mean_prediction, std_prediction = gaussian_process_fixed.predict(X, return_std=True)

plt.plot(X, y, label=r"$f(x) = x \sin(x)$", linestyle="dotted")
plt.errorbar(
    X_train,
    y_train_noisy,
    noise_std,
    linestyle="None",
    color="tab:blue",
    marker=".",
    markersize=10,
    label="Observations",
)
plt.plot(X, mean_prediction, label="Mean prediction")
plt.fill_between(
    X.ravel(),
    mean_prediction - 1.96 * std_prediction,
    mean_prediction + 1.96 * std_prediction,
    color="tab:orange",
    alpha=0.5,
    label=r"95% confidence interval",
)
plt.legend()
plt.xlabel("$x$")
plt.ylabel("$f(x)$")
_ = plt.title("Gaussian process regression on a noisy dataset")

But the results are very different, even though the kernel's parameters are as I fixed them, very close to the optimised ones:

{'k1': 4.28**2,
 'k2': RBF(length_scale=1.1),
 'k1__constant_value': 18.3,
 'k1__constant_value_bounds': 'fixed',
 'k2__length_scale': 1.1043,
 'k2__length_scale_bounds': 'fixed'}

What is it that I am missing?
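
One difference visible in the code above is that the first model is fit on `X_train, y_train_noisy` (6 noisy points), while the fixed-kernel model is fit on `X, y` (all 1,000 noise-free points). The following is a minimal sketch for checking whether that alone explains the mismatch. It rebuilds the data as in the question; the names `gp_opt`, `gp_fixed`, `gp_full` and the `random_state=0` argument are additions made for this sketch.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

# Same data generation as in the question.
X = np.linspace(start=0, stop=10, num=1_000).reshape(-1, 1)
y = np.squeeze(X * np.sin(X))
rng = np.random.RandomState(1)
training_indices = rng.choice(np.arange(y.size), size=6, replace=False)
X_train, y_train = X[training_indices], y[training_indices]
noise_std = 0.75
y_train_noisy = y_train + rng.normal(loc=0.0, scale=noise_std, size=y_train.shape)

# Model with optimised hyperparameters, fit on the 6 noisy points.
kernel = ConstantKernel(constant_value=1) * RBF(
    length_scale=1.0, length_scale_bounds=(1e-2, 1e2)
)
gp_opt = GaussianProcessRegressor(
    kernel=kernel, alpha=noise_std**2, n_restarts_optimizer=9, random_state=0
)
gp_opt.fit(X_train, y_train_noisy)
mean_opt, std_opt = gp_opt.predict(X, return_std=True)

# Fixed kernel built from the optimised values.
kernel_fixed = ConstantKernel(
    gp_opt.kernel_.k1.constant_value, constant_value_bounds="fixed"
) * RBF(gp_opt.kernel_.k2.length_scale, length_scale_bounds="fixed")

# Fit on the SAME training data as the optimised model.
gp_fixed = GaussianProcessRegressor(kernel=kernel_fixed, alpha=noise_std**2)
gp_fixed.fit(X_train, y_train_noisy)
mean_fixed, std_fixed = gp_fixed.predict(X, return_std=True)

# For comparison: fit on the full noise-free data, as in the question.
gp_full = GaussianProcessRegressor(kernel=kernel_fixed, alpha=noise_std**2)
gp_full.fit(X, y)
mean_full = gp_full.predict(X)

max_mean_diff = np.max(np.abs(mean_opt - mean_fixed))
max_std_diff = np.max(np.abs(std_opt - std_fixed))
max_full_diff = np.max(np.abs(mean_opt - mean_full))
print(max_mean_diff, max_std_diff, max_full_diff)
```

If the first two differences are at the level of floating-point noise while the third is not, the fixed kernel itself behaves as expected and the discrepancy comes from the data passed to `fit`.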
