
Python vectorized minimization of a multivariate loss function without Jacobian


I have a loss function that needs to be minimized

def loss(x: np.ndarray[float]) -> float

My problem has nDim=10 dimensions. The loss function works with 1D arrays of shape (nDim,) and with 2D arrays of shape (nSample, nDim) for an arbitrary number of samples. Because of how the loss function is implemented (NumPy), it is significantly faster to make a single call with several samples packed into a 2D argument than to make several 1D calls.
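For reference, a minimal stand-in with the same shape contract (not my real loss, just illustrating the calling convention):

import numpy as np

# Hypothetical stand-in: accepts (nDim,) or (nSample, nDim) and reduces
# over the last axis, like the real loss function.
def loss(x: np.ndarray) -> np.ndarray:
    x = np.asarray(x, dtype=float)
    return np.sum(np.sin(x) ** 2 + x ** 2, axis=-1)

nDim = 10
loss(np.zeros(nDim))           # single sample, scalar result
loss(np.zeros((5, nDim)))      # batch of 5 samples, result of shape (5,)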

The minimizer I am currently running is

sol = scipy.optimize.basinhopping(loss, x0, minimizer_kwargs={"method": "SLSQP"})

It does the job, but is too slow. Currently, the minimizer makes single 1D calls to the loss function. Based on observing the sample points, it seems SLSQP performs numerical differentiation, sampling 11 points per gradient evaluation (one base point plus one per dimension). In principle, it should be possible to implement this minimizer with vectorized function calls, requesting all 11 sample points from the loss function in a single call.

I was hoping there would be a vectorized flag for SLSQP, but there does not seem to be one; please correct me if I am wrong.

Note also that the loss function is far too complicated for analytic calculation of derivatives, so an explicit Jacobian is not an option.

Question: Does SciPy or any other Python minimization library support global optimization strategies (such as basinhopping) with a vectorized loss function and no available Jacobian?


2 Answers


differential_evolution is a global optimizer that does not require gradients. It has a vectorized keyword to enable many function evaluations in a single call.
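A minimal sketch of how that might look here (the bounds and the stand-in loss are assumptions; note that with vectorized=True, differential_evolution passes an array of shape (nDim, S) and expects a result of shape (S,), so the argument is transposed before calling a loss that expects (nSample, nDim)):

import numpy as np
from scipy.optimize import differential_evolution

def loss(x):
    # stand-in for the real loss; accepts (..., nDim), reduces the last axis
    return np.sum(np.asarray(x) ** 2, axis=-1)

nDim = 10
bounds = [(-5.0, 5.0)] * nDim   # hypothetical search bounds

result = differential_evolution(
    lambda x: loss(np.atleast_2d(x.T)),  # (nDim, S) -> (S, nDim)
    bounds,
    vectorized=True,
    polish=False,   # the final polish step is not vectorized
)
print(result.x, result.fun)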

Alternatively, you could write a function that computes the Jacobian with scipy.differentiate.jacobian, which evaluates the function at all required points in a single call, and pass that as the Jacobian callable. However, it is designed for accuracy, not speed, so you should probably set coarse tolerances and a low order.
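A rough sketch of that idea (assuming SciPy >= 1.15 for scipy.differentiate; the stand-in loss, the tolerance values, and the axis handling are assumptions, and the result may need reshaping to the (nDim,) gradient that SLSQP expects):

import numpy as np
from scipy import differentiate, optimize

def loss(x):
    # stand-in for the real loss; accepts (..., nDim), reduces the last axis
    return np.sum(np.asarray(x) ** 2, axis=-1)

def jac(x):
    # differentiate.jacobian passes abscissae with the coordinate axis first,
    # so move it to the last axis before calling the batched loss.
    res = differentiate.jacobian(
        lambda xi: loss(np.moveaxis(xi, 0, -1)),
        np.asarray(x, dtype=float),
        order=2,                    # low order: fewer function evaluations
        tolerances={"rtol": 1e-4},  # coarse tolerance: stop iterating early
    )
    return np.ravel(res.df)         # flatten to the (nDim,) gradient shape

x0 = np.zeros(10)
sol = optimize.basinhopping(loss, x0, minimizer_kwargs={"method": "SLSQP", "jac": jac})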

Based on observing the sample points, it seems SLSQP performs numerical differentiation, sampling 11 points per gradient evaluation. In principle, it should be possible to implement this minimizer with vectorized function calls, requesting all 11 sample points from the loss function in a single call.

Yes, you can do that.

Here's an example. This implements a 2-point numerical differentiation approach, based on the code in scipy/optimize/_numdiff.py. It's very similar to what SciPy is doing when you don't give it a Jacobian.

The code optimizes the eggholder function, which is designed to have many local minima to get stuck in.

The important code is inside loss_to_jac(), which is responsible for converting the loss function into its Jacobian.

import scipy.optimize
import numpy as np
import functools
import time
import matplotlib.pyplot as plt


def eggholder(x):
    # vectorized version of eggholder function, originally from
    # scipy docs: https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.shgo.html
    return (-(x[..., 1] + 47) * np.sin(np.sqrt(abs(x[..., 0]/2 + (x[..., 1]  + 47))))
            -x[..., 0] * np.sin(np.sqrt(abs(x[..., 0] - (x[..., 1]  + 47)))))


def plot_eggholder():
    x = np.arange(-512, 513)
    y = np.arange(-512, 513)
    xgrid, ygrid = np.meshgrid(x, y)
    xy = np.stack([xgrid, ygrid]).transpose(1, 2, 0)

    fig = plt.figure(figsize=(8,6))
    ax = fig.add_subplot(111, projection='3d')
    ax.view_init(45, -45)
    ax.plot_surface(xgrid, ygrid, eggholder(xy), cmap='terrain')
    ax.set_xlabel('x')
    ax.set_ylabel('y')
    plt.show()
    

def loss_to_jac(func, N, eps=None):
    if eps is None:
        # Use square root of machine precision by default
        eps = (np.finfo(float).eps) ** 0.5
    array_index = np.arange(N)
    def jac(x0, *args, **kwargs):
        x0 = np.asarray(x0)
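        # Row 0 holds x0 itself; rows 1..N each perturb one coordinate, so a
        # single vectorized call evaluates all N + 1 points at once.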
        evaluated_points = np.zeros((N + 1, N))
        evaluated_points += x0
        abs_step = x0 * eps
        # Avoid step of zero
        dx = ((x0 + abs_step) - x0)
        abs_step = np.where(dx == 0, eps, abs_step)
        # Find actual dx after rounding error, and replacing zeros
        dx = ((x0 + abs_step) - x0)
        # Add absolute step to lower diagonal of evaluated_points
        evaluated_points[1:][array_index, array_index] += abs_step
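        # One batched call to the loss, then forward differences per coordinate.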
        result = func(evaluated_points, *args, **kwargs)
        f0 = result[0]
        df = result[1:] - f0
        J = df / dx
        return J
    return jac


def main():
    plot_eggholder()
    x0 = np.array([0, 0])

    jac_from_loss = loss_to_jac(eggholder, len(x0))
    for jac in [None, jac_from_loss]:
        start = time.time()
        print("Using jacobian:", jac)
        result = scipy.optimize.basinhopping(
            eggholder,
            x0,
            minimizer_kwargs=dict(
                method="SLSQP",
                jac=jac,
            ),
            niter=1000
        )
        print(result)
        print(f"Duration: {time.time() - start:.2f}")
        print("\n" * 2)


if __name__ == "__main__":
    main()

On my system, optimizing with the vectorized Jacobian is about 40% faster. The exact speedup will depend on how many dimensions you have; eggholder is only a 2-dimensional function, which requires fewer calls to differentiate numerically than a 10-dimensional one would.
