In the configuration management library Hydra, it is possible to only partially instantiate classes defined in configuration using the `_partial_` keyword. The library explains that this results in a `functools.partial`. I wonder how this interacts with seeding, e.g. with

- PyTorch's `torch.manual_seed()`
- Lightning's `seed_everything`
- etc.

My reasoning is that if I use the `_partial_` keyword while specifying all parameters for `__init__`, then I would essentially obtain a factory which could be called after setting the seed, in order to do multiple runs. But this assumes that `_partial_` does not bake the seed in already. To my understanding that should not be the case. Is that correct?
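For concreteness, this is the factory pattern I have in mind, reduced to plain `functools.partial` (a minimal sketch; `MyModel` and its random init are placeholders):

```python
from functools import partial
import random


class MyModel:
    def __init__(self, width):
        # stand-in for random weight initialization
        self.w = [random.random() for _ in range(width)]


# all __init__ parameters specified up front, as _partial_ would do
factory = partial(MyModel, width=4)

random.seed(123)           # seed is set only now, after the factory was created
run_a = factory()
random.seed(123)
run_b = factory()
assert run_a.w == run_b.w  # holds only if the factory did not bake the seed in
```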
3 Answers
Before using `hydra.utils.instantiate`, no third-party code is run by Hydra. So you can set your seeds before each use of `instantiate`; or, if you create a partial, before each call to the partial.
Here is a complete toy example, based on the overview in Hydra's docs. It creates a partial that instantiates an optimizer, and a model that takes the callable `optim_partial` as an argument.
```yaml
# config.yaml
model:
  _target_: __main__.MyModel
  lr: 0.01
  optim_partial:
    _partial_: true
    _target_: __main__.MyOptimizer
    algo: SGD
```
```python
from functools import partial
from typing import Callable
import random
from pprint import pprint

import hydra
from omegaconf import DictConfig, OmegaConf


class MyModel:
    def __init__(self, lr, optim_partial: Callable[..., "MyOptimizer"]):
        self.optim_partial = optim_partial
        self.optim1 = self.optim_partial()
        self.optim2 = self.optim_partial()


class MyOptimizer:
    def __init__(self, algo):
        print(algo, random.randint(0, 10000))


@hydra.main(config_name="config", config_path="./", version_base=None)
def main(cfg: DictConfig):
    # Check out the config
    pprint(OmegaConf.to_container(cfg, resolve=False))
    print("type of cfg.model.optim_partial", type(cfg.model.optim_partial))

    # Create the functools.partial
    optim_partial: partial[MyOptimizer] = hydra.utils.instantiate(cfg.model.optim_partial)

    # Set the seed before you call a partial
    random.seed(42)
    optimizer1: MyOptimizer = optim_partial()
    optimizer2: MyOptimizer = optim_partial()

    random.seed(42)
    optimizer1b: MyOptimizer = optim_partial()
    optimizer2b: MyOptimizer = optim_partial()

    # model is not a partial; set the seed before creation
    random.seed(42)
    model: MyModel = hydra.utils.instantiate(cfg.model)


if __name__ == "__main__":
    main()
```
```text
# Output
{'model': {'_target_': '__main__.MyModel',
           'lr': 0.01,
           'optim_partial': {'_partial_': True,
                             '_target_': '__main__.MyOptimizer',
                             'algo': 'SGD'}}}
type of cfg.model.optim_partial <class 'omegaconf.dictconfig.DictConfig'>
SGD 1824
SGD 409
SGD 1824
SGD 409
SGD 1824
SGD 409
```
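Note how resetting the seed to 42 reproduces the same pair of draws (1824, 409) three times: for optimizer1/optimizer2, for optimizer1b/optimizer2b, and for the two optimizers created inside MyModel.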
Generally speaking, Hydra is independent of PyTorch and does not directly interact with it (except via plugins). `_partial_` has nothing at all to do with PyTorch or seeding. At a glance, what you are suggesting should work, but it's best if you verify it yourself.
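One quick way to verify it, independent of Hydra (a minimal sketch with plain `functools.partial`, which is what Hydra's `_partial_` produces):

```python
import torch
from functools import partial

make_weights = partial(torch.randn, 3)  # arguments bound, nothing executed yet

torch.manual_seed(0)
a = make_weights()
torch.manual_seed(0)
b = make_weights()
print(torch.equal(a, b))  # True: the RNG state is read at call time, not at creation time
```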
Your understanding is correct. Using `_partial_` in Hydra simply returns a `functools.partial` object and does not immediately execute the class constructor or otherwise "bake in" the seed. As a result, the seed is not fixed at the time the partial is created. You can safely call `torch.manual_seed(...)` or any other seed-setting function just before you invoke the partial object multiple times for reproducible runs.
A common pattern is something like:
```python
import torch
import hydra
from hydra import compose, initialize

# For example, your Hydra config defines a partial for your model
# (i.e. cfg.model contains _partial_: true)
with initialize(config_path="conf", version_base=None):
    cfg = compose(config_name="config")
    # instantiate returns a functools.partial(MyModel, ...) here
    model_partial = hydra.utils.instantiate(cfg.model)

# Then, in your experiment loop:
for seed in [123, 456, 789]:
    torch.manual_seed(seed)
    model = model_partial()  # actually instantiate the model under the given seed
    # train or evaluate your model
```
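The same loop works with Lightning's `seed_everything` mentioned in the question (a sketch, assuming `pytorch_lightning` is installed and `model_partial` is the factory from above):

```python
from pytorch_lightning import seed_everything

for seed in [123, 456, 789]:
    seed_everything(seed)    # seeds Python's random, NumPy and torch in one call
    model = model_partial()  # same factory as above
```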