In the configuration management library Hydra, it is possible to only partially instantiate classes defined in configuration using the `_partial_` keyword. The library explains that this results in a `functools.partial`. I wonder how this interacts with seeding, e.g. with

- PyTorch's `torch.manual_seed()`
- Lightning's `seed_everything`
- etc.

My reasoning is that if I use the `_partial_` keyword while specifying all parameters for `__init__`, then I would essentially obtain a factory which could be called after setting the seed, in order to do multiple runs. But this assumes that `_partial_` does not bake the seed in already. To my understanding that should not be the case. Is that correct?
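For concreteness, this is the factory pattern I have in mind, reduced to plain `functools.partial` (a minimal sketch; `MyModel` and its random init are placeholders):

```python
from functools import partial
import random


class MyModel:
    def __init__(self, width):
        # stand-in for random weight initialization
        self.w = [random.random() for _ in range(width)]


# all __init__ parameters specified up front, as _partial_ would do
factory = partial(MyModel, width=4)

random.seed(123)           # seed is set only now, after the factory was created
run_a = factory()
random.seed(123)
run_b = factory()
assert run_a.w == run_b.w  # holds only if the factory did not bake the seed in
```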
3 Answers
Before using `hydra.utils.instantiate`, no third-party code is run by Hydra. So you can set your seeds before each use of `instantiate`; or, if you create a partial, before each call to the partial.
Here is a complete toy example, based on the overview in Hydra's docs. It creates a partial that instantiates an optimizer, and a model that takes the callable `optim_partial` as an argument.
```yaml
# config.yaml
model:
  _target_: __main__.MyModel
  lr: 0.01
  optim_partial:
    _partial_: true
    _target_: __main__.MyOptimizer
    algo: SGD
```
```python
from functools import partial
from typing import Callable
import random
from pprint import pprint

import hydra
from omegaconf import DictConfig, OmegaConf


class MyModel:
    def __init__(self, lr, optim_partial: Callable[..., "MyOptimizer"]):
        self.optim_partial = optim_partial
        self.optim1 = self.optim_partial()
        self.optim2 = self.optim_partial()


class MyOptimizer:
    def __init__(self, algo):
        print(algo, random.randint(0, 10000))


@hydra.main(config_name="config", config_path="./", version_base=None)
def main(cfg: DictConfig):
    # Check out the config
    pprint(OmegaConf.to_container(cfg, resolve=False))
    print("type of cfg.model.optim_partial", type(cfg.model.optim_partial))

    # Create the functools.partial
    optim_partial: partial[MyOptimizer] = hydra.utils.instantiate(cfg.model.optim_partial)

    # Set the seed before you call a partial
    random.seed(42)
    optimizer1: MyOptimizer = optim_partial()
    optimizer2: MyOptimizer = optim_partial()

    random.seed(42)
    optimizer1b: MyOptimizer = optim_partial()
    optimizer2b: MyOptimizer = optim_partial()

    # model is not a partial; set the seed before creation
    random.seed(42)
    model: MyModel = hydra.utils.instantiate(cfg.model)


if __name__ == "__main__":
    main()
```
```text
# Output
{'model': {'_target_': '__main__.MyModel',
           'lr': 0.01,
           'optim_partial': {'_partial_': True,
                             '_target_': '__main__.MyOptimizer',
                             'algo': 'SGD'}}}
type of cfg.model.optim_partial <class 'omegaconf.dictconfig.DictConfig'>
SGD 1824
SGD 409
SGD 1824
SGD 409
SGD 1824
SGD 409
```
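Note how resetting the seed to 42 reproduces the same pair of draws (1824, 409) three times: for optimizer1/optimizer2, for optimizer1b/optimizer2b, and for the two optimizers created inside MyModel.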
Generally speaking, Hydra is independent of PyTorch and does not directly interact with it (except via plugins). `_partial_` has nothing at all to do with PyTorch or seeding. At a glance, what you are suggesting should work, but it's best if you verify it yourself.
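One quick way to verify it, independent of Hydra (a minimal sketch with plain `functools.partial`, which is what Hydra's `_partial_` produces):

```python
import torch
from functools import partial

make_weights = partial(torch.randn, 3)  # arguments bound, nothing executed yet

torch.manual_seed(0)
a = make_weights()
torch.manual_seed(0)
b = make_weights()
print(torch.equal(a, b))  # True: the RNG state is read at call time, not at creation time
```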
Your understanding is correct. Using `_partial_` in Hydra simply returns a `functools.partial` object and does not immediately execute the class constructor or otherwise "bake in" the seed. As a result, the seed is not fixed at the time the partial is created. You can safely call `torch.manual_seed(...)` or any other seed-setting function just before you invoke the partial object multiple times for reproducible runs.
A common pattern is something like:
```python
import torch
import hydra
from hydra import compose, initialize

# For example, your Hydra config defines a partial for your model
# (i.e. cfg.model contains _partial_: true)
with initialize(config_path="conf", version_base=None):
    cfg = compose(config_name="config")
    # instantiate returns a functools.partial(MyModel, ...) here
    model_partial = hydra.utils.instantiate(cfg.model)

# Then, in your experiment loop:
for seed in [123, 456, 789]:
    torch.manual_seed(seed)
    model = model_partial()  # actually instantiate the model under the given seed
    # train or evaluate your model
```
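The same loop works with Lightning's `seed_everything` mentioned in the question (a sketch, assuming `pytorch_lightning` is installed and `model_partial` is the factory from above):

```python
from pytorch_lightning import seed_everything

for seed in [123, 456, 789]:
    seed_everything(seed)    # seeds Python's random, NumPy and torch in one call
    model = model_partial()  # same factory as above
```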