
multiprocessing - python multiprocess sharing value with Value not working as documented - Stack Overflow


I'm learning how to share variables between processes. The official docs say

Data can be stored in a shared memory map using Value...

and the documented example works fine.
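For reference, the working pattern from the docs passes the Value to the child by inheritance, roughly like this (paraphrasing the docs example):

from multiprocessing import Process, Value

def f(n):
    n.value = 3.1415927

if __name__ == '__main__':
    num = Value('d', 0.0)
    p = Process(target=f, args=(num,))  # the Value is part of the process's startup arguments
    p.start()
    p.join()
    print(num.value)  # 3.1415927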

But I get an error when I try to use it with Pool.map:

from multiprocessing import Value, Pool, Manager


def f(args):
    n, = args
    n.value = 1


if __name__ == '__main__':
    n = Value('d', 0.0)
    # n = Manager().Value('d', 0.0) # can workaround the error
    with Pool(1) as pool:
        pool.map(f, [(n,)])
    # RuntimeError: Synchronized objects should only be shared between processes through inheritance
    print(n.value)

Traceback:

Traceback (most recent call last):
  File "D:\0ly\ly\processvaldoc.py", line 13, in <module>
    pool.map(f, [(n,)])
  File "C:\a3\envs\skl\Lib\multiprocessing\pool.py", line 367, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\a3\envs\skl\Lib\multiprocessing\pool.py", line 774, in get
    raise self._value
  File "C:\a3\envs\skl\Lib\multiprocessing\pool.py", line 540, in _handle_tasks
    put(task)
  File "C:\a3\envs\skl\Lib\multiprocessing\connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\a3\envs\skl\Lib\multiprocessing\reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
  File "C:\a3\envs\skl\Lib\multiprocessing\sharedctypes.py", line 199, in __reduce__
    assert_spawning(self)
  File "C:\a3\envs\skl\Lib\multiprocessing\context.py", line 374, in assert_spawning
    raise RuntimeError(
RuntimeError: Synchronized objects should only be shared between processes through inheritance

My Python version is 3.12.9, 64-bit, on Windows 11.

Since there are so many ways to start processes, I haven't read all the documentation or tried them all. I just wonder: what is the essential difference between the documented Process.start() approach and Pool.map that makes the latter fail?

I searched and learned that Manager().Value can solve this. But what is its magic, given that I'm not even using the with Manager() as ... style? And if Manager().Value works in both (maybe all) scenarios, why design a separate multiprocessing.Value that only partly works?

  • Please provide the error dump. How to Ask – AcK Commented Mar 14 at 12:21
  • In either case, shared variables can only be passed to subprocesses through inheritance. You cannot pass shared variables after the subprocess has already started. Take a close look at examples that use the Process. You will notice that shared variables are always passed before starting the subprocess. – ken Commented Mar 14 at 12:39
  • @AcK Thanks, do you mean the traceback lines? Just added, but I'm not sure what dump should be pasted. – Lei Yang Commented Mar 14 at 13:36
  • @ken I init my var just before the process, don't I? – Lei Yang Commented Mar 14 at 13:37
  • 1 This question is similar to: Combine Pool.map with shared memory Array in Python multiprocessing. If you believe it’s different, please edit the question, make it clear how it’s different and/or how the answers on that question are not helpful for your problem. – MT0 Commented Mar 14 at 13:48

1 Answer


multiprocessing.Value uses shared memory and expects to be handed to child processes when they start. Pool(...) is what creates the child processes; pool.map only dispatches tasks to the already-running workers through a queue, so every task argument has to be pickled, and a shared-memory Synchronized object refuses to be pickled outside of process startup.
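You can see that it is the pickling step that fails, without involving a Pool at all. A minimal sketch (not from the original post):

import pickle
from multiprocessing import Value

n = Value('d', 0.0)
# Pool.map has to pickle every task argument to push it through the task queue.
# A Synchronized (shared-memory) object refuses to be pickled outside of
# process startup, which is exactly the RuntimeError in the traceback above.
pickle.dumps(n)  # RuntimeError: Synchronized objects should only be shared ...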

Conceptually there is no technical reason the shared memory could not be passed to children after they start; it is simply how multiprocessing is designed. That choice simplifies the implementation greatly and arguably improves safety.

Most operating systems let you create anonymous, private shared memory that only a child process can inherit at startup, but you can also use named shared memory, which can be passed around after startup; PyTorch's shared memory does this.
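If you do want to hand shared memory to an already-running worker, the stdlib has a named variant. Here is a rough sketch using multiprocessing.shared_memory (Python 3.8+); unlike Value it carries no lock, and the size below is just illustrative:

import struct
from multiprocessing import Pool
from multiprocessing import shared_memory

def worker(shm_name):
    # Attach to the existing block by name; no inheritance needed,
    # because only the name (a plain string) travels through the queue.
    shm = shared_memory.SharedMemory(name=shm_name)
    shm.buf[:8] = struct.pack('d', 1.0)
    shm.close()

if __name__ == '__main__':
    shm = shared_memory.SharedMemory(create=True, size=8)
    shm.buf[:8] = struct.pack('d', 0.0)
    with Pool(1) as pool:
        pool.map(worker, [shm.name])
    print(struct.unpack('d', shm.buf[:8])[0])  # 1.0
    shm.close()
    shm.unlink()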

With Pool you can still hand the shared Value to the workers at startup by using the initializer argument:

from multiprocessing import Pool, Value

global_var = None


def initializer_func(shared_var):
    # Runs once in each worker right after it starts, so the shared Value
    # is handed over at startup instead of being pickled with every task.
    global global_var
    global_var = shared_var


def f(args):
    print(f"{args=}", flush=True)
    print(f"{global_var=}", flush=True)
    n = global_var
    n.value, = args  # unpack the single task argument into the shared Value


if __name__ == '__main__':
    n = Value('d', 0.0)
    with Pool(1, initializer=initializer_func, initargs=(n,)) as pool:
        pool.map(f, [(2,)])
    print(f"{n.value=}")

which prints

args=(2,)
global_var=<Synchronized wrapper for c_double(0.0)>
n.value=2.0

Manager doesn't use shared memory. It is a separate process that stores the objects locally and uses sockets (or named pipes) for communication; it is basically a tiny Redis implemented in Python that lets other processes make RPC calls.
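That is why the Manager().Value workaround from the question works with Pool.map: what gets pickled is only a small proxy object that knows how to reach the manager process. A sketch:

from multiprocessing import Pool, Manager

def f(args):
    n, = args
    n.value = 1   # not a memory write: an RPC to the manager process

if __name__ == '__main__':
    with Manager() as manager:
        n = manager.Value('d', 0.0)
        with Pool(1) as pool:
            pool.map(f, [(n,)])
        print(n.value)  # 1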

You spawn a manager process, and other processes send it messages saying "get this value" or "set this value to x". It is much, much slower, and the spawned process consumes system resources, but any process can connect to it, not necessarily a child process, and not even necessarily on the same PC if you manually bind it to a public port. It uses key-based authentication [1] to guard against the obvious vulnerabilities.
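For example, a manager can be exposed on a TCP port and used from a completely unrelated process. This is a condensed version of the BaseManager example from the docs; the host name, port and authkey are placeholders:

# server.py
from multiprocessing.managers import BaseManager
from queue import Queue

queue = Queue()

class QueueManager(BaseManager):
    pass

QueueManager.register('get_queue', callable=lambda: queue)

if __name__ == '__main__':
    m = QueueManager(address=('', 50000), authkey=b'change-me')
    m.get_server().serve_forever()

# client.py (any process, even on another machine)
from multiprocessing.managers import BaseManager

class QueueManager(BaseManager):
    pass

QueueManager.register('get_queue')

if __name__ == '__main__':
    m = QueueManager(address=('server-host', 50000), authkey=b'change-me')
    m.connect()
    m.get_queue().put('hello from far away')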


The biggest limitations of shared memory are that it cannot store pointers and it cannot grow dynamically [2], which means it cannot hold generic Python objects; it only stores flat basic types such as int, float and str (see the types accepted by ShareableList). A Manager, on the other hand, can handle any picklable object, so it is useful for sharing a dictionary or a list of complex Python objects.
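A quick illustration of that difference (a sketch, not from the original answer):

from multiprocessing import Manager
from multiprocessing.shared_memory import ShareableList

# Shared memory: only flat primitives (int, float, bool, str, bytes, None),
# with a size fixed at creation time.
sl = ShareableList([0, 3.14, 'hello'])
sl[0] = 42           # fine: replace a value in place
# sl.append(99)      # no such method: the block cannot grow

# Manager: any picklable Python object, at the cost of an extra process and RPC.
with Manager() as m:
    d = m.dict()
    d['nested'] = {'a': [1, 2, 3]}
    print(d['nested'])

sl.shm.close()
sl.shm.unlink()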

[1] IMHO the authentication alone is not enough; I wouldn't recommend exposing a Manager to the web. Better to use gRPC + TLS if you want actual security.

[2] Shared memory can grow under a few OSes, but synchronizing that growth across processes is not practical.
