
parallel processing - Mental Model for Hybrid MPI/OpenMP with SLURM - Stack Overflow


Question

Edit: If you feel the question could be improved, please comment with suggestions since downvoting without comment is not particularly constructive.

I am trying to develop a clear mental model for using SLURM to request resources on HPC systems for hybrid MPI/OpenMP jobs. In thinking about it more, I realized there are some gaps in my understanding. From here on I use SLURM terminology, where "CPU" means a single core. Is the image below a correct model for how a command like

srun --ntasks=2 --cpus-per-task=4 --hint=nomultithread hybrid_ompmpi.bin

allocates resources from a simple cluster with a single compute node consisting of a single socket with 8 physical CPUs and each physical CPU has 2 threads (for a total of 8 * 2 = 16 logical CPUs)? Note that hybrid_ompmpi.bin is just a dummy program name.
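
For reference, a node like this would be described in a slurm.conf-style node definition roughly as follows (the node name is a placeholder; actual entries vary by site):

NodeName=node01 Sockets=1 CoresPerSocket=8 ThreadsPerCore=2 CPUs=16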

My understanding is that, since --hint=nomultithread is given, only one hardware thread per CPU is used within the requested resources. Moreover, each MPI process will use 4 CPUs (though this seems off to me, since I normally think of one MPI process per CPU).
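
For concreteness, the equivalent request as a batch script would look something like the sketch below; the OMP_* settings are just one common way to ask for one OpenMP thread per allocated core, not necessarily what any particular site recommends:

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=2                # 2 MPI processes (tasks)
#SBATCH --cpus-per-task=4         # 4 cores ("CPUs" in SLURM terms) per task
#SBATCH --hint=nomultithread      # use only one hardware thread per core

# One OpenMP thread per core allocated to each task
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
export OMP_PLACES=cores           # bind each thread to its own core
export OMP_PROC_BIND=close        # keep a task's threads on neighboring cores

srun ./hybrid_ompmpi.bin

(On recent SLURM versions srun may not inherit --cpus-per-task from sbatch, so passing it to srun explicitly can be safer.)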

Context/Definitions

SLURM refers to a physical or logical core [R7] as a CPU (see [R1]). What is commonly called a CPU (the microprocessor chip) is called a socket in SLURM. Compute nodes can, of course, have multiple sockets.

Threaded memory model taken from [R3] and relation to processes taken from [R4].

From [R3], a process is an independent unit of computation that has ownership of a portion of memory and control over resources in user space.

The meaning of things like nodes, tasks, cpus per task, etc. taken partially from [R5].

The meaning of --hint=nomultithread taken from [R6].

MPI processes and cores [R8].

References

[R1] : University of Siegen: SLURM Terminology

[R2] : Figure 3 from What Every Computer Programmer Should Know About Memory

[R3] : Chapter 7 and Chapter 8 of Parallel and High Performance Computing

[R4] : SO: Does each process have it's own section of data, text , stack and heap in the memory?

[R5] : SO: HPC cluster: select the number of CPUs and threads in SLURM sbatch

[R6] : man srun

[R7] : SO: So what are logical cpu cores (as opposed to physical cpu cores)?

[R8] : SO: MPI cores or processors?

asked Jan 30 at 13:34 by Jared, edited Jan 31 at 5:29
  • I am afraid I failed to read a question here. MPI+OpenMP is a two-step tango: first allocate MPI tasks with several cores each, then have the OpenMP runtime use as many OpenMP threads as allocated cores. – Gilles Gouaillardet Commented Jan 30 at 13:53
  • @GillesGouaillardet I bolded my question, perhaps now it is clearer? Though I think your comment basically answers my question, since my mental model reflects what you have said. – Jared Commented Jan 30 at 13:59
  • Looks good to me! You can srun ... -l grep Cpus_allowed_list /proc/self/status to double check how resources are allocated. – Gilles Gouaillardet Commented Jan 30 at 14:25

1 Answer


To summarize from the comments:

A hybrid MPI + OpenMP approach consists of allocating tasks, where each task is given a set of physical and/or logical cores. OpenMP then runs threads on the cores available to a given task. In the example in the question, each task gets 4 physical cores and OpenMP runs one thread on each of those cores, as opposed to, say, 2 threads on 2 physical cores, which may happen under operating-system scheduling if the user does not pass

--hint=nomultithread

Therefore, the mental model shown in the question is correct.
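
As a quick check, the command suggested in the comments can be run with the same resource request (the exact CPU IDs reported will depend on the machine):

srun --ntasks=2 --cpus-per-task=4 --hint=nomultithread -l \
    grep Cpus_allowed_list /proc/self/status

With -l each output line is prefixed by its task number, and each of the two tasks should report its own distinct set of CPU IDs (typically 4 per task when --hint=nomultithread is used), confirming that every MPI process has 4 cores to itself for its OpenMP threads.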
