slurm + nextflow : invalid status line: `squeue: error: Invalid user: ?` - Stack Overflow

My colleague and I both use the same Slurm-based cluster. I use Nextflow daily on this server without any problem, and he uses Snakemake + Slurm daily on the same server. Today he tried to run a Nextflow workflow for the first time, using my config and my main.nf file.

On his side, however, the jobs appear to be marked as completed without an exit status and without a '.exit' file (the .exit file is created later, once the job has actually ended; see below).

Feb-14 14:32:09.928 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[jobId: 6208569; id: 49; name: MAKE_MINI_BAM (PCRFree); status: COMPLETED; exit: -; error: -; workDir:path/to/nf-workdir/9b/ad1fedb4a9b1f37e07629735f35987 started: 1739539509896; exited: -; ]

Furthermore, when we look at sacct, the job is still running (?):

$ sacct --cluster nautilus -j 6208569
JobID           JobName  Partition    Account  AllocCPUS      State ExitCode 
------------ ---------- ---------- ---------- ---------- ---------- -------- 
6208569      nf-MAKE_M+   standard     thorax          2    RUNNING      0:0 
6208569.bat+      batch                thorax          2    RUNNING      0:0 
6208569.ext+     extern                thorax          2    RUNNING      0:0 

And in the .nextflow.log there is this warning: `squeue: error: Invalid user: ?`

Feb-17 14:50:22.215 [Task monitor] DEBUG nextflow.executor.SlurmExecutor - [SLURM] invalid status line: `squeue: error: Invalid user: ?`
Feb-17 14:50:22.215 [Task monitor] DEBUG nextflow.executor.SlurmExecutor - [SLURM] invalid status line: ``
Feb-17 14:51:22.275 [Task monitor] DEBUG nextflow.executor.SlurmExecutor - [SLURM] invalid status line: `squeue: error: Invalid user: ?`
Feb-17 14:51:22.276 [Task monitor] DEBUG nextflow.executor.SlurmExecutor - [SLURM] invalid status line: ``

On my side, there is no problem. What could be the source of this problem? Thanks!

PS: I don't have any specific config hidden in my home directory.

PS2: I also asked on the Nextflow Slack: https://nextflow.slack/archives/C02T98A23U7/p1739540301422199

Asked Feb 17 at 15:42 and edited Feb 17 at 16:03 by Pierre

2 Answers

Nextflow polls the SLURM queue using:

squeue --noheader -o "%i %t" -t all -u <username>

In the first instance, I would have your colleague try this command with his own username. Internally, Nextflow checks System.getProperty('user.name') to get the username and appends it to the squeue command. Also check whether $USER is set in the environment where Nextflow is run. It might not be set when starting an interactive session, for example.
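
To see what the JVM itself reports, a minimal Java sketch along the following lines could be compiled and run with the same `java` binary that Nextflow picks up. The class name `UserNameCheck` and the helpers `squeueCommand` and `looksValid` are hypothetical names used only for illustration; the only things taken from the answer are `System.getProperty("user.name")`, the `$USER` variable, and the squeue flags. As far as I know, on Linux the JVM derives `user.name` from the system user database rather than from `$USER`, and it can end up as `?` when that lookup fails, but treat that as an assumption to verify on your cluster.

```java
public class UserNameCheck {

    // Hypothetical helper: rebuilds the polling command quoted above for a
    // given username, so it can be compared with a manual run in the shell.
    static String squeueCommand(String user) {
        return "squeue --noheader -o \"%i %t\" -t all -u " + user;
    }

    // Hypothetical helper: true when the name is usable as a `squeue -u` argument.
    static boolean looksValid(String user) {
        return user != null && !user.trim().isEmpty() && !user.equals("?");
    }

    public static void main(String[] args) {
        String jvmUser = System.getProperty("user.name"); // what the JVM reports
        String envUser = System.getenv("USER");           // what the shell exports (may be null)

        System.out.println("user.name = " + jvmUser);
        System.out.println("$USER     = " + envUser);
        System.out.println("command   = " + squeueCommand(String.valueOf(jvmUser)));

        if (!looksValid(jvmUser)) {
            System.out.println("The JVM did not resolve a usable username; squeue -u would likely fail.");
        }
    }
}
```

If `user.name` prints as `?` while `$USER` is correct, the problem is probably in the Java installation or its environment rather than in the Nextflow config.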

In the end, this was 'just' a problem with the Java instance installed alongside Nextflow with conda/mamba: Nextflow was using the wrong local version of Java. I asked my collaborator to install both pieces of software and to set up PATH and JAVA_HOME, and everything went fine.
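
One way to confirm which Java installation is actually picked up is to print the JVM's own view of itself and compare it with JAVA_HOME. The sketch below is only illustrative: the class name `JavaRuntimeCheck` and the helper `sameInstall` are hypothetical, and the comparison is a plain path check, not something Nextflow does itself.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class JavaRuntimeCheck {

    // Hypothetical helper: true when JAVA_HOME points at the running JVM's home.
    // Note: on Java 8 the java.home property may point at a "jre" subdirectory,
    // in which case this simple comparison reports a mismatch.
    static boolean sameInstall(String javaHomeProp, String javaHomeEnv) {
        if (javaHomeProp == null || javaHomeEnv == null || javaHomeEnv.isEmpty()) {
            return false;
        }
        Path running = Paths.get(javaHomeProp).toAbsolutePath().normalize();
        Path configured = Paths.get(javaHomeEnv).toAbsolutePath().normalize();
        return running.equals(configured);
    }

    public static void main(String[] args) {
        String version = System.getProperty("java.version"); // version of the running JVM
        String home = System.getProperty("java.home");       // install directory of the running JVM
        String envHome = System.getenv("JAVA_HOME");         // what the environment asks for (may be null)

        System.out.println("java.version = " + version);
        System.out.println("java.home    = " + home);
        System.out.println("JAVA_HOME    = " + envHome);
        System.out.println("same install = " + sameInstall(home, envHome));
    }
}
```

Run it from the same shell (and the same conda/mamba environment, if any) from which Nextflow is launched, so that PATH and JAVA_HOME match what Nextflow sees.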
