I am trying to run 3DDNA on Alpine, CU's supercomputing cluster, to assemble a genome from long-read and short-read contact data (PacBio HiFi and Arima Hi-C). 3DDNA uses GNU Parallel to parallelize several steps of the assembly process, and GNU Parallel appears to follow the XDG Base Directory Specification. I have had trouble running it because the $TMPDIR and $XDG_CACHE_HOME variables seem to be defined incorrectly. I have defined both in .bashrc and .bash_profile as follows:
export TMPDIR=/scratch/alpine/.colostate.edu/username/463/juicedir/tmp
export XDG_CACHE_HOME=/scratch/alpine/.colostate.edu/username/463/juicedir/cache
When I submit the job, it runs for ~25 seconds and I get this output:
###############
Starting iterating scaffolding with editing:
...starting round 0 of scaffolding:
:) -p flag was triggered. Running LIGer with GNU Parallel support parameter set to true.
:) -s flag was triggered, starting calculations with 15000 threshold starting contig/scaffold size
:) -q flag was triggered, starting calculations with 1 threshold mapping quality
...Using cprops file: 463_scaffolds.0.cprops
...Using merged_nodups file: 463_scaffolds.mnd.0.txt
...Scaffolding all scaffolds and contigs greater or equal to 15000 bp.
...Starting iteration # 1
parallel: Error: $XDG_CACHE_HOME can only contain [-a-z0-9_+,.%:/= ].
:) DONE!
...visualizing round 0 results:
:) -p flag was triggered. Running with GNU Parallel support parameter set to true.
:) -q flag was triggered, starting calculations for 1 threshold mapping quality
:) -i flag was triggered, building mapq without
:) -c flag was triggered, will remove temporary files after completion
...Remapping contact data from the original contig set to assembly
parallel: Error: $XDG_CACHE_HOME can only contain [-a-z0-9_+,.%:/= ].
The output continues in this pattern: the program essentially keeps running on the empty files it creates, and the only error I can identify is
parallel: Error: $XDG_CACHE_HOME can only contain [-a-z0-9_+,.%:/= ].
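One way to narrow this down might be to strip every character in the allowed set from each variable and print whatever is left. The following is only a sketch: offending_chars is a helper name made up for illustration (it is not part of GNU Parallel), and it assumes the set quoted in the error is matched case-sensitively, which may not be how GNU Parallel actually applies it.

```shell
#!/usr/bin/env bash
# Report characters in TMPDIR and XDG_CACHE_HOME that fall outside the set
# quoted in the GNU Parallel error message: [-a-z0-9_+,.%:/= ]
# Assumption: the set is treated as case-sensitive here. If GNU Parallel
# ignores case, any uppercase letters reported below are false positives.

# Hypothetical helper (not part of GNU Parallel): print the characters of $1
# that are NOT in the allowed set.
offending_chars() {
    # Delete every allowed character; whatever remains is disallowed.
    # The trailing '-' in the tr set is a literal hyphen.
    printf '%s' "$1" | LC_ALL=C tr -d 'a-z0-9_+,.%:/= -'
}

for var in TMPDIR XDG_CACHE_HOME; do
    value="${!var:-}"                      # indirect lookup; empty if unset
    bad="$(offending_chars "$value")"
    if [ -n "$bad" ]; then
        printf '%s contains characters outside the set: %s\n' "$var" "$bad"
    else
        printf '%s has no characters outside the set: %s\n' "$var" "$value"
    fi
done
```

Running this inside the submitted job script, rather than on the login node, would show the values the job actually sees, which may differ from what .bashrc and .bash_profile set interactively.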
I can't find a similar error reported elsewhere. For background, I was originally getting
parallel: Error: $TMPDIR can only contain [-a-z0-9_+,.%:/= ].
so I went into each individual .sh file that the program calls and defined $TMPDIR with the --tmpdir flag in every GNU Parallel command.
The last thing I tried was creating $HOME/.cache as a symlink to my desired cache folder in scratch storage. That didn't work either.
Any ideas or experience would be greatly appreciated.