
python - Snakemake - Call multiple scripts in a rule - Stack Overflow


I have a snakemake rule where I call a python script with the script keyword:

rule merge_results:
    input:
        [...]
    output:
        path_results
    script:
        "merge_results.py"

However, I would like to also call an R script in this rule, depending on a condition. I wanted to do something like this:

rule merge_results:
    input:
        [...]
    output:
        path_results
    script:
        "merge_results.py"
        if cond:
            "my_script.R"

but Snakemake won't allow it. I know I could call my scripts using the run directive with shell(), but my Python and R scripts use the snakemake.input variable in many places, so I would have to change a lot of their code to call them differently. Is there a way to avoid that and use the script: keyword with multiple files?

Asked Mar 20 at 16:35 by Kiffikiffe

1 Answer

It is not possible to run multiple scripts using the script directive.

I think you should reconsider your approach. Do these scripts depend on each other, i.e. does one need to run before the other? Do they write to the same output file, or do they generate separate results? Would it be possible to split this into two separate rules and define dependencies between them? You could use a 'flag file' for this: let one rule create an empty output file, marked as temp(), that is then used as input for the other rule. Or, if the scripts do in fact produce different output files, you can use a target rule that aggregates these files by listing them in its input directive.
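
The two-rule idea could look roughly like the sketch below. It assumes that cond is already known when the Snakefile is parsed and that my_script.R writes its own output file. The rule name run_r_script and the path results/my_script_output.txt are hypothetical, and the elided input list is left as in the question.

```python
# Snakefile sketch (Snakemake syntax, not plain Python).
# Assumes path_results and cond are defined earlier in the Snakefile.
R_OUTPUT = "results/my_script_output.txt"  # hypothetical path

# Request the R output only when the condition holds.
targets = [path_results]
if cond:
    targets.append(R_OUTPUT)

rule all:
    input:
        targets

rule merge_results:
    input:
        [...]
    output:
        path_results
    script:
        "merge_results.py"

rule run_r_script:  # hypothetical rule name
    input:
        path_results  # makes this rule wait for merge_results
    output:
        R_OUTPUT
    script:
        "my_script.R"
```

Both rules keep the script directive, so the scripts can go on reading snakemake.input (Python) and snakemake@input (R). Note that in this layout my_script.R receives path_results as its input rather than the original input files.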

If you are able to split this into two rules, you could use the params directive to compute the condition for the if statement, and then within my_script.R check snakemake@params[["myParam"]] to decide whether to run the logic in the script (values from the config file are available as snakemake@config instead). However, since Snakemake expects to find the rule's output file after execution completes, you would have to create an empty file when the condition evaluates to false.
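
The "run the logic or leave an empty file" pattern is shown here as a Python analogue of what my_script.R would do. This is only a sketch: the helper run_or_touch and the parameter name run_step are hypothetical, and in a script launched through the script directive the flag would come from snakemake.params.

```python
from pathlib import Path


def run_or_touch(enabled, output_path, step):
    """Run step(output_path) if enabled, else create an empty placeholder.

    Either way the output file exists afterwards, which is what
    Snakemake checks for once the rule has finished.
    """
    out = Path(output_path)
    out.parent.mkdir(parents=True, exist_ok=True)
    if enabled:
        step(out)    # the real work; it must write the output file itself
    else:
        out.touch()  # empty file so the rule still produces its output
    return out


# Inside a script run via the script directive, the call might look like
# this (run_step and do_work are hypothetical names):
#   run_or_touch(snakemake.params.run_step, snakemake.output[0], do_work)
```

Downstream rules then need to tolerate an empty file, which is the main cost of this approach.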

If you are unable to split this into multiple rules, one way to go about this would be to use the run directive in combination with shell(), passing input and output as command-line arguments:

rule merge_results:
    input:
        [...]
    output:
        path_results
    run:
        shell("python3 merge_results.py -i {input} -o {output}")
        if cond:
            shell("Rscript my_script.R -i {input} -o {output}")

To make this work in Python scripts you will need argparse or, if you are fine with less flexibility, you could omit the -i and -o flags and read the arguments from sys.argv instead. For R scripts, you can use optparse to parse flagged arguments, or commandArgs() to read them without flags. Either way, you will have to edit your scripts to replace the snakemake.input references with the arguments provided on the command line.
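
On the Python side, the argparse variant might look like the sketch below. The merge function is only a placeholder (plain concatenation) standing in for whatever merge_results.py really does. The -i flag takes one or more values because {input} expands to a space-separated list of files.

```python
import argparse


def parse_args(argv=None):
    """Parse -i/-o the way the shell() call above passes them."""
    parser = argparse.ArgumentParser(description="Merge result files (sketch).")
    # nargs="+" collects every file that {input} expands to.
    parser.add_argument("-i", "--input", nargs="+", required=True,
                        help="one or more input files")
    parser.add_argument("-o", "--output", required=True,
                        help="path of the merged output file")
    return parser.parse_args(argv)


def merge(input_paths, output_path):
    """Placeholder merge: concatenate the inputs in order (an assumption)."""
    with open(output_path, "w") as out:
        for path in input_paths:
            with open(path) as handle:
                out.write(handle.read())


def main(argv=None):
    args = parse_args(argv)
    # args.input replaces snakemake.input; args.output replaces snakemake.output[0]
    merge(args.input, args.output)


# In the real script, finish with a call to main() so that it runs when
# invoked as: python3 merge_results.py -i a.txt b.txt -o merged.txt
```

Keeping the parsing in parse_args and the work in merge means only the lines that used to read snakemake.input and snakemake.output need to change.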
