Channel: Active questions tagged python - Stack Overflow

Snakemake: Execute rule based on function with wildcards


I am building a pipeline with Snakemake, and I use a sample sheet (loaded as a pandas DataFrame) as input, like this:

sample,fastq_1,fastq_2,replicate,paired_end
A1,A1_R1.fq.gz,A1_R2.fq.gz,1,True
A2,A2.fq.gz,,1,False
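
For reference, here is a minimal sketch of how pandas parses a sheet like this. It assumes an in-memory copy of the two rows above instead of the real file, and the names `SHEET`, `sheet`, `paired_flags` and `missing_mate` are only illustrative. It shows that the `paired_end` column comes back as booleans and that the empty `fastq_2` field of the single-end sample is read as a missing value.

```python
import io

import pandas as pd

# In-memory copy of the sample sheet shown above (assumption: the real file
# has the same header and layout).
SHEET = """sample,fastq_1,fastq_2,replicate,paired_end
A1,A1_R1.fq.gz,A1_R2.fq.gz,1,True
A2,A2.fq.gz,,1,False
"""

sheet = pd.read_csv(io.StringIO(SHEET), sep=",")

# Map each sample name to its paired_end flag.
paired_flags = dict(zip(sheet["sample"], sheet["paired_end"]))

# The empty fastq_2 field of A2 is parsed as NaN.
missing_mate = sheet["fastq_2"].isna().tolist()

print(sheet["paired_end"].dtype)  # bool
print(missing_mate)               # [False, True]
```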

In the pipeline, I restrict the sample_id wildcard to the sample names:

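import pandas as pd

# Index the sample sheet by sample name, keeping "sample" as a column too.
samples = pd.read_csv('config/samplesheet_valid.csv', sep=",").set_index("sample", drop=False)

# Only sample names from the sheet may match the sample_id wildcard.
wildcard_constraints:
    sample_id = "|".join(samples.index)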

and use a function of the wildcards to choose which rule to run:

def get_paired(wildcards):
    if samples.loc[wildcards.sample_id, "paired_end"]:
        return True
    else:
        return False

if get_paired:
    # Paired-end: name-sort, filter with bampe_rm_orphan.py, then sort again.
    rule name_sort_bam_paired:
        input:
            bam_file = '{BASE_FOLDER}/bowtie2/filter/{sample_id}.flT.sorted.bam',
        output:
            sorted_bam = temp('{BASE_FOLDER}/bowtie2/filter/{sample_id}.clN.name.sorted.bam'),
            out_bam = temp('{BASE_FOLDER}/bowtie2/filter/{sample_id}.clN.bam'),
            out_sort_bam = '{BASE_FOLDER}/bowtie2/filter/{sample_id}.clN.sorted.bam'
        params:
            samtools = config['tools']['samtools'],
            bampe_rm_orphan = 'scripts/bampe_rm_orphan.py',
            python = config['tools']['python']
        threads: lambda wildcards: 10 if 10 < config['threads'] else config['threads']
        shell:
            """
            {params.samtools} sort -n --threads {threads} \
            -o {output.sorted_bam} \
            -T {wildcards.BASE_FOLDER}/bowtie2/filter/{wildcards.sample_id}.mLb.clN.name.sorted \
            {input.bam_file}
            {params.python} {params.bampe_rm_orphan} \
            {output.sorted_bam} {output.out_bam} --only_fr_pairs
            {params.samtools} sort --threads {threads} \
            -o {output.out_sort_bam} \
            -T {wildcards.BASE_FOLDER}/bowtie2/filter/{wildcards.sample_id}.clN.sorted \
            {output.out_bam}
            """
else:
    # Single-end: copy the filtered BAM, then sort it.
    rule name_sort_bam_singled:
        input:
            bam_file = '{BASE_FOLDER}/bowtie2/filter/{sample_id}.flT.sorted.bam',
        output:
            out_bam = temp('{BASE_FOLDER}/bowtie2/filter/{sample_id}.clN.bam'),
            out_sort_bam = '{BASE_FOLDER}/bowtie2/filter/{sample_id}.clN.sorted.bam'
        params:
            samtools = config['tools']['samtools']
        threads: lambda wildcards: 10 if 10 < config['threads'] else config['threads']
        shell:
            """
            cp {input.bam_file} {output.out_bam}
            {params.samtools} sort --threads {threads} \
            -o {output.out_sort_bam} \
            -T {wildcards.BASE_FOLDER}/bowtie2/filter/{wildcards.sample_id}.clN.sorted \
            {output.out_bam}
            """

But when I run this Snakemake pipeline, every sample runs only name_sort_bam_paired. I want name_sort_bam_paired to run only when get_paired returns True, and name_sort_bam_singled to run when get_paired returns False. How can I solve this problem? I would also like to know what causes it.
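
A possible explanation, sketched in plain Python rather than in a Snakefile: `if get_paired:` tests the function object itself, which is always truthy, and the test runs once when the Snakefile is parsed, before any wildcard has a value. The sketch below rests on assumptions: a plain dict (`paired_end`, a hypothetical stand-in for the `samples` DataFrame) and a simplified `get_paired` that takes the sample name directly. The names `branch_taken`, `per_sample`, `paired_regex` and `single_regex` are illustrative only.

```python
# Stand-in for the sample sheet: sample name -> paired_end flag.
# (Assumption: a plain dict replaces the pandas DataFrame so this runs alone.)
paired_end = {"A1": True, "A2": False}


def get_paired(sample_id):
    """Return True if the given sample is paired-end."""
    return bool(paired_end[sample_id])


# 1) The likely cause: the condition tests the function object, not the
#    result of calling it. A function object is always truthy, so the
#    first branch is chosen no matter what the sheet contains.
branch_taken = "paired" if get_paired else "singled"

# Calling the function gives a per-sample answer, but that needs a concrete
# sample name, which is not available while the file is being parsed.
per_sample = {name: get_paired(name) for name in paired_end}

# 2) One possible workaround: split the sample names up front and build one
#    regex per group, for use as a per-rule wildcard constraint.
paired_regex = "|".join(n for n, flag in paired_end.items() if flag)
single_regex = "|".join(n for n, flag in paired_end.items() if not flag)

print(branch_taken)                # paired
print(per_sample)                  # {'A1': True, 'A2': False}
print(paired_regex, single_regex)  # A1 A2
```

If both rules are then defined unconditionally, each could carry its own rule-level `wildcard_constraints:` entry (`sample_id = paired_regex` in name_sort_bam_paired, `sample_id = single_regex` in name_sort_bam_singled), so that each sample matches exactly one rule. This is one option among several and is untested against this pipeline. A sheet with no samples in one of the groups would need separate handling.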

