I am building a pipeline with Snakemake, and I use a sample sheet (loaded as a pandas DataFrame) as input, like this:
sample,fastq_1,fastq_2,replicate,paired_end
A1,A1_R1.fq.gz,A1_R2.fq.gz,1,True
A2,A2.fq.gz,,1,False
In the pipeline, I use the sample name as a wildcard:
import pandas as pd

samples = pd.read_csv('config/samplesheet_valid.csv', sep=",").set_index("sample", drop=False)

wildcard_constraints:
    sample_id = "|".join(samples.index)

and I use a function of the wildcards to choose which rule to run:
def get_paired(wildcards):
    if samples.loc[wildcards.sample_id, "paired_end"]:
        return True
    else:
        return False

if get_paired:
    rule name_sort_bam_paired:
        input:
            bam_file = '{BASE_FOLDER}/bowtie2/filter/{sample_id}.flT.sorted.bam',
        output:
            sorted_bam = temp('{BASE_FOLDER}/bowtie2/filter/{sample_id}.clN.name.sorted.bam'),
            out_bam = temp('{BASE_FOLDER}/bowtie2/filter/{sample_id}.clN.bam'),
            out_sort_bam = '{BASE_FOLDER}/bowtie2/filter/{sample_id}.clN.sorted.bam'
        params:
            samtools = config['tools']['samtools'],
            bampe_rm_orphan = 'scripts/bampe_rm_orphan.py',
            python = config['tools']['python']
        threads: lambda wildcards: 10 if 10 < config['threads'] else config['threads']
        shell:
            """
            {params.samtools} sort -n --threads {threads} \
                -o {output.sorted_bam} \
                -T {wildcards.BASE_FOLDER}/bowtie2/filter/{wildcards.sample_id}.mLb.clN.name.sorted \
                {input.bam_file}

            {params.python} {params.bampe_rm_orphan} \
                {output.sorted_bam} {output.out_bam} --only_fr_pairs

            {params.samtools} sort --threads {threads} \
                -o {output.out_sort_bam} \
                -T {wildcards.BASE_FOLDER}/bowtie2/filter/{wildcards.sample_id}.clN.sorted \
                {output.out_bam}
            """
else:
    rule name_sort_bam_singled:
        input:
            bam_file = '{BASE_FOLDER}/bowtie2/filter/{sample_id}.flT.sorted.bam',
        output:
            out_bam = temp('{BASE_FOLDER}/bowtie2/filter/{sample_id}.clN.bam'),
            out_sort_bam = '{BASE_FOLDER}/bowtie2/filter/{sample_id}.clN.sorted.bam'
        params:
            samtools = config['tools']['samtools']
        threads: lambda wildcards: 10 if 10 < config['threads'] else config['threads']
        shell:
            """
            cp {input.bam_file} {output.out_bam}

            {params.samtools} sort --threads {threads} \
                -o {output.sorted_bam} \
                -T {wildcards.BASE_FOLDER}/bowtie2/filter/{wildcards.sample_id}.clN.sorted \
                {input.out_bam}
            """

But when I run this Snakemake pipeline, every sample is processed by name_sort_bam_paired only. I want name_sort_bam_paired to run only when get_paired returns True, and name_sort_bam_singled to run when get_paired returns False. How can I solve this problem?
I would also like to understand what causes this behaviour.
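
For reference, the difference between testing the function object (`if get_paired:`) and testing its return value for a given sample can be checked in plain Python, outside Snakemake. The block below is only a minimal sketch under stated assumptions: it uses the standard library instead of pandas, an inline copy of the sample sheet (the hypothetical `SAMPLESHEET` string), and a simplified `get_paired` that takes a sample name directly rather than a Snakemake `wildcards` object.

```python
import csv
import io

# Inline copy of the sample sheet from the question (stand-in for the CSV file).
SAMPLESHEET = """\
sample,fastq_1,fastq_2,replicate,paired_end
A1,A1_R1.fq.gz,A1_R2.fq.gz,1,True
A2,A2.fq.gz,,1,False
"""

# Map sample name -> paired_end flag. The csv module yields strings,
# so the flag is obtained by comparing against the literal "True".
paired = {
    row["sample"]: row["paired_end"] == "True"
    for row in csv.DictReader(io.StringIO(SAMPLESHEET))
}


def get_paired(sample_id):
    """Simplified stand-in: return True if the sample is paired-end."""
    return paired[sample_id]


# A bare function object is always truthy, whatever it would return if called.
function_is_truthy = bool(get_paired)

# Calling the function per sample gives the per-sample answer.
per_sample = {sample_id: get_paired(sample_id) for sample_id in paired}

print(function_is_truthy)
print(per_sample)
```

In this sketch the function object itself is truthy, while calling it returns a different value for A1 and A2. The `if get_paired:` line in the Snakefile tests the function object in the same way, and it is evaluated once while the Snakefile is parsed, before any `sample_id` wildcard exists, so only the rule in the first branch is ever defined.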