Channel: Active questions tagged python - Stack Overflow

How to iterate a list created with SeqIO using re.findall, FASTA-input


I have spent way too much time on this now (10+ hours).

The input is a file in FASTA format. The output should be a text file containing the gene ID and the matched patterns (three different patterns).

I wanted to write my own function to avoid writing the same code three times, but I've given up for now and just written it three times (which works fine).

Is there a way to use this: records = list(SeqIO.parse('mytextfile.fasta', 'fasta')) instead of the code that I'm currently using three times (shown below), or some other function? It's for a school assignment, so it shouldn't be too complicated, but I have to use the Bio and re modules to solve it.

    from Bio import SeqIO
    import re

    outfile = 'sekvenser.txt'

    for seq_record in SeqIO.parse('prot_sequences.fasta', 'fasta'):
        match = re.findall(r'W.P', str(seq_record.seq), re.I)
        if match:
            with open(outfile, 'a') as f:
                record_string = str(seq_record.id)
                newmatch = str(match)
                result = record_string + '\t' + newmatch
                print(result)
                f.write(result + '\n')

I've tried this

    records = list(SeqIO.parse('prot_sequences.fasta', 'fasta'))
    new_list = []
    i = r'W.P'
    for i in records:
        match = re.findall(i)
        if match:
            new_list.append(match)
    print(new_list)

But it only gives me the error that findall() is missing 1 required positional argument: 'string'.
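For reference, the error message follows from the call signature: re.findall expects the pattern first and the string to search second, with flags as an optional third argument. A minimal sketch, using a made-up sequence purely for illustration:

```python
import re

pattern = r'W.P'          # the regular expression to look for
sequence = 'MKWAPLLwtp'   # hypothetical sequence, not taken from the FASTA file

# Pattern first, then the string to search; re.I makes the match case-insensitive.
matches = re.findall(pattern, sequence, re.I)
print(matches)  # ['WAP', 'wtp']
```

Calling re.findall with a single argument leaves the 'string' parameter unfilled, which is what the traceback reports.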

As I see it, i is a string (since I assigned it myself). Obviously I'm doing something wrong. If I try to insert the seq_record that I'm using in my other code, it tells me that seq_record isn't defined. I don't understand what I'm supposed to put after the i in the code.
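One likely cause of the confusion is that `for i in records` rebinds i, so inside the loop i is no longer the pattern string but a record from the list. A possible way to avoid both that and the triple copy is a small helper that keeps the pattern and the record in separate names. The sketch below is only an assumption about what would fit the assignment: find_motifs, write_hits, the FakeRecord stand-in, and the second pattern K.L are hypothetical names and values chosen so the example runs without a FASTA file.

```python
import re
from collections import namedtuple


def find_motifs(records, patterns):
    """Return (record_id, pattern, matches) for each pattern that hits a record.

    `records` is any iterable of objects with `.id` and `.seq` attributes.
    """
    compiled = [re.compile(p, re.I) for p in patterns]
    hits = []
    for record in records:              # `record` is the object, not the pattern
        sequence = str(record.seq)      # re needs a plain str to search
        for regex in compiled:
            matches = regex.findall(sequence)
            if matches:
                hits.append((record.id, regex.pattern, matches))
    return hits


def write_hits(hits, outfile):
    """Write one 'id<TAB>matches' line per hit, as in the original loop."""
    with open(outfile, 'a') as f:
        for record_id, _pattern, matches in hits:
            f.write(str(record_id) + '\t' + str(matches) + '\n')


# Hypothetical stand-in records so the sketch runs without Biopython or a file.
FakeRecord = namedtuple('FakeRecord', ['id', 'seq'])
demo_records = [FakeRecord('gene1', 'MKWAPLLwtp'), FakeRecord('gene2', 'MKLLV')]

# r'K.L' is a placeholder for one of the other patterns, which are not given.
hits = find_motifs(demo_records, [r'W.P', r'K.L'])
for record_id, pattern, matches in hits:
    print(record_id, pattern, matches)
```

Because Biopython's SeqRecord objects expose .id and .seq, the list from records = list(SeqIO.parse('prot_sequences.fasta', 'fasta')) could be passed to find_motifs in place of demo_records, and write_hits(hits, 'sekvenser.txt') would then produce the output file.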

