**dpryan** · 01-20-2014, 03:57 AM

You have a couple of problems. Firstly, the values inside "coor" are strings, not ranges, so trying to use them directly as ranges won't work. You could try:

Code:

coors = [[1256617, 1311411], [1973169, 2005648]]
for bounds in coors :
    sub_record = record[bounds[0]:bounds[1]]

and that would likely work. Of course, you then run into the problem that the coordinates you gave are beyond the end of the sequence you retrieved. Also, you ask for two records and then overwrite the first with the second. I presume you want the "for j in coor :" loop inside the "for i in ident :" loop.
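To illustrate the point about coordinates beyond the end of the sequence: Python slicing does not complain when the bounds are too large, it simply returns less than was asked for. A minimal sketch, assuming a plain string as a stand-in for the downloaded record and a hypothetical helper named slice_checked (not part of Biopython):

```python
def slice_checked(seq, start, end):
    """Return seq[start:end], refusing bounds that fall outside the sequence."""
    if not (0 <= start <= end <= len(seq)):
        raise ValueError(
            "bounds %i:%i do not fit a sequence of length %i"
            % (start, end, len(seq))
        )
    return seq[start:end]


# A plain string stands in for the downloaded record (assumption for the demo).
record = "ACGTACGTAC"  # length 10

# Plain slicing past the end does not fail, it just returns less than requested.
silent = record[8:20]   # only the two letters that exist
empty = record[50:60]   # nothing at all

# The checked version returns the slice when the bounds fit...
inside = slice_checked(record, 2, 6)

# ...and raises instead of silently returning an empty result.
try:
    slice_checked(record, 50, 60)
    out_of_range_caught = False
except ValueError:
    out_of_range_caught = True
```

The same len() check should work on a SeqRecord, since it supports len() and slicing in the same way.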

**CrLs** · 01-20-2014, 04:55 AM

Yeah, you are right, I want the "for bounds in coors" loop inside the "for i in ident" loop. I'll try your suggestion and then tell you if it works. Anyway, thank you for your time!

Edit: I want to slice the first genome with the first location, then slice the second genome with the second location (later it will be 100 genomes and 100 locations).

Anyway, I'll try your suggestion and come back later!

**dpryan** · 01-20-2014, 05:06 AM

Ah, then just get rid of the "for j in coor" loop, since you're already setting the index for coor if you nest that within the "for i in ident" loop.
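One way to picture this "one genome, one location" pairing is to walk both lists in step with zip instead of nesting two loops. A rough sketch only: the contents of ident are assumed (the accessions are the ones quoted later in this thread), and pair_up is a hypothetical helper name:

```python
# Assumed inputs: one accession per genome and one [start, end] pair per genome.
ident = ["AE009948", "AE009947"]
coors = [[1256617, 1311411], [1973169, 2005648]]


def pair_up(ids, bounds_list):
    """Pair the n-th accession with the n-th [start, end] bounds."""
    if len(ids) != len(bounds_list):
        raise ValueError("need exactly one pair of bounds per accession")
    return [(acc, start, end) for acc, (start, end) in zip(ids, bounds_list)]


jobs = pair_up(ident, coors)
for acc, start, end in jobs:
    # Each genome is visited once, together with its own coordinates.
    print("%s: %i-%i" % (acc, start, end))
```

With 100 genomes and 100 locations the loop stays the same; only the two lists grow.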

**CrLs** · 01-20-2014, 05:16 AM

Well, at the moment it gives me back this error:
TypeError: slice indices must be integers or None or have an __index__ method
Should I change my coor to something else?
And about removing the "for j in coor" loop, how can I nest that within the "for i in ident" loop? Something like "for i in ident and for bounds in coors:"?

Again, thanks a lot for your answer!

Edit: I changed my coor, I forgot to put the '[ ]', my bad!
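For anyone hitting the same TypeError: it appears whenever a slice bound is something other than an integer, for example a coordinate that is still text. A small sketch that reproduces the error and then converts the bounds with int(); the short string and the values in coor are made up for the demonstration:

```python
# A plain string stands in for the sequence record (assumption for the demo).
record = "ACGTACGTAC"

# Coordinates stored as text, which is what triggers the error.
coor = ["2", "6"]

try:
    record[coor[0]:coor[1]]
    error_message = None
except TypeError as err:
    # Python refuses non-integer slice bounds.
    error_message = str(err)

# Converting each bound to int makes the slice valid.
start, end = (int(value) for value in coor)
sub_record = record[start:end]
```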

**maubp** · 01-20-2014, 07:28 AM

Watch out for different counting conventions when you do the slicing...

Also, you could ask the NCBI to pre-slice the records when you call Entrez.efetch by including the optional seq_start and seq_stop arguments, see: http://www.ncbi.nlm.nih.gov/books/NB...hapter4.EFetch
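On the counting conventions: GenBank and NCBI coordinates count from 1 and include the end position, while Python slices count from 0 and exclude the end. A minimal sketch of the conversion, using a hypothetical helper named genbank_to_slice and a short made-up sequence:

```python
def genbank_to_slice(start, end):
    """Convert 1-based inclusive coordinates to Python slice bounds."""
    # Shift the start down by one; the end stays the same because
    # Python already excludes the end position.
    return start - 1, end


# A short made-up sequence stands in for a genome (assumption for the demo).
sequence = "ACGTACGTAC"

# Bases 3 to 6 in GenBank-style counting.
py_start, py_end = genbank_to_slice(3, 6)
region = sequence[py_start:py_end]
```

The resulting region is 6 - 3 + 1 = 4 bases long, which is a quick sanity check that the inclusive end has been kept.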

**CrLs** · 01-20-2014, 07:56 AM

Hello,
Yep, thank you, I'll check it!
Hmm, to use the optional arguments, should I use 3 loops? One for the genome, one for the start and one for the stop, right? (I want to slice the first genome with the first location, the second genome with the second location, etc.)

**maubp** · 01-20-2014, 08:01 AM

I would use ONE loop, something like this:

Code:

from Bio import Entrez, SeqIO
Entrez.email = "[email protected]"
for i, start, end in [('AE009948', 1256617, 1311411),
                      ('AE009947', 1973169, 2005648)]:
    print("Fetching %s:%i-%i now..." % (i, start, end))
    #code here using Entrez.efetch(...)

**CrLs** · 01-20-2014, 08:09 AM

Well, thank you for your answer. I'll try your way and the old way, and keep the faster one! (I don't know if one takes more memory than the other.)
Anyway, thanks a lot! I'll come back with working code when I'm done with it.

**CrLs** · 01-20-2014, 08:52 AM

OK Peter and Ryan, thank you for your help!
This is the working code. It gets you all the products (or anything else you need) between the locations you want.

Code:

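from Bio import Entrez, SeqIO

# Entrez.email should be set beforehand, as in maubp's example above
for i, start, end in [('AE009948', 1256617, 1311411),
                      ('AE009948', 1973169, 2005648)]:
    # Ask the NCBI for just the requested region of the record
    handle = Entrez.efetch(db="nucleotide", id=i, seq_start=start,
                           seq_stop=end, rettype="gb")
    # Append, so the results of every region end up in the same file
    with open('resultsRegion_note.csv', 'a') as results2:
        for seq_record in SeqIO.parse(handle, "gb"):
            results2.write('\n')
            for feature in seq_record.features:
                if feature.type == "CDS":
                    # [1:-1] strips the list brackets; the [] default
                    # avoids writing "on" (from "None") when no product is set
                    results2.write(str(feature.qualifiers.get('product', []))[1:-1])
    handle.close()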

Feel free to use it (even if I think a lot of people can do the same).
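As a possible variation on the output step, the product names could be written with Python's csv module, which takes care of quoting names that contain commas. This is only a sketch: the dictionaries below are made-up stand-ins for the type and qualifiers of real features, and cds_products is a hypothetical helper name:

```python
import csv
import io

# Made-up stand-ins for feature.type and feature.qualifiers (assumption).
features = [
    {"type": "CDS", "qualifiers": {"product": ["hypothetical protein"]}},
    {"type": "gene", "qualifiers": {}},
    {"type": "CDS", "qualifiers": {}},  # CDS without a product annotation
    {"type": "CDS", "qualifiers": {"product": ["DNA polymerase III, beta subunit"]}},
]


def cds_products(feats):
    """Collect the product names of all CDS features, skipping missing ones."""
    products = []
    for feat in feats:
        if feat["type"] == "CDS":
            products.extend(feat["qualifiers"].get("product", []))
    return products


# Write one CSV row per region; an in-memory buffer replaces the real file here.
out = io.StringIO()
csv.writer(out).writerow(cds_products(features))
row_text = out.getvalue()
```

Reading the row back with csv.reader returns the product names unchanged, including the one with a comma in it.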


Slicing genbank file using biopython [problem]
