Old 07-27-2011, 07:10 AM   #4
kmcarr
Senior Member
 
Location: USA, Midwest

Join Date: May 2008
Posts: 1,177

There is a potential problem with the output you requested, which both of the examples given above produce: the sequence identifiers are not unique. Every sub-contig within scaffold1 will be named "scaffold1" in the resulting FASTA file. Remember that only the first "word" of the definition line (the text after ">" up to the first whitespace) is considered the sequence ID, and it should be unique for every entry in your file. Adjust the example scripts so that each contig gets a unique name, like:

Code:
>scaffold1.1
.......
>scaffold1.2
.......
>scaffold2.1
.......

etc.
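
For illustration, here is a minimal Python sketch of that naming scheme. It is not a patch to either of the scripts above. It assumes the sub-contigs within a scaffold are separated by runs of N, and the names parse_fasta and split_scaffolds are made up for this example.

```python
import re


def parse_fasta(lines):
    """Yield (sequence ID, sequence) pairs.

    The ID is only the first "word" of the definition line.
    """
    seq_id, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if seq_id is not None:
                yield seq_id, "".join(chunks)
            fields = line[1:].split()
            seq_id = fields[0] if fields else ""
            chunks = []
        elif seq_id is not None:
            chunks.append(line)
    if seq_id is not None:
        yield seq_id, "".join(chunks)


def split_scaffolds(lines, gap=r"[Nn]+"):
    """Split each scaffold at gaps and number the pieces per scaffold.

    Assumption: gaps are runs of N. Pieces are named
    <scaffold ID>.1, <scaffold ID>.2, ... so every ID is unique.
    """
    for scaffold_id, seq in parse_fasta(lines):
        # Drop empty pieces caused by leading or trailing gaps.
        contigs = [c for c in re.split(gap, seq) if c]
        for i, contig in enumerate(contigs, start=1):
            yield f"{scaffold_id}.{i}", contig


# Toy input, invented for this example.
example = """>scaffold1 some description
ACGTNNNNNGGCC
TTNNAA
>scaffold2
NNACGT
""".splitlines()

records = list(split_scaffolds(example))
for name, seq in records:
    print(f">{name}\n{seq}")
```

The counter restarts for each scaffold, so the IDs stay unique as long as the scaffold IDs in the input are themselves unique.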