Hi,
I'm working on a genome assembly project using 454 paired-end reads in SFF format. I have already tried Newbler, which works with SFF files directly, but now I want to try other tools as well, so I need the reads in a more widely supported format such as FASTA.
I used sff_extract with the -l and -c options to remove the linkers and clip the ends. To verify the resulting FASTA file, I assembled it with Velvet as single-end data, but the results were very strange: the N50 was only about 15-30 bp. That is surprising given that the coverage is decent (~30x) and the single-end assembly by Newbler was good. So I suspect that the extraction went wrong and that something else has to be clipped from the sequences. Any suggestions on how to do that? Or did I do something wrong?
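As a quick sanity check on the extracted file itself, independent of any assembler, one could look at the read count, the length range and the N50 of the sequences. Below is a minimal Python sketch, assuming the reads are in a plain FASTA file whose path is given on the command line. The function names are only illustrative and are not part of sff_extract or Velvet.

```python
import sys
from typing import Dict, Iterable, List


def read_fasta_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Return {sequence name: sequence length} for FASTA-formatted lines."""
    lengths: Dict[str, int] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the name; fall back to a
            # counter-based name if the header is empty.
            parts = line[1:].split()
            name = parts[0] if parts else "seq%d" % len(lengths)
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def n50(lengths: List[int]) -> int:
    """Length L such that sequences of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty input


if __name__ == "__main__":
    # Hypothetical usage: python fasta_stats.py reads.fasta
    with open(sys.argv[1]) as handle:
        lens = list(read_fasta_lengths(handle).values())
    if lens:
        print("sequences:", len(lens))
        print("min/max length:", min(lens), max(lens))
        print("N50:", n50(lens))
    else:
        print("no sequences found")
```

If the reads themselves turn out to be very short after clipping, the problem is in the extraction step rather than in the assembly. The same script can also be pointed at an assembler's contig FASTA to compare numbers.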
I was thinking of mapping the resulting FASTA reads to the Newbler contigs to see what is wrong with them. However, I have not worked with alignment tools yet, so I would appreciate any suggestions on how to align reads to contigs (preferably with visualization).
Thanks