SEQanswers

Old 12-21-2011, 10:32 AM   #1
attilav
Junior Member
 
Location: Hungary

Join Date: Mar 2011
Posts: 7
weird BWA SAM (samse) output

Hi all!

I've run an alignment job with BWA, and as a result I've got a SAM file.
I read through the SAM format specification on samtools.sourceforge.net, and I also checked the example SAM file that comes with the Samtools example library.
However, my SAM file seems to be different from what it should be (or I'm just stupid, which is highly likely).
First there is the header part, which seems to be all right. The alignment part is where it gets messy. It looks like this:

NG-5232_4_1_1033_2620#0 4 * 0 0 * * 0 0 CGTTACGGTGTCGGTCTCGTAGAGATATGAACCCTCGTCCCCATGGATTCATGCCAGTTCGTTTATCGCTCGGCATACCTCGCATTCCGTCCTCTGTATTANNNNNNN ).,33<B>A<AAAAAAAAA@@=84@###################################################################################

So basically, the first line holds the FASTQ short-read identifier, followed by an 8-character code that, as far as I can tell, contains only 4s, 0s and *s in every case. The second line is the sequence that was supposed to be aligned, and the third line holds the Phred scores.
And all I get is 3 such lines for every sequence.

Can you tell me how to interpret this result, and why I am not getting the 11 mandatory fields per alignment that the format specification mentions?

Thanks, Attila

Last edited by attilav; 12-21-2011 at 10:59 AM.
Old 12-21-2011, 11:33 AM   #2
aggp11
Member
 
Location: Wisconsin

Join Date: Jun 2011
Posts: 87

Attilav,

If I am reading this right, everything except the quality of the read is fine here.

A SAM file is tab-delimited, and it looks like that is exactly what you have.

NG-5232_4_1_1033_2620#0 should be the read name (first column); you can confirm that in the FASTQ file. Each value after it is a separate column, in the order given by the SAM format specification.
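To make the column layout concrete, here is a minimal Python sketch that splits one alignment line on tabs and labels the 11 mandatory fields with the names used in the current SAM specification. The function name `parse_sam_line` and the shortened example record are made up for illustration; they are not part of BWA or Samtools.

```python
# Names of the 11 mandatory SAM fields, in column order.
SAM_FIELDS = ["QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
              "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL"]


def parse_sam_line(line):
    """Split one alignment line on tabs and label the mandatory fields.

    Hypothetical helper for illustration only.
    """
    columns = line.rstrip("\n").split("\t")
    if len(columns) < len(SAM_FIELDS):
        raise ValueError(
            "expected at least 11 tab-separated fields, got %d" % len(columns)
        )
    record = dict(zip(SAM_FIELDS, columns))
    # Anything after column 11 is an optional TAG:TYPE:VALUE field.
    record["OPTIONAL"] = columns[len(SAM_FIELDS):]
    return record


# Shortened, made-up record with the same shape as the unmapped read above.
example = "read1\t4\t*\t0\t0\t*\t*\t0\t0\tCGTTAN\t).,33#"
record = parse_sam_line(example)
for name in SAM_FIELDS:
    print(name, record[name], sep="\t")
```

If a line from the file raises the `ValueError` here, it was probably split across physical lines or is not tab-delimited.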

bwa samse writes one record per read, so the "3 such lines for every sequence" you describe together make up a single record.

I think your alignment worked just fine.

The only thing I would worry about is the base quality in this particular read. A # represents a Q-score of 2, which is really low, and in this read roughly 75% of the bases have a Q-score of 2.
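As a rough check of that number, the sketch below decodes a QUAL string into Phred scores. It relies on the SAM convention that QUAL stores Phred quality plus 33 as ASCII; the helper names are hypothetical and the threshold of 2 is just the value discussed here.

```python
def phred33_scores(quality_string):
    """Convert a SAM QUAL string (ASCII = Phred + 33) to integer scores."""
    return [ord(ch) - 33 for ch in quality_string]


def fraction_at_or_below(quality_string, threshold=2):
    """Fraction of bases whose Phred score is <= threshold."""
    scores = phred33_scores(quality_string)
    if not scores:
        return 0.0
    low = sum(1 for score in scores if score <= threshold)
    return low / len(scores)


# '#' is ASCII 35, so it decodes to Q2; 'A' is ASCII 65, so it decodes to Q32.
print(phred33_scores("A#"))
print(fraction_at_or_below("AA##"))
```

Passing the full QUAL column of a read to `fraction_at_or_below` gives the share of Q2 bases in that read.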

I hope this helps.

Praful
Old 12-21-2011, 03:28 PM   #3
swbarnes2
Senior Member
 
Location: San Diego

Join Date: May 2008
Posts: 912

I don't think there's anything wrong. You did a single-end alignment. That second column can only have 3 different values in single-end data: 0, 4, and 16. The line you posted has a 4, meaning the read didn't align anywhere, and with no paired-end mate, there's nothing more a .sam file can say about an unmapped read.

Some of those other columns have information about where the mate aligned, but since you have no mates, of course those will be empty.
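For reference, the second column is a bitwise FLAG, so 0, 4 and 16 can be decoded mechanically. This is a small sketch using the bit meanings from the SAM specification; the function name `describe_flag` and the wording of the labels are my own.

```python
# FLAG bits as defined in the SAM specification (labels paraphrased).
FLAG_BITS = {
    0x1: "read is paired",
    0x2: "mapped in proper pair",
    0x4: "read unmapped",
    0x8: "mate unmapped",
    0x10: "read on reverse strand",
    0x20: "mate on reverse strand",
    0x40: "first in pair",
    0x80: "second in pair",
    0x100: "secondary alignment",
    0x200: "fails quality checks",
    0x400: "PCR or optical duplicate",
    0x800: "supplementary alignment",
}


def describe_flag(flag):
    """Return the labels of all bits set in a SAM FLAG value."""
    labels = [label for bit, label in FLAG_BITS.items() if flag & bit]
    # With no bits set, the read is single-end and mapped to the forward strand.
    return labels or ["mapped, forward strand"]


for value in (0, 4, 16):
    print(value, describe_flag(value))
```

For the record posted above, `describe_flag(4)` reports only "read unmapped", which matches the `*` and `0` placeholders in the other columns.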
Old 12-21-2011, 04:15 PM   #4
Richard Finney
Senior Member
 
Location: bethesda

Join Date: Feb 2009
Posts: 700

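I think your editor is breaking long lines for you. The output (if on one line) looks good. Do a "head -1" from the command line to check; note that the first lines of the file are header lines starting with @, so look at the first line after those.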

Tags
bwa, sam
