Hi all!
I ran an alignment with BWA and got a SAM file as a result.
I read through the SAM format specification on samtools.sourceforge.net and also looked at the example SAM file that ships with Samtools.
However, my SAM file seems to differ from what it should be (or I'm just stupid, which is highly likely).
The header part comes first and seems to be all right. The alignment part is where it gets messy; it looks like this:
NG-5232_4_1_1033_2620#0 4 * 0 0 * * 0 0 CGTTACGGTGTCGGTCTCGTAGAGATATGAACCCTCGTCCCCATGGATTCATGCCAGTTCGTTTATCGCTCGGCATACCTCGCATTCCGTCCTCTGTATTANNNNNNN ).,33<B>A<AAAAAAAAA@@=84@###################################################################################
So basically, on the first line I get the FASTQ short-read identifier, followed by an 8-character code that, as far as I can tell, only ever contains 4s, 0s and *s. The second line holds the sequence that was supposed to be aligned, and the third line holds the Phred scores.
All I get is three such lines for every sequence.
Can you tell me how to interpret this result, and why I am not getting the 11 mandatory fields per alignment that the format specification mentions?
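One sanity check I could run is to count the columns of a record programmatically instead of trusting what my viewer shows. The sketch below is only that, a sketch: it assumes the fields are tab-separated (as the specification describes) and that the three "lines" might be one long line being wrapped on screen. The helper names `parse_sam_record` and `is_unmapped` are made up for illustration, and the example read is a shortened stand-in for the record above, not real output.

```python
# Names of the 11 mandatory SAM columns (current spec naming; older versions
# of the spec call the 7th to 9th columns MRNM, MPOS and ISIZE).
SAM_FIELDS = ["QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
              "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL"]


def parse_sam_record(line):
    """Split one alignment line on tabs and label the mandatory columns."""
    columns = line.rstrip("\n").split("\t")
    if len(columns) < len(SAM_FIELDS):
        raise ValueError(
            "expected at least %d columns, got %d" % (len(SAM_FIELDS), len(columns))
        )
    record = dict(zip(SAM_FIELDS, columns))
    record["FLAG"] = int(record["FLAG"])
    # Anything after the 11th column is an optional TAG:TYPE:VALUE field.
    record["OPTIONAL"] = columns[len(SAM_FIELDS):]
    return record


def is_unmapped(record):
    """Bit 0x4 of FLAG marks a read that was not aligned."""
    return bool(record["FLAG"] & 0x4)


# Shortened, illustrative record shaped like the one above (assumption:
# the real file uses tabs where the post shows spaces).
example = "\t".join([
    "NG-5232_4_1_1033_2620#0", "4", "*", "0", "0", "*", "*", "0", "0",
    "CGTTACGG", ").,33<B>",
])

record = parse_sam_record(example)
print(len(example.split("\t")))  # number of columns found
print(record["QNAME"], "unmapped" if is_unmapped(record) else "mapped")
```

If the count comes out as 11, the "8-character code" would simply be the FLAG through TLEN columns (`4 * 0 0 * * 0 0`), and a FLAG of 4 would mean the read was reported as unmapped.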
Thanks, Attila