#1
Junior Member
Location: Hungary Join Date: Mar 2011
Posts: 7
Hi all!
I've done an alignment job using BWA, and as a result I've got a SAM file. I read through the SAM format specification on samtools.sourceforge.net, and I also checked out the example SAM file that comes with the Samtools example library. However, my SAM file seems to be different from what it should be (or I'm just stupid, which is highly likely). First there is the header part, which seems to be all right. The alignment part is where it gets messy. It looks like this:

NG-5232_4_1_1033_2620#0 4 * 0 0 * * 0 0
CGTTACGGTGTCGGTCTCGTAGAGATATGAACCCTCGTCCCCATGGATTCATGCCAGTTCGTTTATCGCTCGGCATACCTCGCATTCCGTCCTCTGTATTANNNNNNN
).,33<B>A<AAAAAAAAA@@=84@###################################################################################

So basically, in the first line I get the FASTQ short-read identifier, followed by an 8-character code that, as far as I can tell, only ever contains 4s, 0s and *s. The second line is the sequence that was supposed to be aligned, and the third line holds the Phred scores. All I get is 3 such lines for every sequence.

Can you guys tell me how to interpret this result, and why I'm not getting the standard 11 mandatory fields per alignment that the format specification mentions?

Thanks,
Attila

Last edited by attilav; 12-21-2011 at 11:59 AM.
#2
Member
Location: Wisconsin Join Date: Jun 2011
Posts: 87
Attilav,
If I am reading this right, then everything except the quality of the read is fine here. A SAM file is basically tab-delimited, and it seems like you have a tab-delimited file. NG-5232_4_1_1033_2620#0 would be the read name (first column); you can confirm it in the FASTQ file. Everything after that is a separate column of the SAM format.

BWA lists only one record for each read (sequence), and that's why all you got were "3 such lines for every sequence".

I think your alignment worked just fine. The only thing I would be worried about is the quality of the bases in this particular read. A # represents a Q-score of 2, which is really low, and in this read roughly 75% of the bases have a Q-score of 2.

I hope this helps.
Praful
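To make the "# means Q2" point concrete, here is a minimal sketch of how a quality string can be decoded. It assumes the Phred+33 encoding that the SAM QUAL column uses; the function names are made up for illustration and are not part of BWA or Samtools.

```python
def phred33_scores(quality: str) -> list[int]:
    """Convert a Phred+33 quality string into integer Q-scores."""
    # Assumption: Phred+33 offset, i.e. Q = ASCII code - 33.
    return [ord(ch) - 33 for ch in quality]


def fraction_at_or_below(quality: str, threshold: int = 2) -> float:
    """Return the fraction of bases whose Q-score is <= threshold."""
    scores = phred33_scores(quality)
    if not scores:
        return 0.0
    return sum(score <= threshold for score in scores) / len(scores)


if __name__ == "__main__":
    # '#' is ASCII 35, so it decodes to Q2.
    print(phred33_scores("#"))
    # Hypothetical short quality string: half of it is '#'.
    print(fraction_at_or_below("AA##"))
```

Running `fraction_at_or_below` on the third line of a record like the one posted above would give the share of Q2 bases directly, instead of estimating it by eye.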
#3
Senior Member
Location: San Diego Join Date: May 2008
Posts: 912
I don't think there's anything wrong. You did a single-end alignment. That second column can only have 3 different values in single-end data: 0, 4, and 16. The line you posted has a 4, meaning the read didn't align anywhere, and with no paired-end mate, there's nothing more a .sam file can say about an unmapped read.
Some of those other columns hold information about where the mate aligned, but since you have no mates, of course those will be empty.
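The three flag values mentioned above can be read as bit tests. This is only a sketch: the bit values 0x4 (read unmapped) and 0x10 (read on the reverse strand) come from the SAM specification, while the function name and the returned wording are made up for illustration.

```python
FLAG_UNMAPPED = 0x4   # SAM flag bit: the read itself is unmapped
FLAG_REVERSE = 0x10   # SAM flag bit: the read maps to the reverse strand


def describe_single_end_flag(flag: int) -> str:
    """Describe a FLAG value from a single-end alignment."""
    # Check the unmapped bit first: strand is not meaningful without a hit.
    if flag & FLAG_UNMAPPED:
        return "unmapped"
    if flag & FLAG_REVERSE:
        return "mapped, reverse strand"
    return "mapped, forward strand"


if __name__ == "__main__":
    for flag in (0, 4, 16):
        print(flag, describe_single_end_flag(flag))
```

For the record in the first post, the FLAG column is 4, so the sketch reports it as unmapped.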
#4
Senior Member
Location: bethesda Join Date: Feb 2009
Posts: 700
I think your editor is breaking long lines for you. The output (if on one line) looks good. Do a "head -1" from the command line.
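If you would rather check this without relying on an editor at all, a small script can count the tab-separated columns of the first alignment record. This is a rough sketch: the function name and the sample record are hypothetical, and the column names follow the current SAM specification.

```python
MANDATORY_FIELDS = [
    "QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
    "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL",
]


def first_alignment_fields(lines):
    """Return the tab-separated fields of the first non-header SAM line."""
    for line in lines:
        if line.startswith("@"):  # header lines start with '@'
            continue
        return line.rstrip("\n").split("\t")
    return []


if __name__ == "__main__":
    # Hypothetical two-line SAM content: one header line, one unmapped read.
    sample = [
        "@HD\tVN:1.0\n",
        "read1\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\t####\n",
    ]
    fields = first_alignment_fields(sample)
    print(len(fields))
    print(dict(zip(MANDATORY_FIELDS, fields)))
```

To try it on a real file, pass an open file handle instead of the sample list, for example `first_alignment_fields(open("aln.sam"))`, where `aln.sam` is a placeholder name. If the count is at least 11, the record sits on one line and the 3-line look is only line wrapping.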