#1
Junior Member
Location: Tulsa, OK
Join Date: Feb 2017
Posts: 3
I've tried two different header styles in my input FASTQ files when running BWA:
@SN7001163:162:C4A1UACXX:1:1101:1062:2076/1
and
@SN7001163:162:C4A1UACXX:1:1101:1062:2076 1:N:0:GTCCGCA
My goal is to be able to tell which mate I'm looking at in the FASTQ file, but the mate information seems to get stripped in the SAM output, where from "bwa sampe" I get lines like this:
Code:
SN7001163:162:C4A1UACXX:1:1101:1062:2076 77 * 0 0 * * 0 0 GTTTGCTTGGCTGTGAGCTTGTCCGACACGGGCCACCAGGAGAGTGAGATACACCGAGACGAGCATCCTGTCTTTCTCTCGGACGGTTCCACAACAAATAA @?@DDD?;<;F>?<2A<E<FFC9:FE8):8@?FFF@FF=;=;D;).).7>77==EB'93;;3=@@:@(:3,+(4::@B>@5?-<@B<?<34>ABB1<8:43
SN7001163:162:C4A1UACXX:1:1101:1062:2076 141 * 0 0 * * 0 0 GCCATGTTGAGTGAGAATTTATTATTTGTTGTGGAACC ;<;;(42@9)@)84):46=69416)2@:@:<=1(66@?
#2
Junior Member
Location: Tulsa, OK
Join Date: Feb 2017
Posts: 3
I realize that those two reads didn't actually align, so the SAM lines were pretty minimal. Here is a pair that did:
Code:
SN7001163:162:C4A1UACXX:1:1101:1174:2116 81 Locus_14841_Transcript_1__1_Confidence_0.750_Length_603 292 37 101M = 294 -99 CTCGTCATTTCAATGCCCCCTCTCATATCAGAAGGAAAATCATGAGTGCTCCTTTGTCAAAAGAGCTGAGAGCAAAGTACAATGTGAGAAGTATGCCCATT >BBDDDDDDDDDDDBDFFHHHHIIHJJJJJJJJJJJJJJIIJJJJJJJJJJJJJJIIIJJJJIJJJJJJJIJJJJJJJHJJJIJJJJJHHHHHFFFFFCCC XT:A:U NM:i:0 SM:i:37 AM:i:37 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:101
SN7001163:162:C4A1UACXX:1:1101:1174:2116 161 Locus_14841_Transcript_1__1_Confidence_0.750_Length_603 294 37 101M = 292 99 CGTCATTTCAATGCCCCCTCTCATATCAGAAGGAAAATCATGAGTGCTCCTTTGTCAAAAGAGCTGAGAGCAAAGTACAATGTGAGAAGTATGCCCATTAG BCBFFFFFHHHH?HIJJJJJJJJJJJJJJJJJJJJJJJJJJJJJHIJJJJGHIIHHIJJJJJJJJJIJJJJJHHHHHHFFFFFFFEEEEEEEEDDDDDDDC XT:A:U NM:i:1 SM:i:37 AM:i:37 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:99C1
#3
Senior Member
Location: Germany
Join Date: Apr 2012
Posts: 215
[original post deleted because I misunderstood the question]
gringer's answer below is correct. You can confirm this simply by swapping the R1 and R2 reads.
Last edited by WhatsOEver; 05-11-2017 at 01:36 AM.
#4
David Eccles (gringer)
Location: Wellington, New Zealand
Join Date: May 2011
Posts: 838
There are flags in the SAM file for the first (and last) read attached to a template sequence. If a bitwise AND of the flag field with 0x40 returns non-zero, then it is the first read of a template sequence. In the case of the two examples you have, here is the full flag breakdown:
Code:
81 = 0101 0001
  Paired
  Reverse-complemented
  First read in the template
161 = 1010 0001
  Paired
  Other read is reverse-complemented
  Last read in the template
These flags can be filtered using samtools view:
Code:
samtools view -b -f 0x40 in.bam > out_FirstRead.bam
samtools view -b -F 0x40 in.bam > out_notFirst.bam
Whether or not a read is first or last is particularly important for strand-specific sequencing, because it allows you to distinguish between templates that are oriented in the same direction as the primary transcript, and those that are not (e.g. siRNA).
Last edited by gringer; 05-11-2017 at 01:14 AM.
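For anyone who wants to do the same check outside of samtools, here is a minimal Python sketch of the bitwise test described above. The bit values are the ones defined for the SAM FLAG field; the function names (`decode_flag`, `mate_number`, `name_with_mate`) are made up for illustration and are not part of BWA, samtools, or any library. Appending "/1" or "/2" to the read name is just one possible way to make the mate visible again, not something BWA does itself.

```python
# Bit values come from the SAM FLAG field definition; the short labels and
# all function names below are illustrative, not an official API.
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "properly paired",
    0x4: "unmapped",
    0x8: "mate unmapped",
    0x10: "reverse-complemented",
    0x20: "mate reverse-complemented",
    0x40: "first read in the template",
    0x80: "last read in the template",
    0x100: "secondary alignment",
    0x200: "fails quality checks",
    0x400: "duplicate",
    0x800: "supplementary alignment",
}


def decode_flag(flag):
    """Return the labels of every bit that is set in a SAM flag value."""
    return [label for bit, label in SAM_FLAG_BITS.items() if flag & bit]


def mate_number(flag):
    """Return 1 for the first read, 2 for the last read, otherwise None."""
    first = bool(flag & 0x40)
    last = bool(flag & 0x80)
    if first and not last:
        return 1
    if last and not first:
        return 2
    return None  # unpaired, or both/neither bit set


def name_with_mate(sam_line):
    """Return the read name of one SAM record with /1 or /2 appended.

    Only the first two columns (QNAME and FLAG) are inspected.
    """
    qname, flag = sam_line.split(None, 2)[:2]
    mate = mate_number(int(flag))
    return qname if mate is None else f"{qname}/{mate}"


if __name__ == "__main__":
    # The four flag values that appear in this thread.
    for flag in (77, 141, 81, 161):
        print(flag, mate_number(flag), ", ".join(decode_flag(flag)))
```

Applied to the unaligned pair from the first post, 77 and 141 both decode as paired, unmapped, mate unmapped, with 77 carrying the first-read bit and 141 the last-read bit, so the mate can still be recovered even when nothing aligned.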
Tags: bwa, sam