Hello
I recently prepped a paired-end directional RNA-seq library using Epicentre's ScriptSeq kit.
With this kit, read 1 is always sense to the original transcript and read 2 is always antisense to it.
My question is: how does an aligner (TopHat, GSNAP, BWA, etc.) determine strandedness given this information?
I would think that strandedness can be determined from read 1. Is this correct?
When I look at my BAM file (SAM output below), I do not see anything that would assign strandedness to the read.
I don't think the 9th column is a reliable readout, because according to the samtools manual: "The leftmost segment has a plus sign and the rightmost has a minus sign. The sign of segments in the middle is undefined." A sign assigned that way only reflects mapping position, so it cannot indicate the strand of a transcript that is transcribed from right to left.
SAM output:
D5N1JJN1:93:D09AFACXX:1:2105:1716:91336 163 chr3 131045 40 2S97M = 131117 173 CTCCTGAATTCTTTCTTGCATAAGATCCAAGAACCCTCTTTTGGAGTCTGAATTAGGACCCCTTTCCTGCAACACCTATGCCATGCAAAGTTAACAACC CCFFFFFHHHHHJJJJJJJJJJJJIJJJJJJEGHHJJJJJJJJJGIIJJGHIIJJJJJJJJJIJIJIJJJJHHHHFFFFCCEEEEDCDDCDDEEDDDDD MD:Z:97 NH:i:1 NM:i:0 SM:i:40
D5N1JJN1:93:D09AFACXX:1:2105:1716:91336 83 chr3 131117 40 99M = 131045 -173 CCTATGCCATGCAAAGTTAACAACCCACATACTGTGGATTAGGATATGGGTTGCCCCCCTTTGAAATATGGGGTCATTATTTTGCCTGCCACACTGCCC DDC@ADDDEEDDDEDDEDDDBDDDDEEEEEDDDDDDDDDDEEDDCCDDDDDDDDJJJJJIHFFJIHHGEJJJJJIIJIIJGJIIJJIHHHHHFFFFFCC MD:Z:93G5 NH:i:1 NM:i:1 SM:i:40
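For reference, the 2nd column (FLAG) does record whether each read is read 1 or read 2 and whether it mapped to the reverse strand. Here is a minimal Python sketch of how a transcript strand could be derived from it. It assumes read 1 is sense to the transcript, as described for this kit. The function name `infer_transcript_strand` is hypothetical and is not taken from TopHat, GSNAP, or BWA, so this is only an illustration of the logic, not how any particular aligner does it.

```python
# SAM FLAG bits, as defined in the SAM specification.
READ_REVERSE = 0x10    # read is mapped to the reverse strand
FIRST_IN_PAIR = 0x40   # read is read 1
SECOND_IN_PAIR = 0x80  # read is read 2


def infer_transcript_strand(flag, read1_is_sense=True):
    """Return '+' or '-' for the presumed strand of the source transcript.

    Hypothetical helper: assumes a stranded paired-end protocol in which
    one mate is sense and the other antisense to the transcript.
    """
    if not flag & (FIRST_IN_PAIR | SECOND_IN_PAIR):
        raise ValueError("FLAG does not mark the read as read 1 or read 2")

    read_on_minus = bool(flag & READ_REVERSE)
    is_read1 = bool(flag & FIRST_IN_PAIR)

    # The read is sense to the transcript if it is the mate the
    # protocol designates as the sense mate.
    read_is_sense = (is_read1 == read1_is_sense)

    # A sense read shares the transcript's strand; an antisense read
    # maps to the opposite strand.
    transcript_on_minus = read_on_minus if read_is_sense else not read_on_minus
    return "-" if transcript_on_minus else "+"


# FLAG values of the two records shown above.
for flag in (163, 83):
    print(flag, infer_transcript_strand(flag))
```

Under that assumption, both records above (FLAG 163 = read 2 mapped forward, FLAG 83 = read 1 mapped reverse) come out as '-', which would mean the fragment derives from a minus-strand transcript.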
Thanks for any help.