
**Brian Bushnell** · 07-23-2014, 12:03 PM

Try reverse-complementing the reads prior to mapping.

...just kidding! Actually, assuming you are using a stranded protocol, the strand that reads map to is NOT affected by the library type flag you give TopHat. That flag only affects downstream processing with Cufflinks/Cuffdiff. One of the library types is supposed to have read1 mapping to the 'wrong' strand.

On the other hand, if your protocol was unstranded, it doesn't matter either way. The strand bias in that case is probably an artifact of the number of PCR cycles or some kind of 3'/5' binding affinity difference (just a guess).
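To put a number on the strand bias being discussed, one option is to count mapped reads per strand straight from the SAM FLAG field (bit 0x10 set means the read aligned to the reverse strand). This is only a minimal sketch: the function name `count_mapped_strands` is made up for illustration, and it assumes plain-text SAM lines rather than BAM.

```python
from collections import Counter

def count_mapped_strands(sam_lines):
    """Count primary mapped reads on the plus and minus strands.

    Hypothetical helper: expects an iterable of plain-text SAM lines.
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):          # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:              # not a complete alignment record
            continue
        flag = int(fields[1])
        if flag & 0x4:                    # read is unmapped
            continue
        if flag & 0x100 or flag & 0x800:  # secondary or supplementary alignment
            continue
        counts["-" if flag & 0x10 else "+"] += 1
    return counts
```

For a file on disk, something like `count_mapped_strands(open("accepted_hits.sam"))` would do, where the file name is just a placeholder for wherever your SAM output lives.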

**rdsqc22** · 07-23-2014, 01:33 PM

> Originally posted by Brian Bushnell:
>
> Actually, assuming you are using a stranded protocol, the strand that reads map to is NOT affected by the library type flag you give TopHat. That flag only affects downstream processing with Cufflinks/Cuffdiff. One of the library types is supposed to have read1 mapping to the 'wrong' strand.
>
> On the other hand, if your protocol was unstranded, it doesn't matter either way. The strand bias in that case is probably an artifact of the number of PCR cycles or some kind of 3'/5' binding affinity difference (just a guess).

This makes sense. However, the manufacturer documentation for the sequencing platform (Illumina GA IIx) claims it is strand-specific. So I would have to look into the Cuffdiff results to see whether most of my reads are being discarded for most genes, mostly used, or all considered regardless of strand?

Based on my understanding, for stranded data, firststrand means that the read that comes out is equivalent to the original mRNA, and therefore will map to the opposite strand from the gene's location (as I am seeing in my data), whereas secondstrand means that the complement to the cDNA is sequenced, and the read is equivalent to the original gene, which is where it maps on the genome.

Would I be correct, then, to think that this data is probably a firststrand library, which will be clear once I run cuffdiff (on the data I generated from aligning with the firststrand argument in tophat)?
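One way to check this before running Cuffdiff is to compare each read's mapped strand with the annotated strand of the gene it overlaps. The sketch below assumes single-end reads and that you have already paired each read with a gene strand (for example from a GTF). The function name and the 0.8 cutoff are arbitrary choices for illustration, not anything TopHat defines. It relies on the usual reading of the TopHat library types: with fr-firststrand, single-end reads are antisense to the transcript, and with fr-secondstrand they are sense.

```python
def guess_library_type(read_gene_strands, min_fraction=0.8):
    """Guess a TopHat-style library type for single-end reads.

    read_gene_strands: list of (read_strand, gene_strand) pairs, each "+" or "-".
    min_fraction: hypothetical cutoff for calling the library stranded.
    """
    total = len(read_gene_strands)
    if total == 0:
        return "undetermined"
    same = sum(1 for read, gene in read_gene_strands if read == gene)
    frac_same = same / total
    if frac_same >= min_fraction:
        return "fr-secondstrand"   # reads mostly on the gene's own strand
    if frac_same <= 1 - min_fraction:
        return "fr-firststrand"    # reads mostly on the opposite strand
    return "fr-unstranded"         # no clear preference
```

If most reads sit on the strand opposite the gene, as described above, this returns `fr-firststrand`.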

**Brian Bushnell** · 07-23-2014, 01:37 PM

I don't do library prep, but my understanding is that machines are not inherently strand-specific; rather, some machines offer the possibility of using a strand-specific protocol. That does not ensure that your specific library was, in fact, sequenced using a strand-specific protocol; you'd have to check with the people who made it.

Just to be clear, is your data single-ended or paired?

**rdsqc22** · 07-23-2014, 01:42 PM

It is single-end.
I just checked the protocol accompanying the data; it confirms that the reads are indeed strand-specific.

By the way, thank you so much for all of your help so far.

**Brian Bushnell** · 07-23-2014, 02:43 PM

> Originally posted by rdsqc22:
>
> It is single-end.
> I just checked the protocol accompanying the data; it confirms that the reads are indeed strand-specific.
>
> By the way, thank you so much for all of your help so far.

You're welcome. As for 'firststrand' vs 'secondstrand', the documentation in TopHat is confusing, but I eventually concluded that for firststrand, read1 gets the SAM tag "XS:A:+" if it maps to the plus strand and "XS:A:-" if it maps to the minus strand. This gives results concordant with TopHat, anyway, so I consider it empirically correct. According to the TopHat manual:

> Note the use of the custom tag XS. This attribute, which must have a value of "+" or "-", indicates which strand the RNA that produced this read came from.

So... with 'firststrand', a plus-mapped read will get "XS:A:+", which by my reading indicates that its template RNA was minus strand, which indicates the gene is on the plus strand. But the description is vague so I'm not sure.
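Since the description is vague, it may be easier to look at what the aligner actually wrote. The sketch below tallies each mapped read's strand (from FLAG bit 0x10) against its XS:A value, so you can see which combinations occur in your own output. The function name `tally_xs_by_strand` is hypothetical, and plain-text SAM input is assumed.

```python
from collections import Counter

def tally_xs_by_strand(sam_lines):
    """Tally (mapped strand, XS:A value) pairs over mapped reads.

    Reads with no XS:A tag are recorded with the value "none".
    """
    tally = Counter()
    for line in sam_lines:
        if line.startswith("@"):      # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:          # not a complete alignment record
            continue
        flag = int(fields[1])
        if flag & 0x4:                # read is unmapped
            continue
        mapped = "-" if flag & 0x10 else "+"
        xs = "none"
        for tag in fields[11:]:       # optional fields start at column 12
            if tag.startswith("XS:A:"):
                xs = tag[5:]
                break
        tally[(mapped, xs)] += 1
    return tally
```

If nearly all counts fall on `("+", "+")` and `("-", "-")`, the tags follow the mapped strand as described above; counts concentrated on the mixed pairs would point the other way.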


My reads are mapping to the wrong strand?
