  • kmcarr
    Senior Member
    • May 2008
    • 1181

    #31
    Originally posted by billstevens View Post
    I'm sorry, I just read this entire thread, and I still have no idea if my data is strand-specific or not. The core I send it to uses the old TruSeq RNA kit, http://epigenome.usc.edu/docs/resour...15008136_A.pdf
    Bill,

    The TruSeq RNA protocol is NOT directional. The appropriate --library-type option for these libraries is fr-unstranded (which is the default for TopHat).


    • billstevens
      Senior Member
      • Mar 2012
      • 120

      #32
      Oh wow, thank you! So that means I needed to use the --stranded=no option using HTSeq? Correct?


      • kmcarr
        Senior Member
        • May 2008
        • 1181

        #33
        Originally posted by billstevens View Post
        Oh wow, thank you! So that means I needed to use the --stranded=no option using HTSeq? Correct?
        That is correct.


        • carmeyeii
          Senior Member
          • Mar 2011
          • 137

          #34
          Hello!

          I am analyzing a dataset which, from the Methods section, appears to be directional:

          RNA-seq libraries were constructed using Illumina (San Diego, CA) mRNA sequencing kits. Total RNA was subjected to two rounds of oligo-dT purification and then chemically fragmented to approximately 200 bases. Fragmented RNA was used for first-strand cDNA synthesis using random primers and SuperScript II. The second strand was then synthesized using RNaseH and DNA Pol I.
          It was generated on the GAIIx.

          After reading this post, I think it is safe to say that the reads on the fastq file correspond to the mRNA molecules that originated them, and therefore, to the coding strand of gene X in the genome as well. Is this correct?

          After reviewing the library options for TopHat and Cufflinks, would you agree that the appropriate option for TopHat would be --library-type fr-secondstrand? And that in Cufflinks I should also indicate that this is a second-strand library?

          Thanks very much!

          Carmen


          • amitm
            Member
            • Feb 2011
            • 52

            #35
            Originally posted by carmeyeii View Post
            After reading this post, I think it is safe to say that the reads on the fastq file correspond to the mRNA molecules that originated them, and therefore, to the coding strand of gene X in the genome as well. Is this correct?

            After reviewing the library options for TopHat and Cufflinks, would you agree that the appropriate option for TopHat would be --library-type fr-secondstrand? And that in Cufflinks I should also indicate that this is a second-strand library?
            Hello Carmen,
            You are right about the first part.
            However, the Methods excerpt you posted does not appear to describe strand-specific RNA sequencing. When a library is strand-specific, the Methods normally name the protocol used to generate it, for example the dUTP method or the Illumina strand-specific protocol.
            All such protocols are compared here: http://www.nature.com/nmeth/journal/...meth.1491.html

            Check whether any of them is mentioned in the Methods or the Supplementary Information.


            • carmeyeii
              Senior Member
              • Mar 2011
              • 137

              #36
              Thank you amitm.

              After re-reading it is now clear that they did not use any strand-specific protocol.

              Thanks for your help!

              Carmen


              • NicoBxl
                not just another member
                • Aug 2010
                • 264

                #37
                A little bump for this interesting thread.

                I have a problem with my strand-specific data and htseq-count. I aligned my data (2x50 bp, dUTP method) with STAR. After that I counted the reads with htseq-count:

                htseq-count -s yes data.sam gtf.gtf > htseq.txt

                But I get a read count of only 9 for the gene shown in the figure, and many other genes also have very low counts.

                With -s no, the read count seems ok.

                Here is a read (and its mate) in SAM format that aligns to this gene (cf. figure below):

                Code:
                HWI-ST1172:65:C0RN7ACXX:1:2316:4226:51105	99	chr15	44109457	255	51M	=	44109544	136	TGTAAACGCCGTAGCCGGGGGTCACTGGATGAATCCTCCTCCTGTTCCTCA	CCBFFFFFGHHHHJIJJJJJJ@EIIIJHGFHGFFFFDEEDEEEDCCDDDDD	NH:i:1	HI:i:1	AS:i:98	nM:i:0
                HWI-ST1172:65:C0RN7ACXX:1:2316:4226:51105	147	chr15	44109544	255	49M2S	=	44109457	-136	TGAAATTCTTCATCCTCCTCATCTGAGGACTCCATAGGGGCATAGTCTGCN	EJJJJJIJIJJIJJIIJJJJJJJJJJIGDIIJJJJJJJHHGHHFFDD=4+#	NH:i:1	HI:i:1	AS:i:98	nM:i:0
                So do I have to use -s reverse? I don't understand why: in the GTF file the gene is annotated on the minus strand, and my reads also align to the minus strand. I must be missing something.

                Thanks

                N.


                • Simon Anders
                  Senior Member
                  • Feb 2010
                  • 995

                  #38
                  Well, let's parse your flag fields.

                  99d = 63h = 0110.0011b means: 1st mate, aligned to plus strand
                  147d = 93h = 1001.0011b means: 2nd mate, aligned to minus strand

                  There you have it. The first mate aligns to the strand opposite to the gene, so you need --stranded=reverse.
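
The flag parsing above can be reproduced with a few bit tests. This is only a minimal sketch: the bit values are those of the SAM FLAG field, while the names `FLAG_BITS`, `decode_flag` and `describe` are made up for this illustration and do not come from HTSeq or any other tool.

```python
# SAM FLAG bits (values from the SAM format; the labels are informal).
FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the labels of all bits that are set in a SAM FLAG value."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


def describe(flag):
    """Summarise which mate a read is and which strand it aligned to."""
    if flag & 0x40:
        mate = "1st mate"
    elif flag & 0x80:
        mate = "2nd mate"
    else:
        mate = "unpaired"
    strand = "minus" if flag & 0x10 else "plus"
    return f"{mate}, aligned to {strand} strand"


if __name__ == "__main__":
    # The two flags from the SAM records posted above.
    for flag in (99, 147):
        print(flag, hex(flag), describe(flag), decode_flag(flag))
```

For flag 99 the reverse-strand bit (0x10) is not set and the first-in-pair bit (0x40) is, which is why the first mate is on the plus strand while the gene is annotated on the minus strand.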


                  • NicoBxl
                    not just another member
                    • Aug 2010
                    • 264

                    #39
                    But why is IGV coloring my reads red (color alignments by > first-of-pair strand)?


                    • Simon Anders
                      Senior Member
                      • Feb 2010
                      • 995

                      #40
                      How do you know that red means plus and blue means minus? Maybe it's the other way round.

                      Also note that "first-of-pair" probably means that the second read is also coloured according to the orientation of the first mate.
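
If that reading of "first-of-pair strand" is right, the value used for colouring could be derived from either mate's FLAG roughly as follows. This is a sketch under that assumption, not a description of how IGV is implemented; `first_of_pair_strand` is a hypothetical helper name.

```python
def first_of_pair_strand(flag):
    """Return '+' or '-' for the strand of the first mate, given either mate's SAM FLAG.

    Assumes both mates are mapped. For the first mate (or an unpaired read)
    this is the read's own strand (bit 0x10); for the second mate it is the
    strand of its mate (bit 0x20).
    """
    paired = bool(flag & 0x1)
    is_second = bool(flag & 0x80)
    if paired and is_second:
        return "-" if flag & 0x20 else "+"
    return "-" if flag & 0x10 else "+"


if __name__ == "__main__":
    # Both mates of the pair posted above report the same first-of-pair strand.
    for flag in (99, 147):
        print(flag, first_of_pair_strand(flag))
```

Under this assumption both reads of the pair (flags 99 and 147) get the first mate's orientation, plus, so they would share one colour even though the second mate itself aligns to the minus strand.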


                      • carmeyeii
                        Senior Member
                        • Mar 2011
                        • 137

                        #41
                        Originally posted by flobpf View Post

                        UPDATE June 10, 2011: I contacted Illumina and they were confused too. However, finally they (and people at TopHat and our sequencing center) resolved the issue. The reads that come out of the machine have the same sequence as the CODING strand of the DNA and not the template strand.
                        Is this true?

                        If that is the case, then

                        a) the illustration above should portray the complementary strand to the RNA being used as template to synthesize (and read) the actual mRNA molecule by the sequencer.

                        or

                        b) The chemistry is as portrayed by the illustration, but the sequencer translates the base fluorescing to the complementary base, in order to output a read that is the actual sequence of the mRNA molecule?



                        C


                        • kmcarr
                          Senior Member
                          • May 2008
                          • 1181

                          #42
                          Originally posted by carmeyeii View Post
                          Quote: Originally Posted by flobpf

                          UPDATE June 10, 2011: I contacted Illumina and they were confused too. However, finally they (and people at TopHat and our sequencing center) resolved the issue. The reads that come out of the machine have the same sequence as the CODING strand of the DNA and not the template strand.
                          Is this true?

                          If that is the case, then

                          a) the illustration above should portray the complementary strand to the RNA being used as template to synthesize (and read) the actual mRNA molecule by the sequencer.

                          or

                          b) The chemistry is as portrayed by the illustration, but the sequencer translates the base fluorescing to the complementary base, in order to output a read that is the actual sequence of the mRNA molecule?



                          C
                          That information is from two years ago (which is a couple of epochs in NGS time). At that time (I believe) Illumina used a ligation-based protocol for directional RNA-Seq. For that library prep protocol the correct --library-type option is fr-secondstrand.

                          Now the Illumina TruSeq stranded RNA-Seq kits use a dUTP second strand marking protocol (like ScriptSeq). The correct option for this protocol is fr-firststrand.

                          You need to make certain you know what library prep protocol was used before trying to interpret strandedness.
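
The protocol-to-option pairings discussed in this thread can be collected in a small lookup table. This is a sketch only: the protocol keys and the `strand_options` helper are invented for illustration, and the htseq-count value for the ligation-based protocol (`yes`) is not stated in this thread; it is filled in on the assumption that it is the counterpart of fr-secondstrand.

```python
# protocol label (hypothetical) -> (TopHat --library-type, htseq-count --stranded)
STRAND_SETTINGS = {
    "unstranded": ("fr-unstranded", "no"),     # e.g. the old TruSeq RNA kit
    "dutp": ("fr-firststrand", "reverse"),     # e.g. TruSeq stranded kits
    "ligation": ("fr-secondstrand", "yes"),    # htseq value assumed, see lead-in
}


def strand_options(protocol):
    """Return the TopHat and htseq-count strand options for a protocol label."""
    key = protocol.strip().lower()
    if key not in STRAND_SETTINGS:
        known = ", ".join(sorted(STRAND_SETTINGS))
        raise ValueError(f"unknown protocol {protocol!r}; expected one of: {known}")
    library_type, stranded = STRAND_SETTINGS[key]
    return {
        "tophat": f"--library-type {library_type}",
        "htseq_count": f"--stranded={stranded}",
    }


if __name__ == "__main__":
    for name in STRAND_SETTINGS:
        print(name, strand_options(name))
```

An unknown label raises an error rather than falling back to a default, since guessing the strandedness is exactly what goes wrong in practice.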


                          • carmeyeii
                            Senior Member
                            • Mar 2011
                            • 137

                            #43
                            C

                            Originally posted by kmcarr View Post
                            Now the Illumina TruSeq stranded RNA-Seq kits use a dUTP second strand marking protocol (like ScriptSeq).
                            kmcarr, I thought ScriptSeq's directional protocol was actually tag-based?
                            I don't think it involves dUTP, does it? At least ScriptSeq v2 doesn't.

                            Best,
                            C


                            • kmcarr
                              Senior Member
                              • May 2008
                              • 1181

                              #44
                              Originally posted by carmeyeii View Post
                              kmcarr, I thought ScriptSeq's directional protocol was actually tag-based?
                              I don't think it involves dUTP, does it? At least ScriptSeq v2 doesn't.

                              Best,
                              C
                              You're correct; sorry, brain lock on my part.


                              • carmeyeii
                                Senior Member
                                • Mar 2011
                                • 137

                                #45
                                No prob.
                                OK, so just to make sure I'm on the right track: the sequence that is "spit out" by the sequencer is the actual sequence as seen by the camera, i.e., no base translation, just raw fluorescence -> letter.

                                So if the first read of a pair is the "first [cDNA] strand", THAT one was the strand synthesized during the first round of sequencing, using as template the strand corresponding to the RNA molecule sequence and direction.

                                C
