
  • #31
    Originally posted by billstevens View Post
    I'm sorry, I just read this entire thread, and I still have no idea if my data is strand-specific or not. The core I send it to uses the old TruSeq RNA kit, http://epigenome.usc.edu/docs/resour...15008136_A.pdf
    Bill,

    The TruSeq RNA protocol is NOT directional. The appropriate --library-type option for these libraries is fr-unstranded (which is the default for TopHat).

    Comment


    • #32
      Oh wow, thank you! So that means I needed to use the --stranded=no option using HTSeq? Correct?

      Comment


      • #33
        Originally posted by billstevens View Post
        Oh wow, thank you! So that means I needed to use the --stranded=no option using HTSeq? Correct?
        That is correct.

        Comment


        • #34
          Hello!

          I am analyzing a dataset which, from the Methods section, appears to be directional:

          RNA-seq libraries were constructed using Illumina (San Diego, CA) mRNA sequencing kits. Total RNA was subjected to two rounds of oligo-dT purification and then chemically fragmented to approximately 200 bases. Fragmented RNA was used for first-strand cDNA synthesis using random primers and SuperScript II. The second strand was then synthesized using RNaseH and DNA Pol I.
          It was generated on the GAIIx.

          After reading this post, I think it is safe to say that the reads on the fastq file correspond to the mRNA molecules that originated them, and therefore, to the coding strand of gene X in the genome as well. Is this correct?

          After reviewing the library options for TopHat and Cufflinks, would you agree that the appropriate option for TopHat would be --library-type fr-secondstrand? And that in Cufflinks I should also indicate that this is an fr-secondstrand library?

          Thanks very much!

          Carmen

          Comment


          • #35
            Originally posted by carmeyeii View Post
            After reading this post, I think it is safe to say that the reads on the fastq file correspond to the mRNA molecules that originated them, and therefore, to the coding strand of gene X in the genome as well. Is this correct?

            After reviewing the library options for TopHat and Cufflinks, would you agree that the appropriate option for TopHat would be --library-type fr-secondstrand? And that in Cufflinks I should also indicate that this is an fr-secondstrand library?
            Hello Carmen,
            You are right about the first part.
            However, the Methods excerpt you posted does not appear to describe a strand-specific RNA-seq library. When a library is strand-specific, the Methods normally name the protocol used to generate it, for example the dUTP method or the Illumina strand-specific protocol.
            The available protocols are compared here: http://www.nature.com/nmeth/journal/...meth.1491.html

            Check whether any of them is mentioned in the Methods or the Supplementary Information.

            Comment


            • #36
              Thank you amitm.

              After re-reading it is now clear that they did not use any strand-specific protocol.

              Thanks for your help!

              Carmen

              Comment


              • #37
                Bumping this interesting thread.

                I have a problem with my strand-specific data and htseq-count. I aligned my data (2x50 bp, dUTP method) with STAR and then counted the reads with htseq-count:

                htseq-count -s yes data.sam gtf.gtf > htseq.txt

                But I get a read count of only 9 for the gene shown in the figure, and many other genes also have very low counts.

                With -s no, the read counts seem OK.

                Here is a read (and its mate) in SAM format that aligns to this gene (cf. figure below):

                Code:
                HWI-ST1172:65:C0RN7ACXX:1:2316:4226:51105	99	chr15	44109457	255	51M	=	44109544	136	TGTAAACGCCGTAGCCGGGGGTCACTGGATGAATCCTCCTCCTGTTCCTCA	CCBFFFFFGHHHHJIJJJJJJ@EIIIJHGFHGFFFFDEEDEEEDCCDDDDD	NH:i:1	HI:i:1	AS:i:98	nM:i:0
                HWI-ST1172:65:C0RN7ACXX:1:2316:4226:51105	147	chr15	44109544	255	49M2S	=	44109457	-136	TGAAATTCTTCATCCTCCTCATCTGAGGACTCCATAGGGGCATAGTCTGCN	EJJJJJIJIJJIJJIIJJJJJJJJJJIGDIIJJJJJJJHHGHHFFDD=4+#	NH:i:1	HI:i:1	AS:i:98	nM:i:0
                So do I have to use -s reverse? I don't understand why: in the GTF file the gene is annotated on the minus strand, and my reads are also aligning to the minus strand. I must be missing something.

                Thanks

                N.

                Comment


                • #38
                  Well, let's parse your flag fields.

                  99d = 63h = 0110.0011b means: 1st mate, aligned to plus strand
                  147d = 93h = 1001.0011b means: 2nd mate, aligned to minus strand

                  There you have it. The first mate aligns to the strand opposite to the gene, so you need --stranded=reverse.
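
For anyone who wants to repeat this check on their own reads, here is a minimal Python sketch of the same flag parsing. The bit values are the standard SAM FLAG bits; the helper names (`describe_flag`, `htseq_stranded_setting`) are made up for illustration and are not part of HTSeq or any other tool. It assumes paired-end reads and the htseq-count convention that, with --stranded=yes, the first mate lies on the gene's strand and the second mate on the opposite strand.

```python
# Standard SAM FLAG bits (from the SAM specification).
PAIRED = 0x1
PROPER_PAIR = 0x2
READ_REVERSE = 0x10
FIRST_IN_PAIR = 0x40
SECOND_IN_PAIR = 0x80


def describe_flag(flag: int) -> dict:
    """Decode the parts of a SAM flag that matter for strandedness."""
    if flag & FIRST_IN_PAIR:
        mate = 1
    elif flag & SECOND_IN_PAIR:
        mate = 2
    else:
        mate = None
    return {
        "paired": bool(flag & PAIRED),
        "proper_pair": bool(flag & PROPER_PAIR),
        "mate": mate,
        "strand": "-" if flag & READ_REVERSE else "+",
    }


def htseq_stranded_setting(flag: int, gene_strand: str) -> str:
    """Hypothetical helper: the --stranded value under which this read
    would be counted for a gene annotated on gene_strand ('+' or '-')."""
    info = describe_flag(flag)
    strand = info["strand"]
    # The second mate is sequenced from the other end of the fragment,
    # so flip its strand to express it in terms of the first mate.
    if info["mate"] == 2:
        strand = "+" if strand == "-" else "-"
    return "yes" if strand == gene_strand else "reverse"


# The two flags from the read pair above; the gene is on the minus strand.
for flag in (99, 147):
    info = describe_flag(flag)
    print(flag, info["mate"], info["strand"], htseq_stranded_setting(flag, "-"))
# prints:
# 99 1 + reverse
# 147 2 - reverse
```

Both mates of the pair agree on "reverse", which matches the conclusion above.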

                  Comment


                  • #39
                    But why is IGV coloring my reads red (color alignments by > first-of-pair strand)?

                    Comment


                    • #40
                      How do you know that red means plus and blue means minus? Maybe it's the other way round.

                      Also note that "first-of-pair" probably means that also the second read is coloured according to the orientation of the first mate.
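
A rough sketch of what "first-of-pair strand" would mean if that reading is right: each read, including the second mate, is labelled with the strand of the first mate. This is only an interpretation written out in Python using the standard SAM FLAG bits, not IGV's actual code, and the function name is invented for the example.

```python
# Standard SAM FLAG bits (from the SAM specification).
READ_REVERSE = 0x10
MATE_REVERSE = 0x20
FIRST_IN_PAIR = 0x40


def first_of_pair_strand(flag: int) -> str:
    """Return the strand of the first mate, whichever mate this record is."""
    if flag & FIRST_IN_PAIR:
        # This record is the first mate: use its own strand bit.
        reverse = flag & READ_REVERSE
    else:
        # This record is the second mate: the first mate is its mate.
        reverse = flag & MATE_REVERSE
    return "-" if reverse else "+"


# Flags of the read pair posted earlier in the thread.
print(first_of_pair_strand(99))   # prints "+"
print(first_of_pair_strand(147))  # prints "+"
```

Under this interpretation both records of that pair carry the same label, so they would get the same colour even though the second mate itself aligns to the minus strand.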

                      Comment


                      • #41
                        Originally posted by flobpf View Post

                        UPDATE June 10, 2011: I contacted Illumina and they were confused too. However, finally they (and people at TopHat and our sequencing center) resolved the issue. The reads that come out of the machine have the same sequence as the CODING strand of the DNA and not the template strand.
                        Is this true?

                        If that is the case, then

                        a) the illustration above should portray the complementary strand to the RNA being used as template to synthesize (and read) the actual mRNA molecule by the sequencer.

                        or

                        b) The chemistry is as portrayed by the illustration, but the sequencer translates the fluorescing base to the complementary base, in order to output a read that is the actual sequence of the mRNA molecule?



                        C

                        Comment


                        • #42
                          Originally posted by carmeyeii View Post
                          Quote: Originally Posted by flobpf

                          UPDATE June 10, 2011: I contacted Illumina and they were confused too. However, finally they (and people at TopHat and our sequencing center) resolved the issue. The reads that come out of the machine have the same sequence as the CODING strand of the DNA and not the template strand.
                          Is this true?

                          If that is the case, then

                          a) the illustration above should portray the complementary strand to the RNA being used as template to synthesize (and read) the actual mRNA molecule by the sequencer.

                          or

                          b) The chemistry is as portrayed by the illustration, but the sequencer translates the fluorescing base to the complementary base, in order to output a read that is the actual sequence of the mRNA molecule?



                          C
                          That information is from two years ago (which is a couple of epochs in NGS time). At that time (I believe) Illumina used a ligation-based protocol for directional RNA-Seq. For that library prep protocol the correct --library-type option is fr-secondstrand.

                          Now the Illumina TruSeq stranded RNA-Seq kits use a dUTP second strand marking protocol (like ScriptSeq). The correct option for this protocol is fr-firststrand.

                          You need to make certain you know what library prep protocol was used before trying to interpret strandedness.
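
To keep the settings discussed in this thread in one place, here is a small lookup table as a Python sketch. The protocol keys and the names `STRAND_OPTIONS` and `options_for` are invented for the example. The TopHat values come from the posts above; the htseq-count values for the unstranded and dUTP cases also come from this thread, while "yes" for the ligation-based protocol is an assumption based on the usual correspondence between fr-secondstrand and --stranded=yes, so verify it against your own data.

```python
from typing import NamedTuple


class StrandOptions(NamedTuple):
    tophat_library_type: str  # value for TopHat/Cufflinks --library-type
    htseq_stranded: str       # value for htseq-count --stranded


# Keys are illustrative labels, not official protocol identifiers.
STRAND_OPTIONS = {
    # Non-directional prep, e.g. the old TruSeq RNA kit.
    "unstranded": StrandOptions("fr-unstranded", "no"),
    # Ligation-based directional prep; the htseq value is an assumption.
    "ligation": StrandOptions("fr-secondstrand", "yes"),
    # dUTP second-strand marking, e.g. TruSeq stranded kits.
    "dutp": StrandOptions("fr-firststrand", "reverse"),
}


def options_for(protocol: str) -> StrandOptions:
    """Look up the settings for a protocol label, ignoring case and padding."""
    key = protocol.strip().lower()
    if key not in STRAND_OPTIONS:
        raise ValueError(f"unknown library prep protocol: {protocol!r}")
    return STRAND_OPTIONS[key]


opts = options_for("dUTP")
print(f"tophat --library-type {opts.tophat_library_type}")
print(f"htseq-count --stranded={opts.htseq_stranded}")
```

The table is deliberately limited to the three cases mentioned here; any other prep raises an error so it cannot silently fall back to a default.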

                          Comment


                          • #43

                            Originally posted by kmcarr View Post
                            Now the Illumina TruSeq stranded RNA-Seq kits use a dUTP second strand marking protocol (like ScriptSeq).
                            kmcarr, I thought ScriptSeq's directional protocol is actually tag-based?
                            I think it doesn't involve dUTP, does it? At least ScriptSeq v2 doesn't?

                            Best,
                            C

                            Comment


                            • #44
                              Originally posted by carmeyeii View Post
                              kmcarr, I thought ScriptSeq's directional protocol is actually tag-based?
                              I think it doesn't involve dUTP, does it? At least ScriptSeq v2 doesn't?

                              Best,
                              C
                              You're correct; sorry, brain lock on my part.

                              Comment


                              • #45
                                No prob.
                                OK, so just to make sure I'm on the right track: the sequence that is "spit out" by the sequencer is the actual sequence as seen by the camera, i.e., no base translation, just raw fluorescence -> letter.

                                So if the first read of a pair is the "first [cDNA] strand", THAT one was the strand synthesized during the first round of sequencing, using as template the strand corresponding to the RNA molecule sequence and direction.

                                C

                                Comment
