  • RNA-seq read coverage questions

    Hi there,

    I have a question about read coverage in RNA-seq. I am analyzing Illumina paired-end data that we obtained from bacterial mRNA, viewing it in Artemis (alignment with BWA), and I noticed some "funny" coverage profiles. With all the experts on this forum, I am sure one of you will be able to help me.

    case #1

    In this picture we can see that gene B is expressed much more strongly than the others. We can also see that the signal obtained from gene B's mRNA seems to "decay" over the beginning of gene A and the inter-ORF region. How can this phenomenon be explained?



    case #2

    Notice the big bump upstream of gene B. It looks like an unreliable annotation or some cryptic mRNA. What do you think?


    Is there any software or pipeline that takes case #1 into account and can also discover or correct genome annotations?

    Thank you for your comments and answers,

    pasta
    Last edited by pasta; 02-09-2011, 08:27 AM.

  • #2
    What gene model are you using?

    • #3
      John,
      We used YACOP which uses several ORF finders: Critica, Glimmer and Z-curve.

      • #4
        It could be any of a number of things worth investigating: PCR duplicate removal (depending on the number of PCR cycles you ran); 5'/3' bias arising from the method used to create your cDNA library, which can also be introduced when the isolated mRNA is fragmented; an unannotated gene missing from your gene model (maybe try something like RefSeq, or make sure all the transcripts in your gene model are present); or a repetitive region upstream of your gene that caused read-mapping difficulties.

        • #5
          In many bacterial genomes, the genes are quite tightly packed on to the genome.
          Check out http://microbes.ucsc.edu/cgi-bin/hgTracks to see for yourself.

          It is possible that they are genes. You may have to get that sequence (area in question) and tune down the parameters to see if they match a domain using your favorite motif finding software. You might run a blast to see if there's homology to another organism.

          Another possibility is that they are regulatory elements.

          The region may not be unique. Check the bwa flags for the reads for more insight. I guess, in bacteria, two "snp" values might tell you there's a dupe.
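
To make the "check the bwa flags" suggestion concrete, here is a minimal sketch in plain Python. It assumes the alignments were produced with `bwa aln` followed by `samse`/`sampe`, which write the optional tags `XT` (U = unique, R = repeat) and `X0` (number of best hits); other aligners or BWA modes may not emit them. The function names, the region bounds and the example reads are all hypothetical.

```python
def parse_sam_line(line):
    """Split one SAM alignment line into the fields used below."""
    fields = line.rstrip("\n").split("\t")
    tags = {}
    for item in fields[11:]:  # optional tags look like NAME:TYPE:VALUE
        name, _type, value = item.split(":", 2)
        tags[name] = value
    return {
        "name": fields[0],
        "flag": int(fields[1]),
        "pos": int(fields[3]),   # 1-based leftmost mapping position
        "mapq": int(fields[4]),
        "tags": tags,
    }


def is_ambiguous(read, min_mapq=1):
    """True if a mapped read looks like it could come from more than one locus."""
    if read["flag"] & 0x4:  # unmapped reads are not counted as ambiguous
        return False
    if read["mapq"] < min_mapq:
        return True
    if read["tags"].get("XT") == "R":  # assumed bwa aln tag: R = repeat
        return True
    return int(read["tags"].get("X0", "1")) > 1  # more than one best hit


def ambiguous_fraction(sam_lines, start, end):
    """Fraction of mapped reads starting in [start, end] that are ambiguous."""
    total = ambiguous = 0
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        read = parse_sam_line(line)
        if read["flag"] & 0x4 or not (start <= read["pos"] <= end):
            continue
        total += 1
        ambiguous += is_ambiguous(read)
    return ambiguous / total if total else 0.0


# Made-up reads, only to show the call; real input would be lines of a SAM file.
example = [
    "@SQ\tSN:chr\tLN:5000",
    "r1\t0\tchr\t100\t37\t50M\t*\t0\t0\t*\t*\tXT:A:U\tX0:i:1\tX1:i:0",
    "r2\t0\tchr\t120\t0\t50M\t*\t0\t0\t*\t*\tXT:A:R\tX0:i:3",
    "r3\t16\tchr\t150\t0\t50M\t*\t0\t0\t*\t*\tXT:A:R\tX0:i:2",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
]
print(ambiguous_fraction(example, 1, 1000))
```

A high fraction over the "bump" region would point towards a repeat rather than a genuine unannotated transcript, although it does not rule the latter out.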

          Just some thoughts, I'm no bacteria expert.
          Last edited by Richard Finney; 02-09-2011, 09:15 AM.

          • #6
            You might also want to check for rRNA contamination. It's a possibility...

            • #7
              Thank you for these answers, that's very nice of you. I forgot to mention that all rRNA sequences were removed from our analysis.
              For case #2, I blasted the sequence: no homology found; however, I found one nice promoter sequence. FYI, genes A and B are a DNA integration protein and a transposase, respectively. That's very interesting!

              Do you have any explanation for the first case ?

              • #8
                What was your method of cDNA synthesis/library construction? It could be an artifact of these processes.

                --
                Phillip

                • #9
                  Originally posted by pmiguel
                  What was your method of cDNA synthesis/library construction? It could be an artifact of these processes.

                  --
                  Phillip
                  Total RNA was treated twice with MICROBExpress (Ambion) to remove most rRNA.
                  mRNA was fragmented and used to prepare cDNA with hexanucleotide primers, and RNase H was used for the other strand. Illumina adapters were then added before the PCR.
                  Someone told me that the behavior we see in case #1 is rather normal in prokaryotes: transcription does not stop exactly at the end of the ORF, so some mRNAs can be longer. What do you think?

                  Thanks

                  antoine

                  • #10
                    Yes, I would buy that explanation.

                    Prokaryotic messages are said to be rapidly turned over. If this turnover is carried out by exonucleases, that would also cause lower coverage at the 5' and 3' ends in your sequencing results.

                    --
                    Phillip

                    • #11
                      @case 1:
                      I believe it might be due to methodological bias. Some methods preferentially enrich the 5' ends of mRNAs, while others do so for the 3' ends.

                      Your method is not strand-specific, so you cannot tell whether you are seeing a downstream extension of the gene B transcript or actually the gene A transcript. So your "prokaryotic" explanation is also possible, though I wouldn't expect such a long tail (just a feeling, however).

                      @case 2:
                      I'll vote for a repetitive region here. You say gene B is a transposase? Such genes move genomic elements within and between genomes, often integrating at similar sites and carrying additional DNA. While the transposase itself can be a repeat within the genome, I would also expect to find more repetitive sequence in the vicinity.

                      • #12
                        Thank you very much for your explanations, I appreciate it. I am really starting to understand the biology behind the data, if that makes sense.

                        Thanks again !

                        Toni

                        • #13
                          Originally posted by pasta
                          John,
                          We used YACOP which uses several ORF finders: Critica, Glimmer and Z-curve.
                          I want to use Orpheus and Z-curve along with it, but I am unable to find them anywhere on the web. Did you use them? Can you tell me where I can download these two?

                          regards,
                          adnan
                          ~Adnan~

                          • #14
                            Not being a biologist, and never having worked with prokaryotes, I apologize if this question is stupid, but: bacteria don't have UTRs? Not only translation but also transcription starts exactly at the start codon and stops at the stop codon? Otherwise, what is surprising about the transcript reaching beyond the gene boundary, if your gene model comes from an ORF finder?

                            I work a lot with yeast, and there, many genes look like case #1. It seems that the promoter recruits the polymerase to a quite well-defined position where transcription starts, but where it stops (or more precisely, where the poly-A tail is placed) seems to be a region rather than a point, or a collection of several possible places, giving the 3' end this "decaying" appearance. As for case #2: there are so many non-coding transcripts in eukaryotes (and in prokaryotes as well, maybe?) that I would be rather surprised if I did not find transcripts that don't overlap with an ORF.
                            Last edited by Simon Anders; 05-01-2011, 10:56 PM.

                            • #15
                              I know it's an old post, but in case people are still reading it, I wanted to add that bacteria certainly DO have UTRs, so it is normal and expected that transcripts from two convergent genes will overlap. Bacterial terminators are also not always sharp; transcription can end over a range of positions downstream of the stop codon. Gene "B" in the example fits this pattern.
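
For anyone who wants to put a number on how far a transcript runs past the annotated stop codon, a rough sketch in plain Python follows. It assumes you already have the aligned reads as 1-based inclusive (start, end) intervals on the reference and that the gene lies on the forward strand; the function names, the depth threshold and the toy intervals are hypothetical and not taken from this thread's data.

```python
def per_base_coverage(intervals, length):
    """Per-base depth from 1-based inclusive (start, end) alignment intervals."""
    diff = [0] * (length + 2)  # difference array; index 0 is unused
    for start, end in intervals:
        start = max(start, 1)
        end = min(end, length)
        if start > end:
            continue
        diff[start] += 1
        diff[end + 1] -= 1
    depth = [0] * (length + 1)  # depth[pos] is the depth at 1-based position pos
    running = 0
    for pos in range(1, length + 1):
        running += diff[pos]
        depth[pos] = running
    return depth


def downstream_extension(depth, stop_pos, min_depth=1):
    """Count consecutive bases after stop_pos whose depth is >= min_depth."""
    ext = 0
    for pos in range(stop_pos + 1, len(depth)):
        if depth[pos] < min_depth:
            break
        ext += 1
    return ext


# Toy reads on a 30 bp reference, with a stop codon assumed to end at position 15.
reads = [(1, 10), (5, 15), (8, 20), (12, 22)]
depth = per_base_coverage(reads, 30)
print(downstream_extension(depth, 15))               # bases with any coverage
print(downstream_extension(depth, 15, min_depth=2))  # bases with depth >= 2
```

Because the library discussed here was not strand-specific, a long extension measured this way cannot by itself distinguish a gene B read-through from the start of a convergent gene A transcript.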
