  • pasta
    Member
    • Jan 2011
    • 27

    RNA-seq read coverage questions

    Hi there,

    I have a question about read coverage in RNA-seq. I am analyzing Illumina paired-end data that we obtained from bacterial mRNA, viewing it in Artemis (alignment with BWA), and I noticed some "funny" coverage profiles. With all the experts on this forum, I am sure one of you will be able to help me.

    case #1

    In this picture we can see that gene B is expressed far more than the others. We can also see that the signal obtained from gene B's mRNA seems to "decay" over the beginning of gene A and the inter-ORF region. How can this phenomenon be explained?



    case #2

    Notice the big bump upstream of gene B. It looks like an unreliable annotation or a cryptic mRNA. What do you think?


    Is there any software or pipeline that takes case #1 into account and also discovers or corrects known genome annotations?

    Thank you for your comments and answers,

    pasta
    Last edited by pasta; 02-09-2011, 08:27 AM.
  • JohnK
    Senior Member
    • Feb 2010
    • 106

    #2
    what gene model are you using?

    Comment

    • pasta
      Member
      • Jan 2011
      • 27

      #3
      John,
      We used YACOP which uses several ORF finders: Critica, Glimmer and Z-curve.

      Comment

      • JohnK
        Senior Member
        • Feb 2010
        • 106

        #4
        It could be a number of things worth investigating: PCR duplicate removal (depending on the number of PCR cycles you ran); 5'/3' bias from the method used to create your cDNA library, which can also arise during fragmentation of your isolated mRNA; an unannotated gene in your gene model (maybe try something like RefSeq, or make sure all the transcripts in your gene model are present); or a repetitive region upstream of your gene that caused read-mapping difficulties.

        Comment

        • Richard Finney
          Senior Member
          • Feb 2009
          • 701

          #5
          In many bacterial genomes, the genes are quite tightly packed on to the genome.
          Check out http://microbes.ucsc.edu/cgi-bin/hgTracks to see for yourself.

          It is possible that they are genes. You may have to get that sequence (area in question) and tune down the parameters to see if they match a domain using your favorite motif finding software. You might run a blast to see if there's homology to another organism.

          Another possibility is that they are regulatory elements.

          The region may not be unique. Check the bwa flags for the reads for more insight. I guess, in bacteria, two "snp" values might tell you there's a dupe.

          Just some thoughts, I'm no bacteria expert.
          Last edited by Richard Finney; 02-09-2011, 09:15 AM.
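
The suggestion to check the bwa flags can be sketched as follows. This is a minimal sketch, assuming the alignments were produced by `bwa aln` with `samse`/`sampe`, which writes optional SAM tags such as `XT:A:U` (unique), `XT:A:R` (repeat) and `X0:i:<n>` (number of best hits). The function names and the three example records are made up for illustration and are not from the original post.

```python
from collections import Counter

def classify_sam_line(line):
    """Return 'unmapped', 'unique', 'repeat' or 'unknown' for one SAM record."""
    fields = line.rstrip("\n").split("\t")
    flag = int(fields[1])
    if flag & 0x4:                      # SAM flag bit 0x4: read is unmapped
        return "unmapped"
    tags = {}
    for tag in fields[11:]:             # optional fields look like TAG:TYPE:VALUE
        name, _type, value = tag.split(":", 2)
        tags[name] = value
    # Assumption: XT/X0 are the bwa aln tags; other aligners may not emit them.
    if tags.get("XT") == "R" or int(tags.get("X0", "1")) > 1:
        return "repeat"
    if tags.get("XT") == "U":
        return "unique"
    return "unknown"

def count_classes(lines):
    """Tally read classes over SAM text lines, skipping '@' header lines."""
    return Counter(
        classify_sam_line(l) for l in lines if l.strip() and not l.startswith("@")
    )

# Hypothetical records, for illustration only.
example = [
    "@SQ\tSN:chr\tLN:1000",
    "r1\t0\tchr\t100\t37\t4M\t*\t0\t0\tACGT\tIIII\tXT:A:U\tNM:i:0\tX0:i:1\tX1:i:0",
    "r2\t0\tchr\t500\t0\t4M\t*\t0\t0\tACGT\tIIII\tXT:A:R\tNM:i:0\tX0:i:3",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
print(count_classes(example))
```

Restricting the tally to reads that overlap the "bump" region would show whether that coverage comes mostly from reads with several equally good placements, which would point to a repeat.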

          Comment

          • JohnK
            Senior Member
            • Feb 2010
            • 106

            #6
            You might also want to check for rRNA contamination. It's a possibility...

            Comment

            • pasta
              Member
              • Jan 2011
              • 27

              #7
              Thank you for these answers, that's very nice of you. I forgot to mention that all rRNA sequences were removed from our analysis.
              For case #2, I blasted the sequence: no homology found; however, I found one nice promoter sequence. FYI, genes A and B are a DNA integration protein and a transposase, respectively. That's very interesting!

              Do you have any explanation for the first case?

              Comment

              • pmiguel
                Senior Member
                • Aug 2008
                • 2328

                #8
                What was your method of cDNA synthesis/library construction? It could be an artifact of these processes.

                --
                Phillip

                Comment

                • pasta
                  Member
                  • Jan 2011
                  • 27

                  #9
                  Originally posted by pmiguel:
                  "What was your method of cDNA synthesis/library construction? It could be an artifact of these processes."

                  Total RNA was treated twice with MicrobExpress (Ambion) to remove most rRNA.
                  mRNA was fragmented to prepare cDNA with hexanucleotides as primers and RNase H was used on the other strand. Then, Illumina adapters were added before the PCR.
                  Someone told me that the behavior we see in case #1 is rather normal in prokaryotes: transcription does not stop exactly at the end of the ORF, so some mRNAs can be longer. What do you think?

                  Thanks

                  antoine

                  Comment

                  • pmiguel
                    Senior Member
                    • Aug 2008
                    • 2328

                    #10
                    Yes, I would buy that explanation.

                    Prokaryotic messages are said to be rapidly turned over. If this turnover is carried out by exonucleases, that would also lower coverage at the 5' and 3' ends in your sequencing results.

                    --
                    Phillip

                    Comment

                    • nasobema
                      Member
                      • Jul 2010
                      • 14

                      #11
                      @case 1:
                      I believe it might be because of methodological bias. Some methods preferentially enrich 5'-ends of mRNAs while others do so for 3'-ends.

                      Your method is not strand-specific, so you cannot tell whether you are seeing the downstream part of the gene B transcript or actually the gene A transcript. So your "prokaryotic" explanation is also possible, though I wouldn't expect such a long tail (just a feeling, however).

                      @case 2:
                      I'll vote for a repetitive region here. You say gene B is a transposase? Such genes move genomic elements within and between genomes, often integrating at similar sites and carrying additional DNA. While the transposase itself can be a repeat within the genome, I would also expect to find more repetitive sequence in the vicinity.

                      Comment

                      • pasta
                        Member
                        • Jan 2011
                        • 27

                        #12
                        Thank you very much for your explanations, I appreciate it. I am really starting to understand the biology behind the data, if that makes sense.

                        Thanks again !

                        Toni

                        Comment

                        • niazi84@hotmail.com
                          Member
                          • Jan 2010
                          • 25

                          #13
                          Originally posted by pasta:
                          "John,
                          We used YACOP which uses several ORF finders: Critica, Glimmer and Z-curve."

                          I want to use Orpheus and Z-curve along with it, but I am unable to find them anywhere on the web. Did you use them? Can you tell me where I can download these two?

                          regards,
                          adnan
                          ~Adnan~

                          Comment

                          • Simon Anders
                            Senior Member
                            • Feb 2010
                            • 995

                            #14
                            Not being a biologist, and never having worked with prokaryotes, I apologize if this question is stupid, but: bacteria don't have UTRs? Not only translation but also transcription starts exactly at the start codon and stops at the stop codon? Otherwise, what is surprising about the transcript reaching beyond the gene boundary, if your gene model comes from an ORF finder?

                            I work a lot with yeast, and there, many genes look like case #1. It seems that the promoter recruits the polymerase to a quite well-defined position where transcription starts, but where it stops (or more precisely, where the poly-A tail is placed) seems to be rather a region, or a collection of several possible places, giving the 3' end this "decaying" appearance. As for case #2: there are so many non-coding transcripts in eukaryotes (and in prokaryotes as well, maybe?) that I would be rather surprised if I did not find transcripts that don't overlap with an ORF.
                            Last edited by Simon Anders; 05-01-2011, 10:56 PM.

                            Comment

                            • sshell
                              Junior Member
                              • Jan 2009
                              • 6

                              #15
                              I know it's an old post, but in case people are still reading it, I wanted to add that bacteria certainly DO have UTRs, so it is normal and expected that transcription from two convergent genes will overlap. Bacterial terminators are also not always sharp; transcription can end over a range of positions downstream of the stop codon. Gene "B" in the example fits this pattern.
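
To put a number on how far a "decaying" signal runs past an annotated stop codon, here is a minimal sketch. It assumes reads are given as 0-based half-open `(start, end)` intervals on a single reference and that the gene lies on the forward strand. The names `per_base_coverage` and `tail_length` and the 10% cutoff are illustrative choices, not an established method. With a library that is not strand-specific, the tail could equally belong to the neighbouring gene, as noted in post #11.

```python
import statistics

def per_base_coverage(reads, length):
    """Per-base depth from (start, end) 0-based half-open read intervals."""
    diff = [0] * (length + 1)           # difference array: +1 at start, -1 at end
    for start, end in reads:
        start, end = max(start, 0), min(end, length)
        if start < end:
            diff[start] += 1
            diff[end] -= 1
    cov, depth = [], 0
    for d in diff[:length]:
        depth += d                      # running sum gives depth at each base
        cov.append(depth)
    return cov

def tail_length(cov, gene_start, gene_end, fraction=0.1):
    """Bases past gene_end where depth stays >= fraction * median in-gene depth.

    The gene body cov[gene_start:gene_end] must be non-empty.
    """
    threshold = fraction * statistics.median(cov[gene_start:gene_end])
    pos = gene_end
    while pos < len(cov) and cov[pos] > 0 and cov[pos] >= threshold:
        pos += 1
    return pos - gene_end

# Toy data: a gene at [10, 20) with some reads running past its 3' end.
reads = [(10, 20)] * 10 + [(15, 25)] * 5 + [(18, 30)]
cov = per_base_coverage(reads, 40)
print(tail_length(cov, 10, 20))         # tail at the default 10% cutoff
print(tail_length(cov, 10, 20, 0.05))   # a looser cutoff follows the tail further
```

The cutoff is relative to the gene's own median depth so that a highly expressed gene such as gene B is not judged by the same absolute depth as its weakly expressed neighbours.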

                              Comment
