
  • Counts on a merged BAM file of single-end and paired-end reads

    Dear all,
    We sequenced the same sample both paired-end AND single-end. The 2 resulting BAM files were then merged, which causes problems for htseq-count ('pair_alignments' needs a sequence of paired-end alignments).
    A workaround is to run htseq-count separately on the 2 UNmerged BAM files and then merge the result files (simply summing the paired-end and single-end counts).
    Is that approach sound and recommended?
    Best regards.
    PS: HTSeq rocks!

  • #2
    You might also consider just keeping the single-end and paired-end separate and then using that as a blocking factor in your experimental design. Having said that, if the library-type effect is minimal (as indicated by PCA, clustering, etc.), then you might as well go ahead and sum things...but I'd check that the results are similar enough first.
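A quick way to run the "similar enough" check suggested above is to compare the two count tables on a log scale. The sketch below is only a stand-in for PCA or clustering: it computes a plain Pearson correlation of log2 counts-per-million between the paired-end and single-end tables. The function names, the pseudocount, and the toy counts are all illustrative assumptions, not part of HTSeq or of any package mentioned in this thread.

```python
import math


def log_cpm(counts, pseudo=0.5):
    """Convert raw counts to log2 counts-per-million (pseudocount is an assumption)."""
    total = sum(counts.values())
    return {g: math.log2((c + pseudo) / (total + 1.0) * 1e6) for g, c in counts.items()}


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(vx * vy)


def library_similarity(paired, single):
    """Correlate log-CPM values over the genes present in both tables."""
    genes = sorted(set(paired) & set(single))
    lp, ls = log_cpm(paired), log_cpm(single)
    return pearson([lp[g] for g in genes], [ls[g] for g in genes])


# Toy counts, made up for illustration only.
paired = {"geneA": 900, "geneB": 90, "geneC": 10}
single = {"geneA": 95, "geneB": 8, "geneC": 2}
r = library_similarity(paired, single)
print(f"Pearson r on log-CPM: {r:.3f}")
```

A high correlation does not prove the absence of a library-type effect; with several samples, a proper PCA or clustering of all libraries remains the more informative check.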


    • #3
      Right.
      I need to update my question: the experiment was actually run paired-end only, BUT the subsequent read quality filtering and trimming steps removed one mate from some pairs (approx. 10% of pairs became single/orphan reads).
      So the mixture of single and paired reads doesn't come from the biochemistry, but rather from QC filtering.
      In that case it should be safe to merge the paired and single reads back together for htseq-count, shouldn't it? Qualitatively speaking, at least.
      About the counting: on one hand, 1 mapped single read gives one count; on the other hand, a read pair is also counted only once. Don't we overweight the single-end reads by simply summing single-end + paired-end counts?


      • #4
        A single read and a pair both describe the position of the fragment that was sequenced. In both cases, you can consider that it's actually the fragment that's getting counted, so then nothing is being given undue weight. The only real objection to that is that single-end reads don't give you the full bounds, so there are cases where they'll lead to slightly inflated counts (e.g., when the other end of the fragment actually overlaps a different feature, but you have no way of knowing this), but the effect of that is likely quite small (again, you could judge this by clustering things).
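The inflated-count case described above can be shown with a toy example. The sketch below is loosely modeled on the idea behind htseq-count's "union" overlap mode (one overlapped gene gives a count, several give an ambiguous call, none gives no feature); it is not HTSeq's actual implementation, and the gene coordinates, read coordinates, and function names are invented for illustration.

```python
def overlapping_genes(intervals, genes):
    """Return the names of all genes overlapped by any (start, end) interval."""
    hits = set()
    for start, end in intervals:
        for name, (g_start, g_end) in genes.items():
            if start < g_end and g_start < end:  # half-open interval overlap
                hits.add(name)
    return hits


def assign(intervals, genes):
    """Assign a fragment to a gene, or to a special category if not unique."""
    hits = overlapping_genes(intervals, genes)
    if len(hits) == 1:
        return next(iter(hits))
    return "__ambiguous" if hits else "__no_feature"


# Hypothetical annotation and one sequenced fragment with two mates.
genes = {"geneA": (100, 300), "geneB": (350, 600)}
mate1 = (200, 280)  # lies inside geneA
mate2 = (360, 440)  # lies inside geneB

pair_call = assign([mate1, mate2], genes)  # both mates known
orphan_call = assign([mate1], genes)       # mate2 lost during QC

print(pair_call)    # __ambiguous
print(orphan_call)  # geneA
```

With both mates, the fragment is flagged as ambiguous and not counted for either gene; with only the surviving mate, it is counted for geneA. That is the small upward bias mentioned in the post.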


        • #5
          Great answer!
          I agree with that and I'll go on with the proposed strategy, which is the following:
          1. QC of FASTQ files, trimming, etc.
          2. Alignment of single reads, alignment of paired reads
          3. htseq-count on single reads, htseq-count on paired reads
          4. Sum of counts for each gene: single reads + paired reads
          5. Happy edgeR or DESeq or whatever...

          Thank you so much for your help!!
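Step 4 of the strategy above (summing the two htseq-count outputs per gene) could look roughly like the sketch below. It assumes the default two-column, tab-separated htseq-count output (feature ID, count) with no header; the function names and the inline example data are hypothetical. The special counters at the end of the output (lines starting with `__`, such as `__no_feature`) are summed like any other row here and would normally be dropped before differential expression analysis.

```python
import csv
import io


def read_counts(handle):
    """Parse a two-column, tab-separated count table into {feature: count}."""
    counts = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        counts[row[0]] = int(row[1])
    return counts


def sum_counts(paired, single):
    """Sum counts feature by feature; a feature missing from one table counts as 0."""
    merged = {}
    for table in (paired, single):
        for feature, n in table.items():
            merged[feature] = merged.get(feature, 0) + n
    return merged


# Inline toy data standing in for the two htseq-count output files.
paired_txt = "geneA\t10\ngeneB\t0\n__no_feature\t3\n"
single_txt = "geneA\t5\ngeneB\t2\n__no_feature\t1\n"

merged = sum_counts(read_counts(io.StringIO(paired_txt)),
                    read_counts(io.StringIO(single_txt)))

for feature, n in merged.items():
    print(f"{feature}\t{n}")
```

With real files, `read_counts` would be given an open file handle per sample and library type, for example `open(path, newline="")`, instead of the `io.StringIO` objects used here.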
