  • analysis of single and PE reads with htseq-count and DESeq

    Hello,

    I've analyzed some single-end and paired-end reads that were aligned to the human genome using TopHat and then run through the htseq-count and DESeq pipeline. A clustering analysis of the variance-stabilized count data (thank you Simon!) suggests that the differences between library types (single vs paired-end) are much bigger than the very significant differences between biological conditions.

    Why might the count data between single and paired-end library types be so different?

    Thanks,
    Danielle
    Last edited by dglemay; 04-26-2012, 10:23 PM. Reason: grammatical error

  • #2
    If you want to get to the bottom of this, you could, for example, redo the alignment and counting of your paired-end data using only the fastq files from the first read (the first end). After all, library prep is the same for single-end and paired-end, and hence, if you simply throw away the second end, the difference should vanish. If it does, the whole issue is an artifact of the alignment or counting algorithms having subtly different biases when working with paired-end data. If it does not, it might be some other batch effect that merely happens to be confounded with library type.
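One rough way to judge whether the difference "vanishes" after such a rerun is to compare count tables directly. The sketch below is only an illustration, not part of htseq-count or DESeq: it assumes the usual two-column, tab-separated htseq-count output, and the helper names `read_counts` and `log_correlation` are hypothetical. It parses two tables and computes the Pearson correlation of log2(count + 1) over shared genes.

```python
import math

# Summary rows that htseq-count appends after the gene rows. Newer releases
# prefix them with "__"; older ones do not. Handling both is an assumption
# made for this sketch.
SPECIAL = {"no_feature", "ambiguous", "too_low_aQual",
           "not_aligned", "alignment_not_unique"}


def read_counts(text):
    """Parse 'gene<TAB>count' lines into a dict, skipping summary rows."""
    counts = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        gene, value = line.split("\t")
        if gene.startswith("__") or gene in SPECIAL:
            continue
        counts[gene] = int(value)
    return counts


def log_correlation(a, b):
    """Pearson correlation of log2(count + 1) over genes present in both tables.

    Raises ZeroDivisionError if either table is constant across the shared genes.
    """
    genes = sorted(set(a) & set(b))
    x = [math.log2(a[g] + 1) for g in genes]
    y = [math.log2(b[g] + 1) for g in genes]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)
```

If the first-end-only counts of a paired-end sample correlate with the single-end samples about as well as the single-end samples correlate with each other, that would point toward the alignment or counting step rather than a separate batch effect. A clustering of the transformed counts, as in the original analysis, remains the more informative check.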

    Also note the example in the DESeq vignette that shows how to account for such differences by introducing a blocking factor for library type. This only works if your library type is not confounded with your treatment, of course.
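Whether library type is confounded with treatment can be checked from the sample sheet before fitting anything. The following is a minimal sketch under stated assumptions: samples are given as hypothetical `(library_type, condition)` pairs, and `is_confounded` is an illustrative helper, not a DESeq function. It treats library types and conditions as nodes of a graph, links each pair that occurs together in a sample, and reports confounding when the graph falls apart into separate pieces, which is when an additive model with both factors cannot separate their effects.

```python
from collections import defaultdict


def is_confounded(samples):
    """Return True if library type and condition effects cannot be separated.

    samples: iterable of (library_type, condition) pairs, one per sample.
    """
    # Build an undirected graph linking each library type to every
    # condition it was observed with.
    adj = defaultdict(set)
    for lib, cond in samples:
        adj[("lib", lib)].add(("cond", cond))
        adj[("cond", cond)].add(("lib", lib))
    if not adj:
        raise ValueError("no samples given")

    # Depth-first search from an arbitrary node.
    start = next(iter(adj))
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbour in adj[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)

    # If some nodes were never reached, the design is disconnected.
    return len(seen) < len(adj)
```

For example, a design where all single-end samples are controls and all paired-end samples are treated is flagged, whereas a design where at least one library type covers both conditions is not.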
