  • What should I do with multi-mapped reads?

    I am new to ChIP-seq data analysis, and I have a question.
    Some reads have two or more valid alignments to the reference genome. I see that some authors discard these non-uniquely mapped reads and keep only the uniquely mapped ones.

    Do other strategies exist, and why would one use them? Are there papers that specifically discuss this?

    If you know of relevant literature or other resources, please tell me. Thanks.
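
For readers following along, the "keep only uniquely mapped reads" strategy described above can be sketched roughly as follows. This is a minimal illustration, not any particular aligner's behaviour: the `(read_id, chrom, pos)` tuple format and the function name `split_by_uniqueness` are assumptions made up for this example.

```python
from collections import defaultdict


def split_by_uniqueness(alignments):
    """Separate uniquely mapped reads from multi-mapped reads.

    `alignments` is an iterable of (read_id, chrom, pos) tuples, one tuple
    per reported alignment. This toy format is an assumption for
    illustration only.
    """
    hits = defaultdict(list)
    for read_id, chrom, pos in alignments:
        hits[read_id].append((chrom, pos))
    # A read is "unique" if exactly one alignment was reported for it.
    unique = {r: locs[0] for r, locs in hits.items() if len(locs) == 1}
    multi = {r: locs for r, locs in hits.items() if len(locs) > 1}
    return unique, multi


# Hypothetical toy data: read2 aligns equally well to two loci.
alignments = [
    ("read1", "chr1", 1000),
    ("read2", "chr1", 5000),
    ("read2", "chr7", 42000),
    ("read3", "chr2", 300),
]
unique, multi = split_by_uniqueness(alignments)
unique_fraction = len(unique) / (len(unique) + len(multi))
```

Discarding `multi` and continuing with `unique` is the conservative choice; `unique_fraction` shows how much data that choice costs.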

  • #2
    I have worked, and am still working, with RNA-Seq data, but I guess the rationale is the same. If it is totally irrelevant, please forgive me!

    If a read has more than one alignment to the genome, its origin is ambiguous. The problem boils down to this: how do you decide where to place the read? This can be alleviated somewhat with paired-end sequencing (~ChIP-PET?): using the inner distance (and the selected fragment size), you should be able to assign a single position to the ambiguous read, provided its mate is uniquely mapped.
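
A rough sketch of the mate-based rescue idea, under simplifying assumptions: positions are plain integers, strand orientation is ignored, and the function name `rescue_with_mate` and the `max_fragment` default of 500 bp are hypothetical choices for illustration.

```python
def rescue_with_mate(candidates, mate_chrom, mate_pos, max_fragment=500):
    """Pick the candidate alignment consistent with a uniquely mapped mate.

    `candidates` is a list of (chrom, pos) tuples for the ambiguous read.
    A candidate is "consistent" if it lies on the mate's chromosome and
    within `max_fragment` bases of the mate. Returns the candidate if
    exactly one is consistent, otherwise None (still ambiguous).
    """
    consistent = [
        (chrom, pos)
        for chrom, pos in candidates
        if chrom == mate_chrom and abs(pos - mate_pos) <= max_fragment
    ]
    return consistent[0] if len(consistent) == 1 else None


# Hypothetical example: the mate maps uniquely to chr1:5200.
candidates = [("chr1", 5000), ("chr7", 42000)]
rescued = rescue_with_mate(candidates, "chr1", 5200)

# If two candidates fall near the mate (e.g. a tandem repeat), the read
# stays ambiguous and None is returned.
still_ambiguous = rescue_with_mate([("chr1", 5000), ("chr1", 5300)], "chr1", 5200)
```

Returning `None` instead of guessing keeps the rescue conservative: a read is only placed when the mate leaves exactly one plausible location.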


    I also think the effect depends on the application. Suppose you are studying allele-specific expression: if a read maps to multiple locations in the genome and you arbitrarily assign it to one, you may be providing "false" evidence about the expression of that allele. The same could apply to SNP calling. I don't, however, see the point of removing them for, say, gene expression studies, although there is certainly still ambiguity (and with deep sequencing, the reads with non-unique hits are not really necessary).

    This is just one step in the pipeline, and papers usually mention it only in passing (a line or so in the Methods section).



    • #3
      Thanks very much for your response; it is helpful.
      What do you think about the following strategy?
      When the percentage of uniquely mapped reads is low (e.g. 40-50%), we could randomly select one of the mapped positions for each non-unique read. This would certainly introduce false positives, but the question is how large the effect is. Do you know of any papers or other resources on this?
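
One way to get a feel for the size of the effect is to perform the random assignment with a fixed seed and compare binned read counts with and without the multi-mapped reads. The sketch below assumes toy data; the names `assign_randomly` and `bin_counts`, the 1 kb bin size, and the read dictionaries are all hypothetical.

```python
import random
from collections import Counter


def assign_randomly(multi, seed=0):
    """Assign each multi-mapped read to one of its candidate loci at random.

    `multi` maps read_id -> list of (chrom, pos) candidates. A seeded
    generator makes the assignment reproducible.
    """
    rng = random.Random(seed)
    # Sort by read_id so the result does not depend on dict ordering.
    return {read_id: rng.choice(locs) for read_id, locs in sorted(multi.items())}


def bin_counts(positions, bin_size=1000):
    """Count reads per (chrom, bin index) for fixed-width genomic bins."""
    return Counter((chrom, pos // bin_size) for chrom, pos in positions)


# Hypothetical toy data.
unique = {"read1": ("chr1", 1000), "read3": ("chr2", 300)}
multi = {"read2": [("chr1", 5000), ("chr7", 42000)]}

assigned = assign_randomly(multi, seed=1)
counts_unique = bin_counts(unique.values())
counts_all = bin_counts(list(unique.values()) + list(assigned.values()))

# Bins whose counts change when randomly placed reads are added.
changed_bins = {b for b in counts_all if counts_all[b] != counts_unique.get(b, 0)}
```

Repeating the assignment with several seeds and checking how often downstream results (for example, called peaks) change would give a rough, data-specific estimate of how much noise the random placement adds.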
      Last edited by Triple_W; 01-11-2012, 06:56 PM.
