  • Determine paired end overlapping

    I have a paired-end Illumina exome data set in which the two reads of a pair might overlap at the ends. My fragment size is 105 bp. I aligned my samples with bwa and generated my BAM files with samtools.
    My questions are:

    Is there any way to determine how many read pairs overlap?

    How can I determine the distance between the paired ends (in case they didn't overlap), or how much they overlapped?

    If this percentage is high, I am thinking about reanalyzing my data by generating fragments of 75 or 50 bp. Do you think that's correct? What percentage would be a reasonable cut-off for considering it high?

    Thanks
    Last edited by chariko; 04-13-2011, 08:29 AM.

  • #2
    In the SAM/BAM file, the TLEN column (column 9) gives the template length. If its absolute value is smaller than 2*105 bp, the ends overlap (assuming there are no indels).

    Try this to see the template length distribution of the first million reads: samtools view PEalignment.bam | head -n 1000000 | cut -f 9 | sort -n | uniq -c

    Why would you want to generate fragments?
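
The TLEN rule above can be turned into a per-pair estimate. The following is a minimal sketch, not the poster's method: it assumes both mates are 105 bp, ignores indels and soft clipping, and reads plain SAM text such as the output of samtools view. The function names `pair_overlap` and `summarize` and the demo records are made up for illustration.

```python
READ_LEN = 105  # assumption: both mates are 105 bp, as in the thread


def pair_overlap(tlen, read_len=READ_LEN):
    """Bases shared by the two mates (positive) or gap between them (negative).

    Ignores indels and soft clipping, so it is only an estimate.
    """
    t = abs(tlen)  # TLEN is signed: positive for the leftmost mate, negative for the rightmost
    # The overlap cannot exceed the template itself when the template is shorter than a read.
    return min(2 * read_len - t, t)


def summarize(sam_lines, read_len=READ_LEN):
    """Return (pairs counted, pairs whose mates overlap) from SAM text lines."""
    total = overlapping = 0
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        fields = line.rstrip("\n").split("\t")
        flag, tlen = int(fields[1]), int(fields[8])
        # Keep properly paired records (flag bit 0x2) and count each pair once
        # by using only the mate with a positive TLEN.
        if not flag & 0x2 or tlen <= 0:
            continue
        total += 1
        if pair_overlap(tlen, read_len) > 0:
            overlapping += 1
    return total, overlapping


# Made-up records for illustration only: one pair with TLEN 150, one with TLEN 300.
demo = [
    "@HD\tVN:1.0",
    "r1\t99\tchr1\t100\t60\t105M\t=\t145\t150\t*\t*",
    "r1\t147\tchr1\t145\t60\t105M\t=\t100\t-150\t*\t*",
    "r2\t99\tchr1\t500\t60\t105M\t=\t695\t300\t*\t*",
    "r2\t147\tchr1\t695\t60\t105M\t=\t500\t-300\t*\t*",
]
print(summarize(demo))  # (2, 1)
```

To run it on real data, one option is to replace the demo list with sys.stdin and pipe the output of samtools view PEalignment.bam into the script. Dividing the second number by the first gives the fraction of overlapping pairs.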

    • #3
      Originally posted by volks
      In the SAM/BAM file, the TLEN column (column 9) gives the template length. If its absolute value is smaller than 2*105 bp, the ends overlap (assuming there are no indels).

      Try this to see the template length distribution of the first million reads: samtools view PEalignment.bam | head -n 1000000 | cut -f 9 | sort -n | uniq -c

      Why would you want to generate fragments?

      Thanks for your answer. With the TLEN column I could manage it.

      Regarding your question, the problem with having too much overlap is that I lose the advantages of a paired-end experiment, for example the detection of structural variants in the genome between the pairs. So if I generated 75-base fragments I would have less overlap. I know it's better to work with longer reads, but I thought this could be a solution. I also think there are software tools that can get this information even if the pairs overlap, but I don't know yet which ones. Do you have any idea?
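
The effect of shorter reads on overlap can be estimated from the TLEN values before re-running anything. This is a rough sketch under two assumptions: reads would be trimmed from the 3' end, so the template length stays the same, and indels are ignored. The function name `fraction_overlapping` and the template lengths below are made up for illustration.

```python
def fraction_overlapping(template_lengths, read_len):
    """Share of pairs whose mates would overlap at the given read length."""
    # TLEN is signed and is 0 when unavailable, so drop zeros and take absolute values.
    usable = [abs(t) for t in template_lengths if t != 0]
    if not usable:
        return 0.0
    # Mates overlap when the template is shorter than the two reads combined.
    return sum(1 for t in usable if t < 2 * read_len) / len(usable)


# Made-up template lengths for illustration only.
tlens = [120, 160, 190, 220, 260]
for read_len in (105, 75, 50):
    print(read_len, fraction_overlapping(tlens, read_len))
```

With these invented values, the overlapping share drops from 3 of 5 pairs at 105 bp to 1 of 5 at 75 bp and none at 50 bp. Real data would use the TLEN column extracted with the cut -f 9 command quoted above.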
      Last edited by chariko; 04-28-2011, 11:57 PM.
