
  • Question about alignment of reads to duplicate regions

    Hello,

    Perhaps you can help me out with an issue I am having. In my current project I am trying to identify a mutated gene containing a deletion. The mutated genome was sequenced using next-generation paired-end sequencing and the reads were aligned using Bowtie. I have identified a candidate gene; however, the reference genome shows that the candidate gene has a duplicate within 200 kilobases.

    My question is to what degree this duplicate could be affecting the alignment. Could the observed coverage gap in the candidate be caused solely by the effect the duplicate has on the alignment?

    Thanks

  • #2
    Why not just simulate some reads covering that area with and without the deletion, align them, and see if the duplicate is screwing with their mappability?
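
A minimal sketch of the read-simulation step, under loud assumptions: the region is a random toy sequence rather than the real locus, reads are error-free and single-end (the original data were paired-end), and the helper names `apply_deletion`, `simulate_reads` and `to_fastq` are made up for illustration. The resulting FASTQ text would still have to be aligned with the same aligner and settings as the real data.

```python
import random

def apply_deletion(seq, start, length):
    """Return seq with `length` bases removed beginning at 0-based `start`."""
    return seq[:start] + seq[start + length:]

def simulate_reads(seq, read_len, n_reads, seed=0):
    """Sample error-free single-end reads uniformly from seq.

    Returns (name, read) tuples; the name records the true start position.
    """
    rng = random.Random(seed)
    reads = []
    for i in range(n_reads):
        start = rng.randrange(len(seq) - read_len + 1)
        reads.append((f"read{i}_pos{start}", seq[start:start + read_len]))
    return reads

def to_fastq(reads):
    """Format reads as FASTQ text with a constant placeholder quality."""
    return "".join(f"@{name}\n{s}\n+\n{'I' * len(s)}\n" for name, s in reads)

# Toy region standing in for the candidate locus (random sequence, not real data).
rng = random.Random(1)
region = "".join(rng.choice("ACGT") for _ in range(2000))
mutant = apply_deletion(region, start=900, length=20)  # hypothetical 20 bp deletion

wt_reads = simulate_reads(region, read_len=100, n_reads=200, seed=2)
mut_reads = simulate_reads(mutant, read_len=100, n_reads=200, seed=3)
fastq_text = to_fastq(mut_reads)
```

Because each read name carries its true origin, the aligned positions can afterwards be compared against where the reads actually came from.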



    • #3
      An exact duplicate with default alignment settings (return only the best hit, and randomly choose one if there are multiple equally good hits) would result in reads that align to either copy of the candidate being assigned at random between the two copies. Ignoring all other factors (contamination, sequencing errors, sequences elsewhere in the genome, insufficient coverage, etc.), you would expect the aligned reads to show a deletion signal in both your candidate and the duplicate if they have the same sequence around the deletion region.
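
A toy model of that expectation, not a description of what any particular aligner does: two identical copies, a deletion in the candidate only, and every read assigned to one of the two reference copies by a fair coin flip. The function name, copy length, deletion coordinates and read counts are all illustrative assumptions, and reads spanning the deletion junction are ignored for simplicity.

```python
import random

def simulate_depth(copy_len=5000, del_start=2000, del_end=3000,
                   read_len=100, reads_per_copy=5000, seed=0):
    """Per-base depth at two identical loci under random read assignment."""
    rng = random.Random(seed)
    depth = {"candidate": [0] * copy_len, "duplicate": [0] * copy_len}
    for source in ("candidate", "duplicate"):
        for _ in range(reads_per_copy):
            start = rng.randrange(copy_len - read_len + 1)
            end = start + read_len
            # The mutated candidate cannot yield reads overlapping the deleted
            # bases (junction-spanning reads are ignored in this model).
            if source == "candidate" and start < del_end and end > del_start:
                continue
            # Both reference copies are equally good hits: pick one at random.
            target = rng.choice(("candidate", "duplicate"))
            for pos in range(start, end):
                depth[target][pos] += 1
    return depth

def mean(values):
    return sum(values) / len(values)

depth = simulate_depth()
ratios = {
    locus: mean(d[2200:2800]) / mean(d[500:1500])  # inside deletion vs flank
    for locus, d in depth.items()
}
```

Under these assumptions both loci end up with roughly half the flanking depth inside the deleted interval, which is the "deletion signal in both" described above. A strong bias toward one copy in real data would suggest the assignment was not random.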



      • #4
        How big is the deletion, relative to the library size? Are we talking indel-sized, or deficiency-sized? dpryan's suggestion works for deletions contained within a read, but if it's a large chunk, larger than the library, then you'd be simulating coverage differences, which would give you whatever answer you want.

        I agree with dcameron that you should see the decrease in coverage around the deletion in both loci with the random assignment of perfect alignments to both, assuming the "duplicate" is a perfect sequence match. However, I've seen bowtie2 alignments (SE, not PE) with a strong preference towards the leftmost occurrence of a duplicated sequence in a reference genome, suggesting it wasn't really random assignment.

        Perhaps local SNPs near the flanking regions at either locus could help resolve whether the alignments were placed correctly or not.
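
One way the SNP idea could be sketched, assuming the two copies are the same length and line up base-for-base (no indels between them) and that each read's ungapped offset within the copy is already known. The function names and toy sequences are hypothetical.

```python
def diagnostic_positions(copy_a, copy_b):
    """Positions where the two (equal-length, pre-aligned) copies differ."""
    return [i for i, (a, b) in enumerate(zip(copy_a, copy_b)) if a != b]

def classify_read(read, offset, copy_a, copy_b):
    """Say which copy a read matches at the positions where the copies differ.

    `offset` is the 0-based position in the copy where the read starts.
    """
    votes_a = votes_b = 0
    for i in diagnostic_positions(copy_a, copy_b):
        j = i - offset  # index of this diagnostic base within the read
        if 0 <= j < len(read):
            votes_a += read[j] == copy_a[i]
            votes_b += read[j] == copy_b[i]
    if votes_a > votes_b:
        return "A"
    if votes_b > votes_a:
        return "B"
    return "ambiguous"

# Toy copies differing at a single position (index 5).
copy_a = "ACGTACGTACGTACGT"
copy_b = "ACGTAGGTACGTACGT"

call_1 = classify_read(copy_b[2:10], 2, copy_a, copy_b)   # covers the SNP
call_2 = classify_read(copy_a[8:16], 8, copy_a, copy_b)   # misses the SNP
```

Reads that overlap no differing position stay ambiguous, so this only helps where the copies actually diverge near the deletion.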



        • #5
          Originally posted by andylemire:
          How big is the deletion, relative to the library size? Are we talking indel-sized, or deficiency-sized? dpryan's suggestion works for deletions contained within a read, but if it's a large chunk, larger than the library, then you'd be simulating coverage differences, which would give you whatever answer you want.
          Good point! I should have mentioned that I was assuming a smallish deletion.
