  • Aligning to HIV reference

    Hi everyone,

    First-time poster here.

    I'm trying to align some Illumina paired reads (88bp) to the HIV1 reference (NC_001802.1, ~10k bp) using Bowtie2. However, I found that these reads only align to the first half of the reference (see attachment).

    This is consistent across aligners (Bowtie2 & BWA), so I suspect it might be a characteristic of HIV. Has anyone come across similar findings, or does anyone have suggestions?

    Thanks!

  • #2
    Originally posted by hifer

    [...]I'm suspecting that it might be a characteristic of HIV. Has anyone come across any similar findings or suggestions?

    Thanks!
    Unlikely; the only ambiguous parts of the sequence are the LTRs.
    1. Check your sequencing results manually to see whether you have reads from the other half.
    2. Check your alignment statistics: do most of the reads align?
    3. Check whether the genome sequence you used to build the index was complete.
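
    Checks 1 and 2 can be roughed out directly from the SAM output. The following is a minimal sketch, not a replacement for a dedicated tool: `summarize_sam` and its `bin_size` parameter are hypothetical names introduced here for illustration, and the function assumes a single-reference, uncompressed SAM file read as text lines. It counts how many primary reads mapped and bins the mapped reads by their midpoint on the reference, so a gap over the second half of the genome shows up as missing bins.

```python
import re
from collections import Counter

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def summarize_sam(lines, bin_size=1000):
    """Return (mapped, total, reads_per_bin) for an iterable of SAM text lines.

    Hypothetical helper: bins are keyed by their 0-based start coordinate.
    """
    mapped = total = 0
    bins = Counter()
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        flag, pos, cigar = int(fields[1]), int(fields[3]), fields[5]
        if flag & 0x900:
            continue  # skip secondary (0x100) and supplementary (0x800) records
        total += 1
        if flag & 0x4 or cigar == "*":
            continue  # read is unmapped
        mapped += 1
        # Length of the alignment on the reference, from the CIGAR string
        ref_len = sum(int(n) for n, op in CIGAR_RE.findall(cigar)
                      if op in REF_CONSUMING)
        midpoint = (pos - 1) + ref_len // 2  # POS is 1-based in SAM
        bins[midpoint // bin_size * bin_size] += 1
    return mapped, total, dict(sorted(bins.items()))
```

    To use it, pass an open file handle, e.g. `summarize_sam(open("alignment.sam"))`. A low mapped fraction points at the reads or the parameters; a high mapped fraction with empty bins points at the library or the index.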

    Tomek



    • #3
      What kind of sample was used for library preparation? Maybe it was an amplicon library?
      (In that case, only part of the HIV genome was sequenced.)



      • #4
        What parameters are you passing to bowtie when doing the alignment?
        I'm curious because I am doing something similar, aligning deep sequencing reads from a v3 portion of env to NC_001802.

        If I use the default bowtie2 parameters (--end-to-end mode), I get no alignments at all. However, if I align in --local mode, the middle of my reads aligns but about a quarter of the read is soft-clipped at each end.

        Code:
        bowtie2 -x ../references/NC_001802 --local -N 1 -f -U sequences.fasta -S alignment.sam

        I'm trying to figure out if this means I did not properly clean up the adapters on each end, or if there is just too much variation from the reference that bowtie is never going to be able to do a good job.
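
        One rough way to tell the two explanations apart is to look at what is being soft-clipped. The sketch below is only a heuristic under stated assumptions: `soft_clips` and `clipped_prefixes` are hypothetical helper names, and the input is assumed to be (CIGAR, SEQ) pairs taken from columns 6 and 10 of the SAM file. If the same clipped sequence dominates the tally, leftover adapter or primer is the more likely explanation; if the clipped bases vary from read to read, divergence from the reference is more plausible.

```python
import re
from collections import Counter

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clips(cigar):
    """Return (left, right) soft-clipped lengths for one CIGAR string."""
    # Hard clips (H) can sit outside soft clips, so drop them first
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(cigar) if op != "H"]
    left = ops[0][0] if ops and ops[0][1] == "S" else 0
    right = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    return left, right


def clipped_prefixes(records, k=12):
    """Count the first k bases of every left soft clip that is at least k long.

    records: iterable of (cigar, seq) pairs as stored in the SAM file.
    """
    counts = Counter()
    for cigar, seq in records:
        left, _ = soft_clips(cigar)
        if left >= k:
            counts[seq[:k]] += 1
    return counts
```

        Note that SAM stores reverse-strand reads reverse-complemented, so "left" here means the left end as written in the SAM record, not necessarily the 5' end of the original read.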



        • #5
          To follow up on this notion of using bowtie2 to align sequences to an HIV-1 reference:

          My experience is that aligning HIV-1 sequences using Bowtie 2 does not work well. I suppose it depends on the reference and on the region being sequenced, but the diversity of HIV genomes, on top of sequencing errors, is often too great for the seed-and-extend techniques used by Bowtie 2 to handle. A colleague suggested using Smith-Waterman alignment or profile HMMs.
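
          For readers unfamiliar with it, here is a minimal sketch of plain Smith-Waterman local alignment. It is not the codon-aware aligner mentioned in this post, and the scoring values (`match=2`, `mismatch=-1`, `gap=-2`, linear gap penalty) are arbitrary assumptions chosen for illustration. It fills the full dynamic-programming matrix rather than relying on exact seed matches, which is why it tolerates scattered mismatches, at the cost of time proportional to the product of the two sequence lengths.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment.

    Uses a linear gap penalty; scoring values are illustrative assumptions.
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]  # H[i][j]: best score ending at a[i-1], b[j-1]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: scores never drop below zero
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero is reached
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))
```

          For real data, an established implementation is a better choice than this pure-Python version, which is far too slow for deep sequencing runs.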

          Below is an image taken from IGV showing the diversity of the gp120 sequences I am working with. (Each colored bar represents a position where the alignment differs from the reference.) These sequences were aligned to NC_001802 using a codon-aware Smith-Waterman aligner. It is this diversity, and divergence from the reference, that makes it so hard for bowtie2 to do the alignment.



          • #6
            Hi everyone,

            Sorry for the hiatus after posting this. Just as an update, I've managed to figure out why my reads are only aligning to the first half of HIV1: it's due to the way the library was prepared.

            Apparently my collaborators only sequenced the gag-pol region of the virus (as this region is highly conserved), and therefore the reads only align to this particular region of the viral genome. No surprises after all!

            Thanks everyone for their input!
