  • Bowtie: Reads fail to align

    I am attempting to align a transcriptome sequenced on a HiSeq to a reference using Bowtie. The parameters I am using are:

    bowtie -S -p 2 reference -q --phred64-quals

    None of the reads align. They also fail to align if I omit the quality parameter, or if I add modifications such as the --ff suggested in related postings.

    I have checked for adapter contamination and found very little. The reads were cleaned using ngs backbone, although I am not using that pipeline for anything downstream of cleaning. Some people have reported that their reads do not align as paired-end but do align as single-end; my reads do not align in either case. Around 30% of the reads align in bwa.

    Does anyone have any idea why this is the case?

    Thanks!
    Sarah

  • #2
    The first ten lines of one of the reads files look like this, if that helps:

    @HWI-ST611_0200:5:1101:1394:2137#0/1
    AATATTTCGATTCGCTATATTCCTCAAATCAAGGTTACAATTTATCAGTCCGGACCAACAAAGCCAAGATCAAGCGTGACACGGGCACACATATGGCAGC
    +
    Z^_ccccc\\S`cfgfa_daK``K[Q^bX[d[bRI^^XYSbBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
    @HWI-ST611_0200:5:1101:1465:2147#0/1
    AATTTTTTCTTAGGTTTGGTCTGCACCACAGTTCGCCTCCAAAATTTTTGATTTTGCGGCCCAACGACTTTTTTTTTAAAGCACGCCTCCAACACAAGCT
    +
    ___cccc]ecgc[Yb[bPbY[^d[bJP^b^^bc]Z_c`_BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
    @HWI-ST611_0200:5:1101:1348:2147#0/1
    AAGTCTTATTGTTATAGGGACTTCACCTGATTTACTCACTTCGAATGAATCAACACATCACAACGCTCACGCCTGCGGCGCAACGTCAACGCGGAGACGC
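
One quick sanity check on an excerpt like this is whether the quality characters are consistent with the --phred64-quals setting. The sketch below is only a heuristic: the function name and the cut-off characters (59 and 74) are assumptions based on the commonly quoted ASCII ranges for the FASTQ variants, and it is not how Bowtie itself interprets qualities.

```python
def guess_phred_offset(quality_strings):
    """Guess the ASCII offset (33 or 64) used by FASTQ quality strings.

    Returns 33, 64, or None when the characters fit both encodings.
    Heuristic only: thresholds are assumptions, not a format specification.
    """
    codes = [ord(ch) for q in quality_strings for ch in q.strip()]
    if not codes:
        return None
    if min(codes) < 59:   # characters below ';' do not occur in +64 encodings
        return 33
    if max(codes) > 74:   # characters above 'J' are unusual for +33 encodings
        return 64
    return None


# Leading part of the two quality lines posted above (truncated after a few Bs).
quals = [
    r"Z^_ccccc\\S`cfgfa_daK``K[Q^bX[d[bRI^^XYSbBBBBBBBB",
    r"___cccc]ecgc[Yb[bPbY[^d[bJP^b^^bc]Z_c`_BBBBBBBB",
]
print(guess_phred_offset(quals))
```

For the excerpt above this points to a +64 encoding, which agrees with the --phred64-quals option used in the original command.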


    • #3
      Giving us the reads doesn't help much since, without the reference, we cannot tell whether they should map. What would be helpful is the full and exact command line you are using. I would expect a line similar to

      bowtie -B -p 2 reference read.file

      Also helpful would be the BWA command line that generates the successful hits.


      • #4
        This is the full command in bowtie:

        ./bowtie -S -p 2 eletrans -q --phred64-quals -1 lb_gun1.pl_illumina.sm_gun1.sfastq -2 lb_gun2.pl_illumina.sm_gun2.sfastq gun2elebowtie.sam

        The bwa command is:
        ./bwa bwasw -t 2 eletrans lb_gun1.pl_illumina.sm_gun1.sfastq > gun2elebwa.sam

        (These are clearly single-end alignments in bwa; however, single-end mode does not improve the alignments for bowtie.)


        • #5
          No error messages?

          I always put my options (e.g., -S, -p, -q, --phred64-quals) before the reference. But I am not 100% sure that is required.

          How about an 'ls -l' of the *.ebwt files, just to make sure that the reference is properly formatted?


          • #6
            No error messages. The output looks like this:

            # reads processed: 73455581
            # reads with at least one reported alignment: 0 (0.00%)
            # reads that failed to align: 73455581 (100.00%)
            No alignments

            Here are the reference files:

            -rw-r--r-- 1 sarah staff 7365819 May 10 10:24 eletran.1.ebwt
            -rw-r--r-- 1 sarah staff 1312752 May 10 10:24 eletran.2.ebwt
            -rw-r--r-- 1 sarah staff 37151 May 10 10:23 eletran.3.ebwt
            -rw-r--r-- 1 sarah staff 2625496 May 10 10:23 eletran.4.ebwt
            -rw-r--r-- 1 sarah staff 7365819 May 10 10:24 eletran.rev.1.ebwt
            -rw-r--r-- 1 sarah staff 1312752 May 10 10:24 eletran.rev.2.ebwt


            • #7
              The fact that the command says 'eletrans' and the reference listed here is 'eletran' is not significant; I had named it differently at one point.


              • #8
                Two suggestions.

                1) What happens if you just treat your reads as single-end; e.g.,

                Code:
                ./bowtie -S -p 2 -q --phred64-quals eletrans lb_gun1.pl_illumina.sm_gun1.sfastq,lb_gun2.pl_illumina.sm_gun2.sfastq gun2elebowtie.sam

                2) What happens if you use the '--verbose' switch?


                • #9
                  This is what happens if you do a single end alignment:

                  ./bowtie -S -p 2 eletrans -q --phred64-quals lb_gun1.pl_illumina.sm_gun1.sfastq lb1gun2elebowtie4.sam
                  # reads processed: 73503003
                  # reads with at least one reported alignment: 478942 (0.65%)
                  # reads that failed to align: 73024061 (99.35%)
                  Reported 478942 alignments to 1 output stream(s)


                  I will try the --verbose switch
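
A side observation: the paired-end run in post #6 reported 73455581 reads processed, while this single-end run on the first file reports 73503003. If both numbers come from the same cleaned files, the two mate files may no longer contain the same reads in the same order, which can happen when a cleaning step drops reads from only one file. That is a guess, not a diagnosis. A minimal sketch for checking it follows; the function names are made up for illustration, and it assumes plain 4-line FASTQ records with /1 and /2 name suffixes as in the excerpt in post #2.

```python
from itertools import zip_longest


def read_names(fastq_lines):
    """Yield the read name of each 4-line FASTQ record, without '@' and mate suffix."""
    non_empty = (line.strip() for line in fastq_lines if line.strip())
    for i, line in enumerate(non_empty):
        if i % 4 == 0:  # header line of a record
            name = line.split()[0]
            if name.startswith("@"):
                name = name[1:]
            if name.endswith(("/1", "/2")):
                name = name[:-2]
            yield name


def first_mismatch(mate1_lines, mate2_lines):
    """Return (record_index, name1, name2) for the first out-of-sync pair, else None.

    A name of None means that file ran out of records first.
    """
    pairs = zip_longest(read_names(mate1_lines), read_names(mate2_lines))
    for index, (name1, name2) in enumerate(pairs):
        if name1 != name2:
            return index, name1, name2
    return None
```

To use it, open the two files named in post #4 and pass the file objects, for example `first_mismatch(open("lb_gun1.pl_illumina.sm_gun1.sfastq"), open("lb_gun2.pl_illumina.sm_gun2.sfastq"))`. A result of None means the names agree record by record.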


                  • #10
                    Originally posted by sasignor:
                    "The fact that the command says 'eletrans' and the reference listed here is 'eletran' is not significant, I had named it differently at one point."
                    Ah, I had not noticed that. Just why do you think that the change is not significant? With your command line Bowtie should be looking for eletrans.*.ebwt files.
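
To rule out the basename question quickly, one could check that all six index files exist for whichever name is passed to Bowtie. The sketch below is a hypothetical helper, not part of Bowtie; the suffix list is taken from the 'ls -l' listing in post #6 and does not cover large indexes, which use a different extension.

```python
import os

# Suffixes of a Bowtie 1 index, as they appear in the 'ls -l' listing in post #6.
EBWT_SUFFIXES = (".1.ebwt", ".2.ebwt", ".3.ebwt", ".4.ebwt",
                 ".rev.1.ebwt", ".rev.2.ebwt")


def missing_index_files(basename, directory="."):
    """Return the index files expected for `basename` that are absent or empty."""
    missing = []
    for suffix in EBWT_SUFFIXES:
        path = os.path.join(directory, basename + suffix)
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            missing.append(os.path.basename(path))
    return missing
```

Calling `missing_index_files("eletrans")` and `missing_index_files("eletran")` in the directory where Bowtie is run should both return an empty list if the two indexes really are complete.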


                    • #11
                      It's not significant because I am using the correct name for the reference. I indexed the reference on two occasions, and the same ebwt files exist for both eletrans and eletran. Whether the command I post says eletrans or eletran depends on which day the command was issued.


                      • #12
                        Another suggestion, since your single-end run does seem to work, at least to some extent: try the switch '-v 3'. That should allow up to 3 mismatches. It is possible that your reads simply do not map very well and that BWA was better at picking up the poor matches.

                        BTW: If it appears that I am grasping at straws here then, yes, I am. I do not have a good idea of what is happening but rather am going through some troubleshooting steps that I would take.


                        • #13
                          See all those Bs in your quality string? B is the lowest possible quality. It could be that all that sequence is inaccurate, and that's why it won't map.


                          • #14
                            It's true that the reads shown here have a lot of B's; however, when I check the whole file, less than 1% of the bases have this quality score.
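
For anyone wanting to reproduce that kind of check, here is a minimal sketch of how the share of 'B' qualities could be counted. It is not the script used for the figure above; the function name is hypothetical, and it assumes 4-line FASTQ records. It also reports how many reads end in a 'B', since the excerpt in post #2 shows the Bs as long runs at the 3' end.

```python
def b_quality_stats(fastq_lines, flag="B"):
    """Return (fraction of bases with quality `flag`, fraction of reads ending in `flag`).

    Assumes 4-line FASTQ records; blank lines are skipped.
    """
    total_bases = flagged_bases = total_reads = flagged_reads = 0
    position = 0  # 0 = header, 1 = sequence, 2 = '+', 3 = quality
    for line in fastq_lines:
        line = line.rstrip("\r\n")
        if not line:
            continue
        if position == 3:
            total_reads += 1
            total_bases += len(line)
            flagged_bases += line.count(flag)
            if line.endswith(flag):
                flagged_reads += 1
        position = (position + 1) % 4
    if total_reads == 0:
        return 0.0, 0.0
    return flagged_bases / total_bases, flagged_reads / total_reads
```

The function reads line by line, so it can be given an open file object for a large reads file without loading it into memory.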


                            • #15
                              What do the k-mer plots and sequence overrepresentation plots in FastQC say about your reads? Maybe you do have some problems with unclipped home-brew barcodes or something similar... Do you have in-depth knowledge of the whole sample and read processing pipeline?
                              If there is no obvious problem with the reads, what happens if you run them through a de novo assembler and BLAST (a few of) the resulting contigs? Maybe you got the wrong files from the sequencing provider, and have reads for a different species?
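
If FastQC is not at hand, a crude first look at overrepresented k-mers can be had with a few lines of code. This is only a rough stand-in under simple assumptions, not what FastQC computes, and the function name and default values are made up for illustration. A k-mer that dominates the counts, especially at read starts, could hint at leftover barcodes or adapters.

```python
from collections import Counter


def top_kmers(sequences, k=8, n=5):
    """Return the n most frequent k-mers over all sequences as (kmer, count) pairs."""
    counts = Counter()
    for seq in sequences:
        seq = seq.strip().upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts.most_common(n)
```

To look only at read starts, pass the first few bases of each read instead of the full sequence, for example `top_kmers((s[:12] for s in seqs), k=6)`, where `seqs` is any iterable of sequence lines.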
