  • Different number of sequences in a paired-end alignment with Bowtie2?

    Hello,

    I am very new to using Bowtie2, and would love some help!

    I am hoping to do a paired-end alignment. I originally had the same number of sequences in each file. First, I did a quality trim using FASTX to remove low-quality sequence at the ends of the reads. Then, when I tried to do my paired-end alignment, I got an error message saying that I had a different number of sequences in the two files. I'm guessing that perhaps one (or more) of my sequences was of low quality all the way through and was removed altogether...?

    What is the best way around this problem? Is there an option in Bowtie2 to deal with an unequal number of sequences?

    Thanks so much!

  • #2
    Just use a different read trimmer, for example trimmomatic or trim_galore. Fastx tools are known to cause problems like this.

    • #3
      Originally posted by jesstilla
      What is the best way around this problem? Is there an option in Bowtie2 to deal with an unequal number of sequences?
      dpryan has the correct ultimate solution: don't use FASTX directly, because it destroys any pairing. However, to your specific question the answer is "No, not if you want to treat your sequences as pairs." On the other hand, if you wish to treat your reads as single-end, you can use the '-U' parameter to pass the reads to bowtie2.
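
As a rough illustration of the two modes described above, here is a small Python sketch that only builds a bowtie2 argument list; it does not run anything. The function name `bowtie2_args` and the file names in the usage note are hypothetical. The flags `-x`, `-1`, `-2`, `-U` and `-S` are standard bowtie2 options, and `-U` takes a comma-separated list of files.

```python
def bowtie2_args(index, sam_out, mate1=None, mate2=None, unpaired=None):
    """Build a bowtie2 argument list for either paired or single-end input.

    Hypothetical helper: it only assembles arguments, nothing is executed.
    """
    args = ["bowtie2", "-x", index]
    if unpaired:
        if mate1 or mate2:
            raise ValueError("this sketch handles one mode at a time")
        # Single-end mode: every file is passed through -U, comma-separated.
        args += ["-U", ",".join(unpaired)]
    elif mate1 and mate2:
        # Paired mode: both files must hold the same reads in the same order.
        args += ["-1", mate1, "-2", mate2]
    else:
        raise ValueError("give both mate files, or a list of unpaired files")
    args += ["-S", sam_out]
    return args
```

For example, `bowtie2_args("ref", "out.sam", unpaired=["fwd.fastq", "rev.fastq"])` treats both files as single-end reads, which gives up the pairing information.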

      • #4
        Originally posted by jesstilla
        What is the best way around this problem? Is there an option in Bowtie2 to deal with an unequal number of sequences?
        For your first question, you need to sync your read pairs; for your second, I don't think so. The solution is to re-pair your reads, and there are a number of posts on seqanswers about that topic. The other comments about using a different trimmer are ways to avoid this in the future (prinseq also keeps pairs), but that doesn't solve your current problem. Retrimming your reads with another tool because they are out of sync seems like a silly approach to me, because you've already spent time trimming. Just sync your paired reads now, and then you'll be able to use both the read pairs and the singletons created by trimming.
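
One way to picture the re-pairing step is the minimal Python sketch below. It is not any particular tool's method, and it rests on assumptions: each FASTQ record is exactly four lines, mates share a read name once a trailing `/1` or `/2` (or anything after the first space) is dropped, and both files fit in memory. The names `read_id`, `parse_fastq` and `repair` are hypothetical.

```python
def read_id(header):
    """Return the read name shared by both mates (assumed naming scheme)."""
    name = header[1:].split()[0]      # drop '@' and any description text
    if name.endswith(("/1", "/2")):   # drop an old-style mate suffix
        name = name[:-2]
    return name


def parse_fastq(lines):
    """Turn FASTQ lines into a list of (read_id, 4-line record) tuples."""
    rows = [ln.rstrip("\n") for ln in lines]
    if len(rows) % 4:
        raise ValueError("FASTQ line count is not a multiple of 4")
    return [(read_id(rows[i]), rows[i:i + 4]) for i in range(0, len(rows), 4)]


def repair(lines1, lines2):
    """Split two out-of-sync FASTQ inputs into synced pairs and singletons."""
    recs1 = parse_fastq(lines1)
    recs2 = dict(parse_fastq(lines2))
    ids1 = {rid for rid, _ in recs1}
    paired1, paired2, single1 = [], [], []
    for rid, rec in recs1:
        if rid in recs2:
            # Mate found: emit both records at the same position.
            paired1.append(rec)
            paired2.append(recs2[rid])
        else:
            single1.append(rec)
    single2 = [rec for rid, rec in recs2.items() if rid not in ids1]
    return paired1, paired2, single1, single2
```

The two paired lists come out in the same order, so written to two files they could go to bowtie2 with `-1` and `-2`, while the singletons could be aligned separately with `-U`.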

        • #5
          Thanks so much for getting back to me!

          I used Trimmomatic to trim my sequences, and I unfortunately had the same problem when I tried to use Bowtie2 afterward. The error message says:
          fewer reads in file specified with -2 than in file specified with -1
          terminate called after throwing an instance of 'int'
          bowtie2-align died with signal 6 (ABRT) (core dumped)

          I'm not sure what I'm doing wrong, but I included the arguments I used in case anything jumps out at you.

          For Trimmomatic:
          bsub java -jar /cluster/tufts/dopmanlab/Jessie2/nexteraJLM/Trimmomatic-0.32/trimmomatic-0.32.jar PE -phred33 Sample_ACB5.R1.fastq Sample_ACB5.R2.fastq ACB5_trim_pairfor_prac.fastq ACB5_trim_pairrev_prac.fastq ACB5_trim_unpairfor_prac.fastq ACB5_trim_unpairrev.fastq TRAILING:30

          For Bowtie2:
          bsub -o <output file> -e <error file> /cluster/tufts/ngsp/ngsp/bowtie2-2.1.0/bowtie2 --very-sensitive --phred33 --no-unal -x <name of reference> -1 <forward file> -2 <reverse file> -I 300 -X 550 -S <SAM output file>

          Thanks a bunch!

          • #6
            Hmmm.... did you check to make sure they had the same number of reads before trimming?

            • #7
              Well, Bowtie2 will do the alignment when I use the untrimmed files, so I assumed that there were the same number of sequences in each file...

              • #8
                In Linux, you can do:
                wc -l Sample_ACB5.R1.fastq Sample_ACB5.R2.fastq

                And it will tell you how many lines there are in each of them. But I'm not sure what you're doing with giving trimmomatic 6 files at once. I've not used it, but normally if you want a program to understand that data is paired, you can only give it 2 files at a time.
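
Since `wc -l` reports lines rather than reads, the count has to be divided by four. A minimal Python sketch of the same check follows, assuming plain four-line FASTQ records (no wrapped sequence lines); the function names `count_reads` and `same_read_count` are hypothetical.

```python
def count_reads(lines):
    """Count FASTQ records in an iterable of lines (4 lines per record)."""
    n = sum(1 for ln in lines if ln.strip())  # ignore blank lines
    if n % 4:
        raise ValueError("line count is not a multiple of 4; file may be truncated")
    return n // 4


def same_read_count(lines1, lines2):
    """Return True if both inputs contain the same number of reads."""
    return count_reads(lines1) == count_reads(lines2)
```

Both functions accept open file handles, so passing the handles for Sample_ACB5.R1.fastq and Sample_ACB5.R2.fastq would compare the two untrimmed files. Equal counts are necessary for paired alignment but do not prove that the reads are in the same order.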

                • #9
                  Originally posted by jesstilla
                  Well, Bowtie2 will do the alignment when I use the untrimmed files, so I assumed that there were the same number of sequences in each file...
                  You are probably correct in that case, though it's odd that you had a problem after trimming with a program that is aware of paired reads. I would take the approach suggested above, which is to check the read counts in each file before and after trimming. It may be that the command is not correct, but I'm not a Trimmomatic user, so I can't say. I think the best approach would be to check the read numbers first and then investigate further.

                  • #10
                    I figured it out! There was a misordering of the output files in the trimmomatic script. Thanks to everyone for your help!
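
For anyone who hits the same thing: in PE mode Trimmomatic expects its four output files in the order forward paired, forward unpaired, reverse paired, reverse unpaired. The command in post #5 lists both paired files first and then both unpaired files, which looks like the misordering in question. Below is a small Python sketch that makes the order harder to get wrong by using keyword-only arguments. The function name `trimmomatic_pe_args` is hypothetical, and it only builds the argument list.

```python
def trimmomatic_pe_args(*, jar, in_fwd, in_rev,
                        paired_fwd, unpaired_fwd,
                        paired_rev, unpaired_rev, steps):
    """Build a Trimmomatic PE argument list with outputs in the expected order.

    Hypothetical helper: keyword-only arguments mean the caller names each
    file, and this function fixes the positional order.
    """
    return [
        "java", "-jar", jar, "PE", "-phred33",
        in_fwd, in_rev,            # two inputs
        paired_fwd, unpaired_fwd,  # forward outputs: paired, then unpaired
        paired_rev, unpaired_rev,  # reverse outputs: paired, then unpaired
        *steps,                    # trimming steps, e.g. "TRAILING:30"
    ]
```

With the file names from post #5, the output part of the command would read ACB5_trim_pairfor_prac.fastq ACB5_trim_unpairfor_prac.fastq ACB5_trim_pairrev_prac.fastq ACB5_trim_unpairrev.fastq, followed by TRAILING:30.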

                    • #11
                      Originally posted by Brian Bushnell
                      In Linux, you can do:
                      wc -l Sample_ACB5.R1.fastq Sample_ACB5.R2.fastq

                      And it will tell you how many lines there are in each of them. But I'm not sure what you're doing with giving trimmomatic 6 files at once. I've not used it, but normally if you want a program to understand that data is paired, you can only give it 2 files at a time.
                      Brian, you *really* need to use Trimmomatic at some point. In paired-end mode, Trimmomatic does indeed require 6 files: two input and four output.

                      • #12
                        Originally posted by westerman
                        Brian. You *really* need to use Trimmomatic at some point. Indeed Trimmomatic requires 6 files -- two input and four output.
                        Haha, silly me. I guess that makes sense. For some reason everyone at JGI uses interleaved files for everything.
