  • cjp
    Member
    • Jun 2011
    • 58

    #16
    The headers look a bit different than normal - normally they end in /1 or /2

    e.g.,

    @HWUSI-EAS611_14:8:1:1489:931/1
    and

    @HWUSI-EAS611_14:8:1:1489:931/2

    for the paired end data - note the spaces in yours probably cause problems as well.

    Chris
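A minimal sketch of the header comparison described above, assuming mates are identified either by a trailing /1 or /2 or by a space-separated comment field, as in newer Illumina headers. The function names and the second header in the usage note are made up for illustration.

```python
def mate_key(header: str) -> str:
    """Return the part of a FASTQ header that both mates should share."""
    # Drop the leading '@' and anything after the first whitespace.
    name = header.strip().lstrip("@").split()[0]
    # Drop an old-style /1 or /2 mate suffix if present.
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def same_pair(header1: str, header2: str) -> bool:
    """True if two headers appear to belong to the same read pair."""
    return mate_key(header1) == mate_key(header2)
```

For example, `same_pair("@HWUSI-EAS611_14:8:1:1489:931/1", "@HWUSI-EAS611_14:8:1:1489:931/2")` compares the two headers quoted above after stripping the mate suffix. A header with a space, such as the hypothetical `@READ:1:2:3 1:N:0:ATCACG`, is reduced to `READ:1:2:3`.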


    • rebrendi
      ng
      • May 2008
      • 78

      #17
      The files are as they were provided as output from Illumina HiSeq 2000. Are you sure the file format is corrupted?


      • nickloman
        Senior Member
        • Jul 2009
        • 355

        #18
        These are Illumina 1.8 pipeline files which means they are Sanger quality encoded, so the correct Bowtie flag is --phred33-quals, which I think is the default for Bowtie anyway. Pairs look fine. My money is on a very large insert, try -X 1000 to start with and then see what the average pair distance is.
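One way to "see what the average pair distance is" is to read the template length (TLEN, column 9) from SAM output. The sketch below assumes the aligner was run with SAM output and that TLEN is filled in for paired alignments. The function name is hypothetical, and the filter on flags is a simplification.

```python
from statistics import mean, median


def pair_distances(sam_lines):
    """Collect one positive template length per aligned pair from SAM lines."""
    distances = []
    for line in sam_lines:
        if line.startswith("@"):          # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:               # not a full alignment record
            continue
        flag = int(fields[1])
        tlen = int(fields[8])
        is_paired = bool(flag & 0x1)      # read is part of a pair
        is_unmapped = bool(flag & 0x4)    # read itself is unmapped
        # TLEN is positive for the leftmost mate and negative for the other,
        # so keeping only positive values counts each pair once.
        if is_paired and not is_unmapped and tlen > 0:
            distances.append(tlen)
    return distances


def summarize(distances):
    """Return (mean, median) of the collected distances, or None if empty."""
    if not distances:
        return None
    return mean(distances), median(distances)
```

Passing an open SAM file to `pair_distances` and then calling `summarize` on the result gives a rough idea of whether the chosen -X value is cutting off a large share of pairs.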


        • rebrendi
          ng
          • May 2008
          • 78

          #19
          Originally posted by nickloman
          My money is on a very large insert, try -X 1000 to start with and then see what the average pair distance is.
          Yesss! Now it works. I've got 73% reads mapped. Looks like the issue is resolved. Thank you very much, all who contributed to this thread!


          • afields
            Junior Member
            • Mar 2012
            • 2

            #20
            Paired-end Solexa data mapping with Bowtie

            Hello all,

            I have a similar problem to rebrendi. I am mapping Illumina HiSeq reads to a reference genome with Bowtie. When I use the files before any quality filtering, 43.86% of my paired-end reads map to the reference. I then filtered by quality with the FASTX-Toolkit, which removed 29% of read 1 and 39% of read 2. When I mapped the filtered reads to the same reference genome, 16 million reads paired up (according to flagstat), but only ~600 mapped. Individually, over 80% of each read file maps to the reference, but together there is almost nothing. By increasing the maximum insert size (-X) from 600 to 2000 I have increased the number of reads mapping to ~3000, but this is still a long way from the millions of reads I expect. I ran a Bioanalyzer on my samples and the fragment sizes should be around 500 bp, so I did not expect a dramatic increase in mapped reads from a larger insert size. Does anyone have any recommendations about how to tweak Bowtie's parameters to get better mapping? Thanks!!


            • fkrueger
              Senior Member
              • Sep 2009
              • 627

              #21
              Does your quality trimming also discard sequences if they become too short? In that case the sequence-by-sequence order, which is required for paired-end alignments, might have gotten out of sync, and this could well explain such a dramatic drop in mapping efficiency.

              Alternatively, could it be that your fragment length is not as long as you expect? It is quite common for e.g. 2x100bp reads to completely overlap each other, like this:

              ------------------------------------> read 1
              <----------------------------------- read 2

              If reads are completely contained within each other, Bowtie 1 will regard the alignment as invalid (which is arguably not the most sensible thing to do...). To find out whether this is the case here you could either hard-trim all your sequences by 1bp on the 3' end, so that the reads do not start and end at the same position, like so:

              ----------------------------------->. read 1
              .<----------------------------------- read 2

              Or soft-trim by using the option --trim3 1.

              Good luck!
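The out-of-sync scenario raised at the start of this post can be checked directly. The sketch below assumes plain four-line FASTQ records and read names that match once a trailing /1 or /2 and any text after a space are ignored. The function names are hypothetical.

```python
from itertools import islice, zip_longest


def read_names(lines):
    """Yield the read name from every 4th line (the header of each record)."""
    for header in islice(lines, 0, None, 4):
        parts = header.strip().lstrip("@").split()
        name = parts[0] if parts else ""
        if name.endswith(("/1", "/2")):   # drop old-style mate suffix
            name = name[:-2]
        yield name


def first_out_of_sync(lines1, lines2):
    """Return the 0-based index of the first record whose names differ.

    Returns None if both inputs list the same names in the same order.
    A file that ends early counts as a mismatch at that record.
    """
    pairs = zip_longest(read_names(lines1), read_names(lines2))
    for index, (name1, name2) in enumerate(pairs):
        if name1 != name2:
            return index
    return None
```

Calling `first_out_of_sync(open(path1), open(path2))` on the two filtered files (the paths are placeholders) returns None when the files are still in step, and otherwise points to the first record where a filter dropped one mate but not the other.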


              • afields
                Junior Member
                • Mar 2012
                • 2

                #22
                Thanks for the post, fkrueger. After my initial poor mapping, I was afraid that the QC had disrupted the order of the sequences, so I ran a method that made certain that even if an entire sequence was discarded due to poor quality, the other information was not deleted, so the two files still have the same number of lines and stay in sync. Taking your suggestion, I looked over the first few lines of both files, and the reads are not reverse complements of each other, which leads me to believe that they are not likely to be overlapping sequences.
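Checking by eye for reverse complements only catches mates that overlap completely. A rough sketch of a programmatic check is shown below, assuming exact matches only (sequencing errors and adapter read-through are ignored) and a fragment at least as long as one read. The function names and the `min_len` default are illustrative choices.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def fully_overlapping(read1: str, read2: str) -> bool:
    """True if read 2 is exactly the reverse complement of read 1."""
    return reverse_complement(read2) == read1


def overlap_length(read1: str, read2: str, min_len: int = 5) -> int:
    """Longest exact overlap between the end of read 1 and the start of
    the reverse complement of read 2, or 0 if shorter than min_len."""
    rc2 = reverse_complement(read2)
    for k in range(min(len(read1), len(rc2)), min_len - 1, -1):
        if read1[-k:] == rc2[:k]:
            return k
    return 0
```

Running `overlap_length` over the first few thousand pairs and counting how many return a non-zero value would show whether partially overlapping mates are common, which a look at the first few lines cannot rule out.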
