  • #16
    The headers look a bit different than normal - normally they end in /1 or /2

    e.g.,

    @HWUSI-EAS611_14:8:1:1489:931/1
    and

    @HWUSI-EAS611_14:8:1:1489:931/2

    for the paired end data - note the spaces in yours probably cause problems as well.
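A minimal sketch of how the two header styles could be told apart, assuming the older style ends the read name in /1 or /2 and the CASAVA 1.8 style puts the mate number at the start of a second, space-separated field. The function name `split_read_name` and the CASAVA-style header in the usage note are hypothetical, not taken from this thread.

```python
def split_read_name(header):
    """Return (fragment_name, mate) from a FASTQ header line.

    mate is "1", "2", or None when no mate number can be found.
    """
    name = header.strip()
    if name.startswith("@"):
        name = name[1:]
    # Split off anything after the first space (empty string if no space).
    first, _, rest = name.partition(" ")
    # Older style: the mate number is a /1 or /2 suffix on the read name.
    if first.endswith("/1") or first.endswith("/2"):
        return first[:-2], first[-1]
    # CASAVA 1.8 style: the mate number leads the second field, e.g. "2:N:0:ATCACG".
    if rest[:2] in ("1:", "2:"):
        return first, rest[0]
    return first, None
```

For example, `split_read_name("@HWUSI-EAS611_14:8:1:1489:931/1")` gives the fragment name without the suffix plus mate "1", and a made-up header such as `"@HWI-ST1234:100:C0ABCACXX:1:1101:1234:5678 2:N:0:ATCACG"` gives mate "2".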

    Chris



    • #17
      The files are as they were provided as output from Illumina HiSeq 2000. Are you sure the file format is corrupted?



      • #18
        These are Illumina 1.8 pipeline files, which means they are Sanger quality encoded, so the correct Bowtie flag is --phred33-quals, which I think is the default for Bowtie anyway. Pairs look fine. My money is on a very large insert, try -X 1000 to start with and then see what the average pair distance is.
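One way to "see what the average pair distance is" could be sketched as follows, assuming the alignments were written as SAM and that column 9 (TLEN) holds the signed template length for each mate. The function name `mean_pair_distance` is hypothetical; each pair is counted once by using only the mate with a positive TLEN.

```python
def mean_pair_distance(sam_lines):
    """Return the mean positive TLEN over SAM records, or None if there are none."""
    total = 0
    count = 0
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete alignment record
        tlen = int(fields[8])
        # The leftmost mate carries the positive value, its partner the negative one.
        if tlen > 0:
            total += tlen
            count += 1
    return total / count if count else None
```

Passing an open SAM file handle works, since the function only iterates over lines. If the mean sits close to the -X limit used, the limit is probably still cutting off valid pairs.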



        • #19
          Originally posted by nickloman:
          My money is on a very large insert, try -X 1000 to start with and then see what the average pair distance is.
          Yesss! Now it works. I've got 73% reads mapped. Looks like the issue is resolved. Thank you very much, all who contributed to this thread!



          • #20
            Paired-end Solexa data mapping with Bowtie

            Hello all,

            I have a similar problem to rebrendi's. I am mapping Illumina HiSeq reads to a reference genome with Bowtie. When I use the files before any quality filtering, 43.86% of my paired-end reads map to the reference. I then filtered by quality using the FASTX-Toolkit, which removed 29% of read 1 and 39% of read 2. When I tried to map the filtered reads to the same reference genome, 16 million reads paired up (according to flagstat), but only ~600 mapped. Mapped individually, over 80% of the reads in each file align to the reference, but as pairs almost nothing does. Increasing the maximum insert size (-X) from 600 to 2000 raised the number of mapped reads to ~3000, but this is still a long way from the millions of reads I expect. I ran a Bioanalyzer on my samples and the fragment sizes should be around 500 bp, so I did not expect a dramatic increase in mapped reads from a larger insert size. Does anyone have recommendations on how to tweak Bowtie's parameters to get better mapping? Thanks!!



            • #21
              Does your quality trimming also discard sequences if they are getting too short? In this case the sequence-by-sequence order which is required for paired-end alignments might have gotten out of sync, and this could well explain such a dramatic drop in mapping efficiency.
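A rough way to test whether the two files are still in sync, assuming plain four-line FASTQ records and read names that differ only by a trailing /1 or /2 or by a second space-separated field. The names `fragment_name` and `first_mismatch` are hypothetical helpers for illustration.

```python
import itertools

def fragment_name(header):
    """Read name shared by both mates: first token, minus any /1 or /2 suffix."""
    parts = header.split()
    token = parts[0] if parts else ""
    if token.endswith("/1") or token.endswith("/2"):
        token = token[:-2]
    return token

def first_mismatch(fastq1_lines, fastq2_lines):
    """Return the 0-based record index of the first out-of-sync pair, or None."""
    # Every fourth line, starting with the first, is a header line.
    headers1 = itertools.islice(fastq1_lines, 0, None, 4)
    headers2 = itertools.islice(fastq2_lines, 0, None, 4)
    pairs = itertools.zip_longest(headers1, headers2)
    for index, (h1, h2) in enumerate(pairs):
        # A missing header means one file ran out of records before the other.
        if h1 is None or h2 is None:
            return index
        if fragment_name(h1) != fragment_name(h2):
            return index
    return None
```

A return value of None means every record in file 1 lines up with the record at the same position in file 2; any integer points at the first record where the order has drifted.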

              Alternatively, could it be that your fragment length is not as long as you expect? It is quite common for e.g. 2x100bp reads to completely overlap each other, like this:

              ------------------------------------> read 1
              <----------------------------------- read 2

              If reads are completely contained within each other, Bowtie 1 will regard the alignment as invalid (which is arguably not the most sensible thing to do...). To find out whether this is the case here you could either hard-trim all your sequences by 1bp on the 3' end, so that the reads do not start and end at the same position, like so:

              ----------------------------------->. read 1
              .<----------------------------------- read 2

              Or soft-trim by using the option --trim3 1.
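The hard-trim variant could be sketched like this, assuming plain four-line FASTQ records in which the sequence and quality strings each sit on a single line. The function name `trim_3prime` is hypothetical.

```python
def trim_3prime(fastq_lines, n=1):
    """Yield FASTQ lines with n bases removed from the 3' end of every read."""
    for index, line in enumerate(fastq_lines):
        line = line.rstrip("\n")
        # Lines 2 and 4 of each record hold the sequence and its qualities;
        # both must be shortened by the same amount to stay consistent.
        if index % 4 in (1, 3) and n > 0:
            line = line[:-n]
        yield line
```

Because each file stores its reads 5' to 3', applying the same function to the read 1 file and the read 2 file trims the 3' end of both mates.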

              Good luck!



              • #22
                Thanks for the post, fkrueger. After my initial poor mapping, I was afraid that the QC had disrupted the order of the sequences, so I ran a method that ensures that even if an entire sequence is discarded due to poor quality, the rest of its record is not deleted; the two files therefore still have the same number of lines and stay in sync. Taking your suggestion, I looked over the first few lines of both files and the reads are not reverse complements of each other, which leads me to believe that they are unlikely to be overlapping sequences.
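The reverse-complement check could be run over many pairs rather than the first few lines. The sketch below is an assumption-laden illustration: it takes (read 1, read 2) sequence pairs and counts only exact matches, so sequencing errors or partial overlaps would go undetected. The names `reverse_complement` and `fraction_fully_overlapping` are hypothetical.

```python
# Translation table mapping each base to its complement (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def fraction_fully_overlapping(pairs):
    """Fraction of (read1, read2) pairs where read 1 equals revcomp(read 2)."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    hits = 0
    for read1, read2 in pairs:
        # Compare case-insensitively so soft-masked bases still match.
        if read1.upper() == reverse_complement(read2).upper():
            hits += 1
    return hits / len(pairs)
```

A fraction near zero on a large sample is consistent with the mates not being fully contained in each other, although a mismatch-tolerant comparison would be needed to rule it out.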
