Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • Samtools flagstat 0+0 mapped

    I can't seem to figure out why 0% is mapped. The bam file was obtained after aligning paired-end fastq files against my custom fasta reference file using 'bwa mem'.

    samtools flagstat sample.bam
    51174 + 0 in total (QC-passed reads + QC-failed reads)
    0 + 0 duplicates
    0 + 0 mapped (0.00%:nan%)
    51174 + 0 paired in sequencing
    25587 + 0 read1
    25587 + 0 read2
    0 + 0 properly paired (0.00%:nan%)
    0 + 0 with itself and mate mapped
    0 + 0 singletons (0.00%:nan%)
    0 + 0 with mate mapped to a different chr
    0 + 0 with mate mapped to a different chr (mapQ>=5)

    Any pointers will be much appreciated. Thanks!

  • #2
    Apparently none of the reads aligned. Flagstat is a pretty simple program, so it's unlikely to have made a mistake.

    Comment


    • #3
      Are your reads longer than 70 bp (there are not a lot)? Have you tried the bwasw option?

      Comment


      • #4
        Yes, they are longer than 70 which is why I did not use bwa sw. Reads are 252 in length and my reference is slightly shorter than that.

        Comment


        • #5
          Other options to try:

          Subread aligner: http://subread.sourceforge.net/
          BFAST: http://sourceforge.net/projects/bfast/files/
          Perhaps plain BLAT:http://genome.ucsc.edu/FAQ/FAQblat.html

          Comment


          • #6
            I guess the part that I fail to mention is that I have BS-seq data.

            Comment


            • #7
              You can't hope to map BS-seq reads with bwa mem (or any other standard aligner, unless you have ~100% methylation in a read). Try bismark or bison.

              Comment


              • #8
                When I do a BLAST of my sequence against my custom reference sequence, I get a "max score" of 91.5 and looking at the alignment, the mismatches are between C's and T's (which makes sense and tells me that I probably have the correct sequence). However, when I run Bismark using my custom reference, I get 0% methylation, but when I run it with the chromosome fasta file as reference, I get percentage methylation which just puzzles me!! Also, I get methylation percentages when I run my sequences as single-ended but not as paired-end....which again puzzles me!! And it appears that Bismark outputs an "overall" methylation as opposed to methylation at each CpG?? Thank you very much for all the help thus far!

                Comment


                • #9
                  It might be helpful if you posted some example reads and your custom reference. Bismark comes with a methylation extractor that extracts the per-CpG (or per C) metrics from the alignments (the bit printed to the screen at the end of the alignment step is just meant to give an overview). If you still have the alignments produced by Bismark, you might just run the methylation extractor (have it output a bedGraph file, which is usually more useful in my experience).

                  BTW, what's the difference between your custom reference and the chromosome fasta file? Is this a targeted bisulfite sequencing experiment and you just extracted the targeted regions from a given chromosome into a separate fasta file or something else?

                  I should note that if the methylation metrics from Bismark don't make any sense (presumably you have some background expectations of what they should be), then I can have a look if you post some reads and the relevant information about your reference (alternatively, you can just send me a private message and we can connect more easily via email, after which I'll update this thread so someone coming to this later knows what the issue was).

                  BTW, Bismark and Bison have a default maximum insert size that could, given your read lengths, be too small for your data. That would explain why you can get alignments when you treat things as single-ended. Edit: another possibility is that default library type isn't appropriate for your data.
                  Last edited by dpryan; 01-15-2014, 11:16 AM.

                  Comment


                  • #10
                    Devon: The "reference" in this case is a small fragment (< 250 bp) per #4.

                    Comment


                    • #11
                      Originally posted by GenoMax View Post
                      Devon: The "reference" in this case is a small fragment (< 250 bp) per #4.
                      Ah, not sure how I missed that!

                      I expect it's the end-to-end alignment that's not working, then. That's the default for Bison and the only option for Bismark. Using local alignment (as is done by Blast) might solve the problem. Bison allows that, but I have a feeling that the methylation calls produced are wrong (this is on my list of things to test and fix for the next release). This could prove to be a good test dataset for that.

                      Alternatives would be to (1) expand the reference so that it's longer or (2) trim the reads so that they're smaller. Option 1 would be better than 2, but just using local alignment would be the better long-term solution.

                      @mvijayen: Keep in mind that aligning to a subset of a reference can have some downsides. Namely, no targeting is perfect, so you likely have reads not originating from this area. So your error rate will be higher (how much of an issue this is will depend on the region and method used).

                      Comment


                      • #12
                        Just to update the rest of the forum, mvijayen and I conferred off-forum and it turns out that it was indeed the end-to-end alignment that was screwing things up. After using example data and a reference sent by mvijayen to test things, Bison now properly supports local alignment, which seems to produce quit good results with this sort of dataset. Example usage is:

                        Code:
                        mpiexec -n 5 bison_herd --local -g ./ -1 sample_1.fq -2 sample_2.fq

                        Comment


                        • #13
                          @dpryan: thank you for taking a look at my data.

                          Now, it appears that Bison does not work with openmpi_1.4.1?? I am getting the following error:
                          You're MPI implementation doesn't support MPI_THREAD_MULTIPLE, which is required for bison_herd to work.
                          --------------------------------------------------------------------------
                          mpiexec has exited due to process rank 0 with PID 22022 on
                          node helium-login-0-2.local exiting without calling "finalize". This may
                          have caused other processes in the application to be
                          terminated by signals sent by mpiexec (as reported here).
                          --------------------------------------------------------------------------

                          Secondly, when I try running it after installing mpich2,this is the error I am getting:
                          Command line:
                          mpiexec -n 5 bison_herd -g /Users/mvijayen/bison/make_install/ref.fa -o /Users/mvijayen/bison/output/ -1 /Users/mvijayen/seq_data/sample_1.fastq -2 /Users/mvijayen/seq_data/sample_2.fastq
                          Error message:
                          ./bison_herd: error while loading shared libraries: libmpi.so.0: cannot open shared object file: No such file or directory

                          ===================================================================================
                          = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
                          = EXIT CODE: 127
                          = CLEANING UP REMAINING PROCESSES
                          = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
                          ===================================================================================

                          My directory "make-install" contains both the files installed from Bison (i.e. bison_herd) and also those installed from mpich2 (i.e. mpiexec). I've read other threads that say that the error may be due to mpich2 not being on my path, but from what I can tell, it is.

                          Comment


                          • #14
                            Verify that LD_LIBRARY_PATH has the path for the library as expected by your executable. See this thread on how to check that: http://stackoverflow.com/questions/9...aviour-in-bash

                            Comment


                            • #15
                              Just checked LD_LIBRARY_PATH and the directory that houses my executable files, "/Users/mvijayen/bison/make_install" is present. But I can't say if that is sufficient because I don't quite follow the meaning of "expected by your executable". Also, according to the thread, when the full path was used, mpirun worked. However, when I try with the full path, I am still getting the same error. Where is even this libmpi.so file? I don't see it in any of my mpich2 folder.

                              Comment

                              Latest Articles

                              Collapse

                              • seqadmin
                                Current Approaches to Protein Sequencing
                                by seqadmin


                                Proteins are often described as the workhorses of the cell, and identifying their sequences is key to understanding their role in biological processes and disease. Currently, the most common technique used to determine protein sequences is mass spectrometry. While still a valuable tool, mass spectrometry faces several limitations and requires a highly experienced scientist familiar with the equipment to operate it. Additionally, other proteomic methods, like affinity assays, are constrained...
                                04-04-2024, 04:25 PM
                              • seqadmin
                                Strategies for Sequencing Challenging Samples
                                by seqadmin


                                Despite advancements in sequencing platforms and related sample preparation technologies, certain sample types continue to present significant challenges that can compromise sequencing results. Pedro Echave, Senior Manager of the Global Business Segment at Revvity, explained that the success of a sequencing experiment ultimately depends on the amount and integrity of the nucleic acid template (RNA or DNA) obtained from a sample. “The better the quality of the nucleic acid isolated...
                                03-22-2024, 06:39 AM

                              ad_right_rmr

                              Collapse

                              News

                              Collapse

                              Topics Statistics Last Post
                              Started by seqadmin, 04-11-2024, 12:08 PM
                              0 responses
                              22 views
                              0 likes
                              Last Post seqadmin  
                              Started by seqadmin, 04-10-2024, 10:19 PM
                              0 responses
                              24 views
                              0 likes
                              Last Post seqadmin  
                              Started by seqadmin, 04-10-2024, 09:21 AM
                              0 responses
                              20 views
                              0 likes
                              Last Post seqadmin  
                              Started by seqadmin, 04-04-2024, 09:00 AM
                              0 responses
                              52 views
                              0 likes
                              Last Post seqadmin  
                              Working...
                              X