Unconfigured Ad

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts
  • mvijayen
    Member
    • Jun 2013
    • 15

    Samtools flagstat 0+0 mapped

    I can't seem to figure out why 0% is mapped. The bam file was obtained after aligning paired-end fastq files against my custom fasta reference file using 'bwa mem'.

    samtools flagstat sample.bam
    51174 + 0 in total (QC-passed reads + QC-failed reads)
    0 + 0 duplicates
    0 + 0 mapped (0.00%:nan%)
    51174 + 0 paired in sequencing
    25587 + 0 read1
    25587 + 0 read2
    0 + 0 properly paired (0.00%:nan%)
    0 + 0 with itself and mate mapped
    0 + 0 singletons (0.00%:nan%)
    0 + 0 with mate mapped to a different chr
    0 + 0 with mate mapped to a different chr (mapQ>=5)

    Any pointers will be much appreciated. Thanks!
  • dpryan
    Devon Ryan
    • Jul 2011
    • 3478

    #2
    Apparently none of the reads aligned. Flagstat is a pretty simple program, so it's unlikely to have made a mistake.

    Comment

    • GenoMax
      Senior Member
      • Feb 2008
      • 7142

      #3
      Are your reads longer than 70 bp (there are not a lot)? Have you tried the bwasw option?

      Comment

      • mvijayen
        Member
        • Jun 2013
        • 15

        #4
        Yes, they are longer than 70 which is why I did not use bwa sw. Reads are 252 in length and my reference is slightly shorter than that.

        Comment

        • GenoMax
          Senior Member
          • Feb 2008
          • 7142

          #5
          Other options to try:

          Subread aligner: http://subread.sourceforge.net/
          BFAST: http://sourceforge.net/projects/bfast/files/
          Perhaps plain BLAT:http://genome.ucsc.edu/FAQ/FAQblat.html

          Comment

          • mvijayen
            Member
            • Jun 2013
            • 15

            #6
            I guess the part that I fail to mention is that I have BS-seq data.

            Comment

            • dpryan
              Devon Ryan
              • Jul 2011
              • 3478

              #7
              You can't hope to map BS-seq reads with bwa mem (or any other standard aligner, unless you have ~100% methylation in a read). Try bismark or bison.

              Comment

              • mvijayen
                Member
                • Jun 2013
                • 15

                #8
                When I do a BLAST of my sequence against my custom reference sequence, I get a "max score" of 91.5 and looking at the alignment, the mismatches are between C's and T's (which makes sense and tells me that I probably have the correct sequence). However, when I run Bismark using my custom reference, I get 0% methylation, but when I run it with the chromosome fasta file as reference, I get percentage methylation which just puzzles me!! Also, I get methylation percentages when I run my sequences as single-ended but not as paired-end....which again puzzles me!! And it appears that Bismark outputs an "overall" methylation as opposed to methylation at each CpG?? Thank you very much for all the help thus far!

                Comment

                • dpryan
                  Devon Ryan
                  • Jul 2011
                  • 3478

                  #9
                  It might be helpful if you posted some example reads and your custom reference. Bismark comes with a methylation extractor that extracts the per-CpG (or per C) metrics from the alignments (the bit printed to the screen at the end of the alignment step is just meant to give an overview). If you still have the alignments produced by Bismark, you might just run the methylation extractor (have it output a bedGraph file, which is usually more useful in my experience).

                  BTW, what's the difference between your custom reference and the chromosome fasta file? Is this a targeted bisulfite sequencing experiment and you just extracted the targeted regions from a given chromosome into a separate fasta file or something else?

                  I should note that if the methylation metrics from Bismark don't make any sense (presumably you have some background expectations of what they should be), then I can have a look if you post some reads and the relevant information about your reference (alternatively, you can just send me a private message and we can connect more easily via email, after which I'll update this thread so someone coming to this later knows what the issue was).

                  BTW, Bismark and Bison have a default maximum insert size that could, given your read lengths, be too small for your data. That would explain why you can get alignments when you treat things as single-ended. Edit: another possibility is that default library type isn't appropriate for your data.
                  Last edited by dpryan; 01-15-2014, 11:16 AM.

                  Comment

                  • GenoMax
                    Senior Member
                    • Feb 2008
                    • 7142

                    #10
                    Devon: The "reference" in this case is a small fragment (< 250 bp) per #4.

                    Comment

                    • dpryan
                      Devon Ryan
                      • Jul 2011
                      • 3478

                      #11
                      Originally posted by GenoMax View Post
                      Devon: The "reference" in this case is a small fragment (< 250 bp) per #4.
                      Ah, not sure how I missed that!

                      I expect it's the end-to-end alignment that's not working, then. That's the default for Bison and the only option for Bismark. Using local alignment (as is done by Blast) might solve the problem. Bison allows that, but I have a feeling that the methylation calls produced are wrong (this is on my list of things to test and fix for the next release). This could prove to be a good test dataset for that.

                      Alternatives would be to (1) expand the reference so that it's longer or (2) trim the reads so that they're smaller. Option 1 would be better than 2, but just using local alignment would be the better long-term solution.

                      @mvijayen: Keep in mind that aligning to a subset of a reference can have some downsides. Namely, no targeting is perfect, so you likely have reads not originating from this area. So your error rate will be higher (how much of an issue this is will depend on the region and method used).

                      Comment

                      • dpryan
                        Devon Ryan
                        • Jul 2011
                        • 3478

                        #12
                        Just to update the rest of the forum, mvijayen and I conferred off-forum and it turns out that it was indeed the end-to-end alignment that was screwing things up. After using example data and a reference sent by mvijayen to test things, Bison now properly supports local alignment, which seems to produce quit good results with this sort of dataset. Example usage is:

                        Code:
                        mpiexec -n 5 bison_herd --local -g ./ -1 sample_1.fq -2 sample_2.fq

                        Comment

                        • mvijayen
                          Member
                          • Jun 2013
                          • 15

                          #13
                          @dpryan: thank you for taking a look at my data.

                          Now, it appears that Bison does not work with openmpi_1.4.1?? I am getting the following error:
                          You're MPI implementation doesn't support MPI_THREAD_MULTIPLE, which is required for bison_herd to work.
                          --------------------------------------------------------------------------
                          mpiexec has exited due to process rank 0 with PID 22022 on
                          node helium-login-0-2.local exiting without calling "finalize". This may
                          have caused other processes in the application to be
                          terminated by signals sent by mpiexec (as reported here).
                          --------------------------------------------------------------------------

                          Secondly, when I try running it after installing mpich2,this is the error I am getting:
                          Command line:
                          mpiexec -n 5 bison_herd -g /Users/mvijayen/bison/make_install/ref.fa -o /Users/mvijayen/bison/output/ -1 /Users/mvijayen/seq_data/sample_1.fastq -2 /Users/mvijayen/seq_data/sample_2.fastq
                          Error message:
                          ./bison_herd: error while loading shared libraries: libmpi.so.0: cannot open shared object file: No such file or directory

                          ===================================================================================
                          = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
                          = EXIT CODE: 127
                          = CLEANING UP REMAINING PROCESSES
                          = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
                          ===================================================================================

                          My directory "make-install" contains both the files installed from Bison (i.e. bison_herd) and also those installed from mpich2 (i.e. mpiexec). I've read other threads that say that the error may be due to mpich2 not being on my path, but from what I can tell, it is.

                          Comment

                          • GenoMax
                            Senior Member
                            • Feb 2008
                            • 7142

                            #14
                            Verify that LD_LIBRARY_PATH has the path for the library as expected by your executable. See this thread on how to check that: http://stackoverflow.com/questions/9...aviour-in-bash

                            Comment

                            • mvijayen
                              Member
                              • Jun 2013
                              • 15

                              #15
                              Just checked LD_LIBRARY_PATH and the directory that houses my executable files, "/Users/mvijayen/bison/make_install" is present. But I can't say if that is sufficient because I don't quite follow the meaning of "expected by your executable". Also, according to the thread, when the full path was used, mpirun worked. However, when I try with the full path, I am still getting the same error. Where is even this libmpi.so file? I don't see it in any of my mpich2 folder.

                              Comment

                              Latest Articles

                              Collapse

                              • SEQadmin2
                                From Collection to Sequencing: Why Sample Preparation and Preservation Define Sequencing Data
                                by SEQadmin2


                                Data variability is still an issue in sequencing technologies despite the advances in reproducibility and accuracy of these platforms. But the problem does not originate in the sequencing itself, but in the previous steps, before the sample reaches the sequencer.


                                The first step is collection, followed by preservation and sample preparation for analysis. Most scientists overlook those steps, but not being careful might just be skewing the experiment’s results.
                                ...
                                06-02-2026, 10:05 AM
                              • SEQadmin2
                                Single-Cell Sequencing at an Inflection Point: Early Impacts of New Platforms and Emerging Trends
                                by SEQadmin2


                                With the launch of new single-cell sequencing platforms in 2026, the field stands at an exciting inflection point. This article surveys the most impactful advances in the field and discusses how they’re reshaping research in cancer, immunology, and beyond.


                                Introduction

                                Single-cell sequencing technologies have undergone remarkable advances over the past decade, transitioning from low-throughput experimental approaches to highly scalable platforms capable of...
                                05-22-2026, 06:42 AM
                              • SEQadmin2
                                Environmental Genomics in the Age of NGS: From Microbes to Conservation Strategies
                                by SEQadmin2

                                Studying ecosystems means dealing with complex, multi-species communities that are hard to observe at scale. This complexity, however, hides many important questions to be answered, from how biogeochemical cycles work and how climate change can affect species distribution to how conservation strategies can work best.


                                Genomics, particularly since the expansion of NGS, has transformed ecosystem ecology. By sequencing environmental DNA, we can now assess biodiversity without direct...
                                05-06-2026, 09:04 AM

                              ad_right_rmr

                              Collapse

                              News

                              Collapse

                              Topics Statistics Last Post
                              Started by SEQadmin2, Yesterday, 08:59 AM
                              0 responses
                              14 views
                              0 reactions
                              Last Post SEQadmin2  
                              Started by SEQadmin2, 06-02-2026, 12:03 PM
                              0 responses
                              22 views
                              0 reactions
                              Last Post SEQadmin2  
                              Started by SEQadmin2, 06-02-2026, 11:40 AM
                              0 responses
                              19 views
                              0 reactions
                              Last Post SEQadmin2  
                              Started by SEQadmin2, 05-28-2026, 11:40 AM
                              0 responses
                              32 views
                              0 reactions
                              Last Post SEQadmin2  
                              Working...