  • malachig
    Senior Member
    • Aug 2010
    • 117

    Convert BAM file to FASTQ

    After a quick search I found these:

    Hydra
    Picard (SamToFastq)
    HudsonAlpha
    Possibly EMBOSS

    Any comments on these? Any other options for BAM-to-FASTQ conversion?

    Basically I want to recover all paired-end reads (both R1 and R2) that were fed into the alignment that produced the BAM file, whether they mapped successfully or not.
    Last edited by malachig; 09-28-2010, 01:04 PM.
  • shurjo
    Senior Member
    • Jan 2009
    • 132

    #2
    I've used Picard and it works fine for me.

    • maubp
      Peter (Biopython etc)
      • Jul 2009
      • 1544

      #3
      You may want to filter the BAM file to remove any non-primary mappings (otherwise you could get duplicate entries in the FASTQ file). The tools may do that for you.

      You may also want to append /1 and /2 to the forward and reverse read names (this information isn't currently stored in SAM/BAM format but there is a proposed tag for the read name suffix in the draft standard update).

      Also double-check that any reads mapped to the reverse strand get reverse complemented when writing the FASTQ file, since you want to recover the input sequences.

      There are also DIY approaches, for example BAM to SAM and then a Perl/Python script. I have some experimental code for Biopython to do this too.

      There was a thread on this on the samtools-help mailing list in August 2010, "BAM to fastq how?"
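
As a rough illustration of the DIY route described above, here is a minimal Python sketch that turns one SAM text line (for example, from `samtools view`) into a FASTQ record. It assumes the standard SAM flag bits (0x10 reverse strand, 0x40 first in pair, 0x80 second in pair, 0x100 secondary, 0x800 supplementary). The function name `sam_line_to_fastq` is hypothetical, and this is not the code used by any of the tools mentioned in this thread.

```python
# Sketch only: convert one SAM text line into a 4-line FASTQ record.
# Assumes SEQ and QUAL are present (not "*") and that the flag bits
# follow the SAM specification.

COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def sam_line_to_fastq(line):
    """Return a FASTQ record string, or None if the line should be skipped."""
    if line.startswith("@"):
        return None  # SAM header line
    fields = line.rstrip("\n").split("\t")
    name, flag, seq, qual = fields[0], int(fields[1]), fields[9], fields[10]

    # Skip secondary (0x100) and supplementary (0x800) records so that
    # each input read is written only once.
    if flag & 0x100 or flag & 0x800:
        return None

    # Reads aligned to the reverse strand are stored reverse complemented,
    # so undo that to recover the sequence as it came off the sequencer.
    if flag & 0x10:
        seq = seq.translate(COMPLEMENT)[::-1]
        qual = qual[::-1]

    # Append the mate suffix based on the first/second-in-pair bits.
    if flag & 0x40:
        name += "/1"
    elif flag & 0x80:
        name += "/2"

    return "@%s\n%s\n+\n%s\n" % (name, seq, qual)
```

For a reverse-strand, first-in-pair read (flag 83) with stored sequence AACG, this yields the name `r1/1` and the sequence CGTT, with the quality string reversed to match.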

      • ekg
        Member
        • Apr 2010
        • 36

        #4
        Bamtools (http://github.com/pezmaster31/bamtools) can convert BAM to FASTQ.

        bamtools convert -in file1.bam -in file2.bam ... -format fastq >reads.fq

        • ElMichael
          Member
          • Jun 2009
          • 31

          #5
          Hi,

          For BAM-to-FASTQ conversion I use Bamtools.
          But when I try to convert one of my BAM files to FASTQ, I get the following error message:
          "BGZF ERROR: read block failed - could not read data from block"
          The problem is that bamtools exits after this step. Is it possible to avoid that, for example by telling bamtools to skip such a block and continue? Or, as in Picard, is there a VALIDATION_STRINGENCY option that could be set to lenient or silent?
          Just to mention, these BAM files contain unmapped PE reads.
          Thanks
          Last edited by ElMichael; 11-15-2010, 10:44 AM.

          • KevinLam
            Senior Member
            • Nov 2009
            • 204

            #6
            On Picard, my service provider mentioned this:

            "Using picard tools directly has one significant drawback. Picard tools will read in sequence from the BAM line by line and cache it until it has both reads. Once it has both reads it will print them out and free the memory. Unfortunately this means that every read which doesn't have the pairs near each other will take memory. In the example above it took 2.5GB of memory for 120GB of sequence but this is not guaranteed and will get worse on larger builds."

            Sounds terrible to me.

            Fortunately, there's method 2:

            "You can specify samtools memory usage (it'll use temporary files) so if you sort the BAM by name prior to running picard tools on it you guarantee the reads are next to each other and picard tools will barely use any memory."
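
The memory behaviour described in the quote can be illustrated with a toy Python sketch. This is not Picard's code: `pair_reads` is a hypothetical helper that holds each read until its mate shows up and reports the peak number of reads held at once, under the assumption that every read name appears exactly twice.

```python
# Toy model of mate caching: a read waits in memory until its mate arrives.
# Assumption: each read name occurs exactly twice in the input order.

def pair_reads(names):
    """Return (pairs, peak), where peak is the most reads cached at once."""
    waiting = {}   # read name -> position of the mate seen first
    pairs = []
    peak = 0
    for index, name in enumerate(names):
        if name in waiting:
            # Mate found: emit the pair and free the cached entry.
            pairs.append((waiting.pop(name), index))
        else:
            waiting[name] = index
            peak = max(peak, len(waiting))
    return pairs, peak


# Mates far apart, as can happen in a coordinate-sorted file.
coordinate_order = ["a", "b", "c", "d", "a", "b", "c", "d"]
# Mates adjacent, as in a name-sorted file.
name_order = sorted(coordinate_order)

print(pair_reads(coordinate_order)[1])  # 4 reads held at the worst point
print(pair_reads(name_order)[1])        # 1 read held at the worst point
```

In the toy input, the scattered order needs all four first mates cached before any pair completes, while the name-sorted order never holds more than one read. That is the reason the quoted advice is to sort by name first.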



            Side question: was there anything in the original FASTQ files that one might want to keep and that can't be found in the sorted BAMs? I am inclined to keep the original FASTQ files, but data storage might be a problem for me.
            http://kevin-gattaca.blogspot.com/

            • swbarnes2
              Senior Member
              • May 2008
              • 910

              #7
              I've used Picard on .bams generated by bwa/samtools, and it definitely keeps the unmapped reads. But that's because the .bam has them. If you used an aligner that tossed them, or put them in another .bam (didn't bowtie use to do that by default?), then there's nothing any software can do about that.

              I've never tried to get them back out as paired reads. I assume that it uses the flag to know which is read 1 and which is read 2, but it might not know to order them properly. If your .bam has all the reads sorted by name, and you haven't filtered out any single reads, I bet the fastqs would be in the right order.
              Last edited by swbarnes2; 11-16-2011, 10:41 AM.

              • tsucheta
                Member
                • Nov 2009
                • 17

                #8
                Try using bam2fastq from HudsonAlpha at http://www.hudsonalpha.org/gsl/software/bam2fastq.php. It is very quick: it processed my 8 BAM files, ranging in size from 0.5 to 4 GB, in less than 10 minutes on a standard 2-core Linux machine.

                • Johnnyalive
                  Junior Member
                  • Feb 2013
                  • 1

                  #9
                  Help using bamtools

                  I'm new to this and looking for help too. When I use bamtools to convert my .bam file to FASTQ, I only get one output file. Is it possible to split paired-end reads into two output files? Can someone suggest a method?
                  Many thanks,
                  Johnny.

                  • vivek_
                    PhD Student
                    • Jul 2012
                    • 164

                    #10
                    With Picard's SamToFastq, you just specify two different output files, like:

                    java -jar picard-tools/SamToFastq.jar I=Input.bam F=seq1_1.fastq F2=seq1_2.fastq

                    You can also split these by read groups using additional command line arguments.
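
To make the read-group idea concrete, here is a small Python sketch that buckets SAM text records by their `RG:Z:` tag. It only illustrates the grouping logic and is not how Picard implements it. The helper names `read_group` and `names_by_read_group` and the `"unknown"` fallback label are assumptions made for this example.

```python
# Sketch: group read names by the RG:Z: optional tag of each SAM line.
from collections import defaultdict


def read_group(sam_line, default="unknown"):
    """Return the read group ID of a SAM record, or `default` if untagged."""
    fields = sam_line.rstrip("\n").split("\t")
    # Optional TAG:TYPE:VALUE fields start after the 11 mandatory columns.
    for field in fields[11:]:
        if field.startswith("RG:Z:"):
            return field[5:]
    return default


def names_by_read_group(sam_lines):
    """Map each read group ID to the list of read names that carry it."""
    groups = defaultdict(list)
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip SAM header lines
        groups[read_group(line)].append(line.split("\t", 1)[0])
    return dict(groups)
```

If each sequencing lane was given its own read group at alignment time, the resulting buckets correspond to the lanes, which is what makes a per-lane FASTQ export possible.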

                    • abhinay
                      Junior Member
                      • Mar 2013
                      • 2

                      #11
                      TopHat

                      The bam2fastx utility bundled with TopHat can convert BAM to FASTQ (with basic settings):

                      bam2fastx -q -Q -A -o output.fastq input.bam

                      For more control, the full usage is:

                      bam2fastx [--fasta|-a|--fastq|-q] [--color] [-Q] [--sam|-s|-t]
                      [-M|--mapped-only|-A|--all] [-o <outfile>] [-P|--paired] [-N] <in.bam>

                      Note: By default, reads flagged as not passing quality controls are
                      discarded; the -Q option can be used to ignore the QC flag.

                      Use the -N option if the /1 and /2 suffixes should be appended to
                      read names according to the SAM flags

                      • amarth
                        Member
                        • Dec 2012
                        • 14

                        #12
                        Originally posted by abhinay View Post
                        I second that

                        • nahalm63
                          Junior Member
                          • Nov 2014
                          • 1

                          #13
                          Hi, I am new here. Can anyone tell me what command you use to convert BAM files to FASTQ in Picard? Thanks.



                          • blancha
                            Senior Member
                            • May 2013
                            • 367

                            #14
                            Code:
                            java -jar /usr/local/tools/picard-tools-1.114/SamToFastq.jar \
                            VALIDATION_STRINGENCY=SILENT \
                            INPUT=HI.1965.007.Index_1.FL_K562-110k-A.bam \
                            FASTQ=HI.1965.007.Index_1.FL_K562-110k-A_R1.fastq \
                            SECOND_END_FASTQ=HI.1965.007.Index_1.FL_K562-110k-A_R2.fastq \
                            &> bamtofastq.sh.log

                            • Thorondor
                              Member
                              • Feb 2011
                              • 69

                              #15
                              Found this thread and decided to revive it.
                              Has anyone tried to get back the original FASTQ pairs (R1 and R2) from several runs that were merged into one BAM file? Alignment was done with bwa mem, merging with biobambam.
                              Three separately sequenced lanes were the input.
                              Right now I use Picard for the BAM-to-FASTQ step; are there any other feasible options?
                              And do I really get back FASTQ files that are 100% identical to the original input?
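
One way to check the "100% identical" question empirically is to compare the recovered FASTQ with the original while ignoring record order. The Python sketch below does that on lists of lines. It assumes the simple four-line FASTQ layout (no wrapped sequences, no blank lines), and the helper names `fastq_records` and `same_reads` are hypothetical.

```python
# Sketch: order-independent comparison of two FASTQ files given as lines.
from collections import Counter


def fastq_records(lines):
    """Group FASTQ lines into (name, sequence, quality) tuples."""
    stripped = [line.rstrip("\n") for line in lines]
    if len(stripped) % 4 != 0:
        raise ValueError("expected a multiple of four lines")
    return [
        (stripped[i][1:], stripped[i + 1], stripped[i + 3])  # drop leading '@'
        for i in range(0, len(stripped), 4)
    ]


def same_reads(lines_a, lines_b):
    """True if both inputs hold the same records, regardless of order."""
    return Counter(fastq_records(lines_a)) == Counter(fastq_records(lines_b))
```

With real files, something like `same_reads(open("original_R1.fastq"), open("recovered_R1.fastq"))` would apply it, where both file names are placeholders. The comparison is strict on the full header line, so anything that did not survive the round trip through the BAM would show up as a difference.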
