SEQanswers


gene coder 12-22-2011 05:19 PM

Reverse engineering BAM files: BAM -> FASTQ
 
How can I extract the reads from a BAM file and write them to a FASTQ file for simulation (maq simutrain, then maq simulate)?

Should I just extract columns 1, 10, and 11 (QNAME, SEQ, and QUAL) from the BAM file and put them in a text file along with a '+' line? That is the format read simulators output.

But aren't the DNA sequence and the base-quality string stored reverse-complemented and reversed, respectively, when a read aligns to the reverse strand of the reference genome? If so, I would have to undo that too.

raonyguimaraes 12-22-2011 05:57 PM

http://biostar.stackexchange.com/que...g-bam-to-fastq

:)

Richard Finney 12-22-2011 06:39 PM

Check out http://seqanswers.com/forums/showthread.php?t=16395 for bampe2fq.c and bamse2fq.c, fast implementations of BAM-to-FASTQ programs. They handle your concerns about reverse complementing the sequence and reversing the quality string.
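
For illustration, here is a minimal Python sketch of the strand handling discussed above. It is not the code from bampe2fq.c or bamse2fq.c. It assumes the BAM has already been decoded to SAM text lines (for example with samtools view), and the function names are hypothetical. It only restores read orientation using flag 0x10; it does not add /1 and /2 suffixes for mates or skip secondary alignments.

```python
# Sketch: convert one SAM text line to a FASTQ record, restoring the
# original read orientation. Names here are illustrative, not from any tool.

REVERSE_STRAND = 0x10  # SAM flag bit: read is aligned to the reverse strand

# Complement table for plain nucleotides; other characters pass through unchanged.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a nucleotide string."""
    return seq.translate(_COMPLEMENT)[::-1]


def sam_line_to_fastq(line):
    """Build a 4-line FASTQ record from columns 1, 2, 10 and 11 of a SAM line."""
    fields = line.rstrip("\n").split("\t")
    name, flag, seq, qual = fields[0], int(fields[1]), fields[9], fields[10]
    if flag & REVERSE_STRAND:
        # SAM stores reverse-strand reads as they appear on the reference,
        # so undo that: reverse-complement the bases, reverse the qualities.
        seq = reverse_complement(seq)
        qual = qual[::-1]
    return "@{}\n{}\n+\n{}\n".format(name, seq, qual)


if __name__ == "__main__":
    example = "read1\t16\tchr1\t100\t60\t4M\t*\t0\t0\tAACG\tIIH#\n"
    print(sam_line_to_fastq(example), end="")
```

For the example line (flag 16, reverse strand), the record comes out with sequence CGTT and quality string #HII.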

gene coder 01-03-2012 03:42 PM

Thanks. I found that SamToFastq in Picard did the job on a chromosome of NA12878 from the 1000 Genomes Project.

1. I separated SE reads and PE reads by library into separate BAM files using SAMtools.
2. For PE reads, I also had to remove reads that were unmapped (bit 3, 0x4), whose mate was unmapped (bit 4, 0x8), or that were not properly paired (bit 2, 0x2, not set).
3. I further separated the BAM files by read length.
4. Each BAM file now contained only SE or only PE reads, all of the same read length and from the same library.
5. Now I could run SamToFastq to convert each BAM file to a FASTQ file, which maq simutrain takes as input to produce its DAT file.

Those were the steps I needed to get what I wanted.
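
As a rough sketch of steps 1 to 4, the filtering and splitting could look like the Python below. This is not the actual SAMtools workflow used in the post. It assumes SAM text lines as input, and it uses the per-read RG:Z: tag as a stand-in for the library (mapping a read group to its LB library name would need the @RG header lines, which this sketch ignores). All function names are hypothetical.

```python
from collections import defaultdict

# SAM flag bits (bit 1 = 0x1, bit 2 = 0x2, bit 3 = 0x4, bit 4 = 0x8)
PAIRED = 0x1
PROPER_PAIR = 0x2
UNMAPPED = 0x4
MATE_UNMAPPED = 0x8


def keep_pair_read(flag):
    """True for a paired read that is mapped, has a mapped mate,
    and is flagged as part of a proper pair."""
    return (bool(flag & PAIRED)
            and bool(flag & PROPER_PAIR)
            and not flag & (UNMAPPED | MATE_UNMAPPED))


def read_group(fields):
    """Return the RG:Z: tag value of a split SAM line, or None if absent."""
    for tag in fields[11:]:
        if tag.startswith("RG:Z:"):
            return tag[5:]
    return None


def partition(sam_lines):
    """Group alignment lines by (SE or PE, read group, read length)."""
    bins = defaultdict(list)
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        paired = bool(flag & PAIRED)
        if paired and not keep_pair_read(flag):
            continue  # step 2: drop PE reads that fail the flag checks
        key = ("PE" if paired else "SE", read_group(fields), len(fields[9]))
        bins[key].append(line)
    return bins
```

Each resulting bin corresponds to one of the per-library, per-read-length files described above. With SAMtools itself, the flag checks in step 2 can be expressed with the -f and -F options of samtools view (for example -f 2 -F 12).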

