Go Back   SEQanswers > Bioinformatics > Bioinformatics

Similar Threads
Thread Thread Starter Forum Replies Last Post
SAM/BAM format to wiggle format pinki999 Bioinformatics 19 08-12-2015 01:35 AM
Discrepancy between Bismark SAM file and bismark methylation extractor blancha Bioinformatics 4 07-02-2015 07:08 AM
Extract subalignment from SAM/BAM Mesmer Bioinformatics 4 01-17-2015 04:00 AM
How to extract unmapped reference sequence from BAM/SAM file.? Pinal Bioinformatics 2 03-06-2014 01:40 PM
Extract aligned sequence coordinates from SAM or BAM file pirates.genome Bioinformatics 5 08-20-2012 09:06 AM

Thread Tools
Old 12-20-2017, 03:12 PM   #1
Junior Member
Location: Michigan

Join Date: Dec 2017
Posts: 1
Default Bismark - extract genomic sequence from SAM / BAM format?

Hello all,

I am working with a bisulfite-converted library and I would like to extract the original genomic sequences for my reads (in other words, I would like to convert the bisulfite sequences back to regular genomic sequences). I have been using Bismark to align my reads to a reference genome. Bismark outputs a BAM file with the following fields (copied from the user guide):

1. QNAME*(seq-ID)
2. FLAG*(this flag tries to take the strand a bisulfite read originated from into account (this is different from ordinary DNA alignment flags!))
3. RNAME*(chromosome)
4. POS*(start position)
5. MAPQ*(only calculated for Bowtie 2, always 255 for Bowtie)
10. SEQ
11. QUAL*(Phred33 scale)
12. NM-tag*(edit distance to the reference)
13. MD-tag*(base-by-base mismatches to the reference)
14. XM-tag (methylation call string)
15. XR-tag*(read conversion state for the alignment)
16. XG-tag (genome conversion state for the alignment)

Field 10 is the actual read, i.e., the bisulfite read. Field 14 specifies the methylation call at each position, so my understanding is that those fields could be used together to infer the original sequence. What I am looking for is either:

1. A way to have Bismark output the original genomic sequence (the older version actually did this - see note below),
2. A script that will use the BAM output to convert the bisulfite sequences to genomic sequences, or
3. Another bisulfite aligner that offers this functionality.

Note: The original version of Bismark actually outputs a tab-delimited text file that contains the information I want - field 7 is the "original bisulfite read sequence" and field 8 is the "equivalent genomic sequence." Bismark allows users to request this output using the --vanilla call, however, it uses the older version of Bismark, which is only compatible with bowtie1. I am getting much better alignments with the newer version which uses bowtie2 and outputs BAM files that do not contain the genomic sequence, so I would prefer not to use the --vanilla call.

Any help would be greatly appreciated.

tricia.rubi is offline   Reply With Quote
Old 12-24-2017, 01:09 PM   #2
Senior Member
Location: Cambridge, UK

Join Date: Sep 2009
Posts: 622

I suggested the option of using one of several BS-SNP callers (Bis-SNP, MethylExtract or BS-SNPer) via email, so I am hoping one of them will prove useful.
fkrueger is offline   Reply With Quote

bismark, bisulfite, bisulphite, methylation

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off

All times are GMT -8. The time now is 01:50 PM.

Powered by vBulletin® Version 3.8.9
Copyright ©2000 - 2020, vBulletin Solutions, Inc.
Single Sign On provided by vBSSO