Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • Bismark - extract genomic sequence from SAM / BAM format?

    Hello all,

    I am working with a bisulfite-converted library and I would like to extract the original genomic sequences for my reads (in other words, I would like to convert the bisulfite sequences back to regular genomic sequences). I have been using Bismark to align my reads to a reference genome. Bismark outputs a BAM file with the following fields (copied from the user guide):

    1. QNAME*(seq-ID)
    2. FLAG*(this flag tries to take the strand a bisulfite read originated from into account (this is different from ordinary DNA alignment flags!))
    3. RNAME*(chromosome)
    4. POS*(start position)
    5. MAPQ*(only calculated for Bowtie 2, always 255 for Bowtie)
    6. CIGAR
    7. RNEXT
    8. PNEXT
    9. TLEN
    10. SEQ
    11. QUAL*(Phred33 scale)
    12. NM-tag*(edit distance to the reference)
    13. MD-tag*(base-by-base mismatches to the reference)
    14. XM-tag (methylation call string)
    15. XR-tag*(read conversion state for the alignment)
    16. XG-tag (genome conversion state for the alignment)

    Field 10 is the actual read, i.e., the bisulfite read. Field 14 specifies the methylation call at each position, so my understanding is that those fields could be used together to infer the original sequence. What I am looking for is either:

    1. A way to have Bismark output the original genomic sequence (the older version actually did this - see note below),
    2. A script that will use the BAM output to convert the bisulfite sequences to genomic sequences, or
    3. Another bisulfite aligner that offers this functionality.

    Note: The original version of Bismark actually outputs a tab-delimited text file that contains the information I want - field 7 is the "original bisulfite read sequence" and field 8 is the "equivalent genomic sequence." Bismark allows users to request this output using the --vanilla call, however, it uses the older version of Bismark, which is only compatible with bowtie1. I am getting much better alignments with the newer version which uses bowtie2 and outputs BAM files that do not contain the genomic sequence, so I would prefer not to use the --vanilla call.

    Any help would be greatly appreciated.

    Thanks,
    Tricia

  • #2
    I suggested the option of using one of several BS-SNP callers (Bis-SNP, MethylExtract or BS-SNPer) via email, so I am hoping one of them will prove useful.

    Comment

    Latest Articles

    Collapse

    • seqadmin
      Current Approaches to Protein Sequencing
      by seqadmin


      Proteins are often described as the workhorses of the cell, and identifying their sequences is key to understanding their role in biological processes and disease. Currently, the most common technique used to determine protein sequences is mass spectrometry. While still a valuable tool, mass spectrometry faces several limitations and requires a highly experienced scientist familiar with the equipment to operate it. Additionally, other proteomic methods, like affinity assays, are constrained...
      04-04-2024, 04:25 PM
    • seqadmin
      Strategies for Sequencing Challenging Samples
      by seqadmin


      Despite advancements in sequencing platforms and related sample preparation technologies, certain sample types continue to present significant challenges that can compromise sequencing results. Pedro Echave, Senior Manager of the Global Business Segment at Revvity, explained that the success of a sequencing experiment ultimately depends on the amount and integrity of the nucleic acid template (RNA or DNA) obtained from a sample. “The better the quality of the nucleic acid isolated...
      03-22-2024, 06:39 AM

    ad_right_rmr

    Collapse

    News

    Collapse

    Topics Statistics Last Post
    Started by seqadmin, 04-11-2024, 12:08 PM
    0 responses
    17 views
    0 likes
    Last Post seqadmin  
    Started by seqadmin, 04-10-2024, 10:19 PM
    0 responses
    22 views
    0 likes
    Last Post seqadmin  
    Started by seqadmin, 04-10-2024, 09:21 AM
    0 responses
    16 views
    0 likes
    Last Post seqadmin  
    Started by seqadmin, 04-04-2024, 09:00 AM
    0 responses
    46 views
    0 likes
    Last Post seqadmin  
    Working...
    X