  • samtools fixmate appears to be doing nothing

    I need to filter reads in BAM alignments by mapping quality and also remove reads that overlap certain regions of the reference sequence. The filtering steps work fine, but I'm having trouble with the downstream processing.

    The input is paired-end reads, but after one of those filtering steps I often have only one read of a pair left because its mate was filtered out. I would then like to modify the flags to correctly reflect the fact that this read is now a singleton without a mate.

    I thought this is exactly what the fixmate command in samtools is supposed to do but it doesn't seem to do anything for me.

    I name-sort the BAM files and then run

    Code:
    samtools fixmate in.namesorted.bam out.bam
    But I can see absolutely no change to the singleton reads at all.

    Here is an example: the tail of the name-sorted BAM file before and after running it through fixmate. The last read in the file (read10512) is now a singleton that used to have a mate. Running fixmate has changed nothing, and its flag (147) still marks it as a paired read.

    Name-sorted BAM before fixmate
    Code:
    read9955        99      100001  1907    24      150M    =       2207    450     CCATATGATGTTCCAGATTATGCATATCCATATGATGTTCCAGATTATGCATAAGGGCCCCGTTTTTCTTACTTATATATTTATACCAATTGATTGTATTTATAACTGTAAAAATGTGTATGTTGTGTGCATATTTTTTTTTGTGCATGC   IFHDGFDHEIEFGFFHDGEIFGDFEFDDIIDGDHIDIEIEEEIIEIEDHDDIEIGEFIEIFGGGIDHIDGHDEDIDEHGIDEFGEFIEGIIIFFGHGHFGEEHFEDIIIHDEDEIGGFGHIIIGEEGGIDEDHEDFDEGGGEGDFHGFFE   NM:i:0  AS:i:150
    read9955        147     100001  2207    22      150M    =       1907    -450    ATGGAACATAATATGTTTGAAACAATAAGACAAAATTATTATTATTATTATTATTTTTACTGTTATAATTATGTTGTCTCTTCAATGATTCATAAATAGTTGGACTTGATTTTTAAAATGTTTATAATATGATTAGCATAGTTAAATAAA   HGFIIHGEFDEHDDIDEIIDIHDHHFHDEHHIFGEIEHDDGHFIHHEEIGIFHEEEFFFDFGEEIGDGDIDGGIEHGEHGGEEIEDIDGFIDEGDEIDEHFGEHEEGFHGDFDFEFDHGEEIHDGHDGGEEFIGFHHFHHHHHEFEIHGE   NM:i:0  AS:i:150
    read10095       99      100001  1891    24      150M    =       2191    450     TCCAGATTATGCATATCCATATGATGTTCCAGATTATGCATATCCATATGATGTTCCAGATTATGCATAAGGGCCCCGTTTTTCTTACTTATATATTTATACCAATTGATTGTATTTATAACTGTAAAAATGTGTATGTTGTGTGCATAT   IHIHDIIEGGHGDDEEEHHFHDIIHFDEEGFGFGDFHDGDDFHFGIHGGDEEIGHIIDFIFHHIGDDHFIIIFHGEDDIDGFFFDGIDEEHFFGDFGGDIDEDIHIGDDEFDEDDHDHHHFDFFDFGHFFFIGIFFIEGDGHHFHGDDGH   NM:i:0  AS:i:150
    read10095       147     100001  2191    22      150M    =       1891    -450    AAAAATTTAGATATTTATGGAACATAATATGTTTGAAACAATAAGACAAAATTATTATTATTATTATTATTTTTACTGTTATAATTATGTTGTCTCTTCAATGATTCATAAATAGTTGGACTTGATTTTTAAAATGTTTATAATATGATT   IGGHHIIHGEGFHFFHIHHHGGFEIFEDHHEIDGHIGFDDDGFIEGEDEGHHDIIEHEFHDIIDDHDHIEDHGEHFIIHDHHIEHEEHHHGFHGGFGGIIFDFGIIGDFHEDHGDDDFHFFFGDIDIIGFEFGHEEHFFHGEEEDDIIDF   NM:i:0  AS:i:150
    read10512       147     100001  2171    22      150M    =       1871    -450    TATGAATTTATGACCATATTAAAAATTTAGATATTTATGGAACATAATATGTTTGAAACAATAAGACAAAATTATTATTATTATTATTATTTTTACTGTTATAATTATGTTGTCTCTTCAATGATTCATAAATAGTTGGACTTGATTTTT   IDEGHFDIHIEEGGEDIDGGDHDHIDFDDHDHGGDGDEGDHEIHHDDEEGIEFFGGGIDHHHDFGEDEHGGIDHFFEGGIHFEGIIIGEFDFIFEEIHIHDGDHEEFEDEHIIDDEHDIIDDGDDDDHGFEDIGDEFFEGDEDIDDEIDG   NM:i:0  AS:i:150
    After fixmate
    Code:
    read9955        99      100001  1907    24      150M    =       2207    450     CCATATGATGTTCCAGATTATGCATATCCATATGATGTTCCAGATTATGCATAAGGGCCCCGTTTTTCTTACTTATATATTTATACCAATTGATTGTATTTATAACTGTAAAAATGTGTATGTTGTGTGCATATTTTTTTTTGTGCATGC   IFHDGFDHEIEFGFFHDGEIFGDFEFDDIIDGDHIDIEIEEEIIEIEDHDDIEIGEFIEIFGGGIDHIDGHDEDIDEHGIDEFGEFIEGIIIFFGHGHFGEEHFEDIIIHDEDEIGGFGHIIIGEEGGIDEDHEDFDEGGGEGDFHGFFE   NM:i:0  AS:i:150        CT:Z:1F150M150T2R150M
    read9955        147     100001  2207    22      150M    =       1907    -450    ATGGAACATAATATGTTTGAAACAATAAGACAAAATTATTATTATTATTATTATTTTTACTGTTATAATTATGTTGTCTCTTCAATGATTCATAAATAGTTGGACTTGATTTTTAAAATGTTTATAATATGATTAGCATAGTTAAATAAA   HGFIIHGEFDEHDDIDEIIDIHDHHFHDEHHIFGEIEHDDGHFIHHEEIGIFHEEEFFFDFGEEIGDGDIDGGIEHGEHGGEEIEDIDGFIDEGDEIDEHFGEHEEGFHGDFDFEFDHGEEIHDGHDGGEEFIGFHHFHHHHHEFEIHGE   NM:i:0  AS:i:150
    read10095       99      100001  1891    24      150M    =       2191    450     TCCAGATTATGCATATCCATATGATGTTCCAGATTATGCATATCCATATGATGTTCCAGATTATGCATAAGGGCCCCGTTTTTCTTACTTATATATTTATACCAATTGATTGTATTTATAACTGTAAAAATGTGTATGTTGTGTGCATAT   IHIHDIIEGGHGDDEEEHHFHDIIHFDEEGFGFGDFHDGDDFHFGIHGGDEEIGHIIDFIFHHIGDDHFIIIFHGEDDIDGFFFDGIDEEHFFGDFGGDIDEDIHIGDDEFDEDDHDHHHFDFFDFGHFFFIGIFFIEGDGHHFHGDDGH   NM:i:0  AS:i:150        CT:Z:1F150M150T2R150M
    read10095       147     100001  2191    22      150M    =       1891    -450    AAAAATTTAGATATTTATGGAACATAATATGTTTGAAACAATAAGACAAAATTATTATTATTATTATTATTTTTACTGTTATAATTATGTTGTCTCTTCAATGATTCATAAATAGTTGGACTTGATTTTTAAAATGTTTATAATATGATT   IGGHHIIHGEGFHFFHIHHHGGFEIFEDHHEIDGHIGFDDDGFIEGEDEGHHDIIEHEFHDIIDDHDHIEDHGEHFIIHDHHIEHEEHHHGFHGGFGGIIFDFGIIGDFHEDHGDDDFHFFFGDIDIIGFEFGHEEHFFHGEEEDDIIDF   NM:i:0  AS:i:150
    read10512       147     100001  2171    22      150M    =       1871    -450    TATGAATTTATGACCATATTAAAAATTTAGATATTTATGGAACATAATATGTTTGAAACAATAAGACAAAATTATTATTATTATTATTATTTTTACTGTTATAATTATGTTGTCTCTTCAATGATTCATAAATAGTTGGACTTGATTTTT   IDEGHFDIHIEEGGEDIDGGDHDHIDFDDHDHGGDGDEGDHEIHHDDEEGIEFFGGGIDHHHDFGEDEHGGIDHFFEGGIHFEGIIIGEFDFIFEEIHIHDGDHEEFEDEHIIDDEHDIIDDGDDDDHGFEDIGDEFFEGDEDIDDEIDG   NM:i:0  AS:i:150
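
To make the FLAG values in the records above easier to read, here is a minimal Python sketch that decodes a FLAG into its named bits. The bit values are the ones defined in the SAM specification; the names `SAM_FLAG_BITS` and `decode_flag` are made up for this example and are not part of samtools.

```python
# SAM FLAG bits as defined in the SAM specification.
# The short names are labels chosen for this example.
SAM_FLAG_BITS = [
    (0x1, "PAIRED"),
    (0x2, "PROPER_PAIR"),
    (0x4, "UNMAP"),
    (0x8, "MUNMAP"),
    (0x10, "REVERSE"),
    (0x20, "MREVERSE"),
    (0x40, "READ1"),
    (0x80, "READ2"),
    (0x100, "SECONDARY"),
    (0x200, "QCFAIL"),
    (0x400, "DUP"),
    (0x800, "SUPPLEMENTARY"),
]


def decode_flag(flag):
    """Return the names of all bits that are set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAG_BITS if flag & bit]


# The two FLAG values that occur in the records above.
for flag in (99, 147):
    print(flag, ",".join(decode_flag(flag)))
# 99 PAIRED,PROPER_PAIR,MREVERSE,READ1
# 147 PAIRED,PROPER_PAIR,REVERSE,READ2
```

Both 99 and 147 include PAIRED and PROPER_PAIR, so the lone read10512 record (flag 147) is still labelled as the second read of a proper pair after fixmate.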

    The manual for samtools is a bit short on details about what exactly this tool is supposed to do, other than to "fill in mate-related flags". I'm not sure what that means. Can anybody shed some light on this or recommend a different way of doing it?
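
One possible workaround, offered as a sketch under assumptions rather than as what fixmate is meant to do, would be to rewrite the flags directly on the SAM text. The Python below assumes name-sorted, tab-delimited SAM lines (for example from `samtools view -h`), no secondary or supplementary alignments, and that a singleton should become a plain unpaired read: it clears bits 0x1, 0x2, 0x8, 0x20, 0x40 and 0x80 and resets RNEXT, PNEXT and TLEN. Other pipelines may prefer to keep 0x1 and set 0x8 (mate unmapped) instead. `PAIR_BITS` and `unpair_singletons` are hypothetical names.

```python
from itertools import groupby

# Bits that only make sense when a mate is present (SAM specification):
# 0x1 paired, 0x2 proper pair, 0x8 mate unmapped, 0x20 mate reverse,
# 0x40 first in pair, 0x80 second in pair.
PAIR_BITS = 0x1 | 0x2 | 0x8 | 0x20 | 0x40 | 0x80


def unpair_singletons(sam_lines):
    """Yield SAM lines, rewriting reads whose mate is missing as unpaired.

    Assumption: the input is name-sorted, so records that share a QNAME
    are adjacent, and there are no secondary/supplementary alignments.
    """
    def key(line):
        # Header lines get a key of None so they are never treated as reads.
        return None if line.startswith("@") else line.split("\t", 1)[0]

    for name, group in groupby(sam_lines, key=key):
        records = list(group)
        if name is None or len(records) != 1:
            # Headers and complete pairs pass through unchanged.
            yield from records
            continue
        line = records[0]
        ending = "\n" if line.endswith("\n") else ""
        fields = line.rstrip("\n").split("\t")
        fields[1] = str(int(fields[1]) & ~PAIR_BITS)  # FLAG without pair bits
        fields[6] = "*"  # RNEXT: no mate reference
        fields[7] = "0"  # PNEXT: no mate position
        fields[8] = "0"  # TLEN: no template length
        yield "\t".join(fields) + ending
```

With this sketch the read10512 record would go from flag 147 to flag 16 (reverse strand, unpaired), while the complete read9955 and read10095 pairs would be left as they are.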


    EDIT
    ======================

    I just noticed that fixmate has actually made a change, but it only appears to have added some optional tags in the last column of the SAM file. I had expected it to work on the FLAG field, but maybe that's not what it is meant to do at all(?)
    Last edited by tospo; 12-10-2013, 06:37 AM. Reason: additional information
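
To check which fields fixmate actually touched, a small comparison like the following can help. This is only a sketch: it assumes tab-delimited SAM lines for the same read before and after fixmate, and `SAM_FIELDS` and `diff_sam_records` are names invented for the example.

```python
# The eleven mandatory SAM columns, in order (SAM specification).
SAM_FIELDS = ["QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
              "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL"]


def diff_sam_records(before, after):
    """Return (field, old, new) tuples for every difference between two SAM lines.

    Optional tags are matched by their two-character tag name; a tag that
    exists on only one side is reported with None for the missing value.
    """
    b = before.rstrip("\n").split("\t")
    a = after.rstrip("\n").split("\t")

    # Compare the mandatory columns position by position.
    changes = [(name, old, new)
               for name, old, new in zip(SAM_FIELDS, b, a)
               if old != new]

    # Compare the optional TAG:TYPE:VALUE columns by tag name.
    def tags(fields):
        return {tag[:2]: tag for tag in fields[11:]}

    tags_before, tags_after = tags(b), tags(a)
    for tag in sorted(set(tags_before) | set(tags_after)):
        if tags_before.get(tag) != tags_after.get(tag):
            changes.append((tag, tags_before.get(tag), tags_after.get(tag)))
    return changes
```

Applied to the first read9955 record before and after fixmate, this should report only the added CT:Z tag and no change to FLAG.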
