  • Read mapped in proper pair (0x2) flag in SMALT aligner

    Hi,

    I used SMALT to align MiSeq paired-end (PE) reads to a single-stranded, unsegmented viral reference genome. When I looked at the results, I found that some pairs were not flagged as "read mapped in proper pair" (0x2), although they look like proper pairs to me.

    Below are two pairs that I think should be proper pairs but did not get the 0x2 flag. Reads were adapter- and quality-trimmed with Trimmomatic before being aligned with SMALT, so read lengths differ.

    SMALT command:
    smalt map -x -y 0.7 -j 0 -i 2000 -o sample_X.sam ref_index read_1.fastq read_2.fastq

    First one:
    MISEQ:1:1101:2325:10343 81 ref 5130 29 2S210M = 5130 210
    GCAGCCAAGCACAAAACCACGTCCAAAAAATCCACCAAAAAAAGATGATTACCATTTTGAAGTGTTCA
    ACTTTGTTCCCTGTAGTATATGTGGCAACAATCAACTCTGCAAATCCATTTGCAAAACAATACCAAGC
    AACAAACCAAAAAAAAAACCAACTACAAAACCCACAAACAAACCACCCACCAAAACCACAAACAAAAG
    AGACCCCA

    5GE6>99C9C=,8@BCGGFB6:>EECFCF:F7FFCF<:9FGFFDGD@FECEC,FCCECFB9GFCCFDC
    <=FFD<EFGGGGEAFFF<?,B,<EFFGCFFD@ADFGFEEA9GGFAGFFCE<@GFF9EFF<C5,FC,9E
    C<EE7GGGEGGGGGGFD,GFF6EFFGGFCCGGFEFFCEGEGGGGFGFDCGFF@GFGGGGGGFFDDGGF
    9GFCCCCC
    NM:i:1 AS:i:207

    MISEQ:1:1101:2325:10343 161 ref 5130 29 3S159M = 5130 -210
    GGCAGCCAAGCACAAAACCACGTCCAAAAAATCCACCAAAAAAAGATGATTACCATTTTGAAGTGTTC
    AACTTTGTTCCCTGTAGTATATGTGGCAACAATCAACTCTGCAAATCCATTTGCAAAACAATACCAAG
    CAACAAACCAAAAAAAAAACCAACTA

    CCCCCGGGGG<FGGGGGGDGGGCCFGGGGGGFGFGDFGGGGGGGGGECFE,EDCFFFAECEGFCFACF
    GGDEF,@FFDEFFCFGGGGGGF9F<A=EFCFGF8FFFCFCCDFGGG=FGGCFGGCEFCGEFDFF8FG<
    FFG:FFF<FEGGGGGGGGGC8EG8,>
    NM:i:1 AS:i:15


    Second one:
    MISEQ:1:1101:2472:14296 81 ref 4225 29 4S260M = 4225 -260
    GCGGGGGGTAAATAGATATCAGTTAGAGTTTAACCAATCTTAACAACCATCTATACCGCCAATCCAAT
    ACATACATTGCAAATCTTAAAATGGGAAACACATCCATCACAATAGAATTCACAAGCAAATTTTGGCC
    CTATTTTACACTAATATATATGATCCTAACTCTAATCTCTTTACTAATTATAATCACCATTATGATTG
    CAATACTAAATAAGCTAAGTGAACATAAAATATTCTGCAACAAGACTCTTGAACAAGGAC

    >9970F7FGGGGGGGGGGGFGEGGDFGED8GGFGFCGGFAGF8EEF?GGGGFFEGE=??GGFGGGGGE
    EGGGGGDFGGDGFGFFEGGGGGGECFFFCGGGGFGFFGGF9GFGGGFGFFGGFGFAAF8GF<@FB<CF
    FGGGGGFGGFDFGGGGGGGF@FC<CGGGGFADGGFCGFAAEDGGGGGGGGGGGDGGGGGGGGGGGFGG
    GFDGGFAGGGGF@DGGGFGFCFFFEFDFAFAFEAGGGGGGGGEECGFF<AGGGGGCCCCC
    NM:i:1 AS:i:257

    MISEQ:1:1101:2472:14296 161 ref 4225 29 4S260M = 4225 260
    GCGGGGGGTAAATAGATATCAGTTAGAGTTTAACCAATCTTAACAACCATCTATACCGCCAATCCAAT
    ACATACATTGCAAATCTTAAAATGGGAAACACATCCATCACAATAGAATTCACAAGCAAATTTTGGCC
    CTATTTTACACTAATATATATGATCCTAACTCTAATCTCTTTACTAATTATAATCACCATTATGATTG
    CAATACTAAATAAGCTAAGTGAACATAAAATATTCTGCAACAAGACTCTTGAACAAGGAC

    CCCCCGGGGFGFGFGFFG,CFGGFGGGGDFFFFE?FGGGFGAFBFGFGGGGGGGGF9FFGGGGGCGGF
    GGGGGGGGF9FFGFGGGGGGGCFGG8FFFGGGGGGCFFGGGF9=FEGGGGFGFFGCBCFGFGB,@FFG
    FGGGGGEF?@FFFGEGGFCFGG9ECCFGGG;9;FFGGGGG9F9CFG?C9CFGFFGCF??9CGE?GGCE
    FFGGGGGGGGGGCFCEFF9C*0:FGC7CFBGGGGFFFFFCD555CFFFFF*=?FFFF>?F
    NM:i:1 AS:i:257
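
To compare where the mates in the records above sit on the reference, it may help to compute each alignment's end coordinate from its POS and CIGAR fields. The snippet below is a minimal sketch that assumes standard SAM CIGAR semantics (only M, D, N, = and X consume reference bases); the helper name `reference_end` is made up for illustration and is not part of SMALT or any library.

```python
import re

# CIGAR operations that advance along the reference (per the SAM specification).
REF_CONSUMING = set("MDN=X")


def reference_end(pos, cigar):
    """Return the 1-based, inclusive reference coordinate of the last aligned base.

    Hypothetical helper for illustration; soft clips (S) do not consume reference.
    """
    span = sum(
        int(length)
        for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)
        if op in REF_CONSUMING
    )
    return pos + span - 1


# POS and CIGAR values copied from the records listed above.
for pos, cigar in [(5130, "2S210M"), (5130, "3S159M"), (4225, "4S260M")]:
    print(pos, cigar, reference_end(pos, cigar))
```

For the first pair this gives reference spans of 5130-5339 and 5130-5288; both mates of the second pair span 4225-4484.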


    Flag 81: read paired (0x1), read reverse strand (0x10), first in pair (0x40)
    Flag 161: read paired (0x1), mate reverse strand (0x20), second in pair (0x80)
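
The flag breakdown above can be double-checked with a few lines of Python. This is a minimal sketch using the bit definitions from the SAM specification; `decode_flag` is a hypothetical helper name, not a function from SMALT or samtools.

```python
# SAM flag bits and their meanings, as defined in the SAM specification.
SAM_FLAGS = {
    0x1: "read paired",
    0x2: "read mapped in proper pair",
    0x4: "read unmapped",
    0x8: "mate unmapped",
    0x10: "read reverse strand",
    0x20: "mate reverse strand",
    0x40: "first in pair",
    0x80: "second in pair",
    0x100: "not primary alignment",
    0x200: "read fails quality checks",
    0x400: "PCR or optical duplicate",
    0x800: "supplementary alignment",
}


def decode_flag(flag):
    """Return the names of all bits set in a SAM flag, lowest bit first."""
    return [name for bit, name in SAM_FLAGS.items() if flag & bit]


for flag in (81, 161):
    print(flag, decode_flag(flag))
```

Neither 81 nor 161 has bit 0x2 set; the same pairs flagged as proper would carry 83 and 163 instead.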

    Does anyone have any idea why they were not flagged as proper pairs? I have not managed to figure it out, so any thoughts would be greatly appreciated.

    Best,
    Michael
    Last edited by PBMS; 12-31-2018, 05:51 AM.
