  • qixiaofei
    Junior Member
    • Sep 2011
    • 4

    The 'S' in the CIGAR string of a SAM file (bwa)

    The S means "soft clipping (clipped sequences present in SEQ)".
    But I saw a CIGAR example, "72M28S" (4 mismatches), where there is actually only one mismatch in the 28 soft-clipped bases. Why isn't the result of aln 100M (5 mismatches)? I have seen many similar cases. Can anyone help me? Thank you.
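
To make the soft-clipping question concrete, here is a minimal Python sketch that splits a CIGAR string into its operations and reports how many read bases are soft-clipped versus aligned. The function names `parse_cigar` and `summarize` are hypothetical helpers, not part of bwa or any library; which operations consume read or reference bases follows the SAM specification.

```python
import re

# Operations that consume bases of the read (SEQ), per the SAM specification.
CONSUMES_QUERY = set("MIS=X")
# Operations that consume bases of the reference.
CONSUMES_REFERENCE = set("MDN=X")


def parse_cigar(cigar):
    """Split a CIGAR string such as '28S72M' into (length, op) pairs."""
    pairs = re.findall(r"(\d+)([MIDNSHP=X])", cigar)
    # Reject strings that contain anything besides length/op pairs.
    if "".join(n + op for n, op in pairs) != cigar:
        raise ValueError(f"malformed CIGAR: {cigar!r}")
    return [(int(n), op) for n, op in pairs]


def summarize(cigar):
    """Count read bases, reference bases, and soft-clipped bases."""
    ops = parse_cigar(cigar)
    return {
        "query_len": sum(n for n, op in ops if op in CONSUMES_QUERY),
        "ref_len": sum(n for n, op in ops if op in CONSUMES_REFERENCE),
        "soft_clipped": sum(n for n, op in ops if op == "S"),
    }


print(summarize("28S72M"))
# {'query_len': 100, 'ref_len': 72, 'soft_clipped': 28}
```

Soft-clipped bases stay in SEQ but are not part of the alignment, so they contribute nothing to the reference span or to the mismatch count reported for the record.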
  • nilshomer
    Nils Homer
    • Nov 2008
    • 1283

    #2
    Can you show us the read and alignment?


    • qixiaofei
      Junior Member
      • Sep 2011
      • 4

      #3
      B80FGAABXX:6:23:6777:40030#0 83 chr10 42383349 10 28S72M = 42383248 -173 AATCAGATGGAATCATCGAATGGACTTGAATGGAATCGTTGAATGGACTCGAATGGAATCATTATTGAATGGAATTGAATAGAATCATCGAATGGTCTCG ??A??ED=E:ECD=?EDD:AEEEEEDFDGEDGFGEDFGEGEEGDGGBGGEDGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGEGGGGGGGGGGGGGGG XT:A:M NM:i:5 SM:i:10 AM:i:10 XM:i:5 XO:i:0 XG:i:0 MD:Z:9A2A39G5T2C10
      B80FGAABXX:6:25:6662:82018#0 147 chr10 42383349 10 30S70M = 42383248 -171 GGAATCAGATGGAATCATCGAATGGACTGGAATGGAATCATTGAATGGACTCGAAAGGGATCATTATTGAATGGAATTGAATGGAAGCATCGAATGGTCT DEFEAGFEGFFEF?BGGG?GDFFGEEGDFFGEGFDGGGGGGGGFGGGGFGGGEGGGGGGFEGGGGGGFGGGGGGGFEGGGGGGGGGGGGGGGGGGGGGFG XT:A:M NM:i:6 SM:i:10 AM:i:10 XM:i:6 XO:i:0 XG:i:0 MD:Z:12A12T2A27T1T2C8

      Here are two lines. The reference sequence just before position 42383349 is ggaatcagatggaatcatcgaatggacttt (30 bp); only the last 2 bp mismatch the second read.
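
The claim about the second read can be checked directly. The sketch below compares the 30 soft-clipped bases of that read with the 30 bp of reference quoted in the post. It assumes the quoted reference is accurate and that no indel is involved; `mismatch_positions` is a hypothetical helper. SAM stores SEQ in reference orientation even for reverse-strand reads (flag 147), so no reverse complement is needed.

```python
def mismatch_positions(read, ref):
    """Return 0-based positions where two equal-length sequences differ."""
    if len(read) != len(ref):
        raise ValueError("sequences must have the same length")
    return [
        i
        for i, (a, b) in enumerate(zip(read.upper(), ref.upper()))
        if a != b
    ]


# SEQ of the second record (CIGAR 30S70M), copied from the post.
seq = (
    "GGAATCAGATGGAATCATCGAATGGACTGGAATGGAATCATTGAATGGAC"
    "TCGAAAGGGATCATTATTGAATGGAATTGAATGGAAGCATCGAATGGTCT"
)
# Reference immediately upstream of POS 42383349, as quoted in the post.
ref_before_pos = "ggaatcagatggaatcatcgaatggacttt"

clipped = seq[:30]  # the 30 bases covered by the leading 30S
print(mismatch_positions(clipped, ref_before_pos))
# [28, 29]
```

Under these assumptions, an ungapped end-to-end 100M alignment would carry the 6 mismatches already reported in XM plus these 2, for 8 in total.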


      • nilshomer
        Nils Homer
        • Nov 2008
        • 1283

        #4
        I agree. Looking at BLAT, it seems as though this read is quite repetitive too, and it is not annotated as such. It would be great to post this to the bwa mailing list to file a bug.

        What is your mismatch tolerance (show the command line)? BWA (short) will try to align a prefix of the read, and it may be hitting its mismatch/indel tolerance limit.


        • qixiaofei
          Junior Member
          • Sep 2011
          • 4

          #5
          I only have the results and don't know the command line. But there is "XM:i:22" on another line of the SAM file, whereas the examples I gave before have only "XM:i:5" and "XM:i:6", respectively. I am a beginner and don't know the principles of bwa. Why does this happen?
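
When only the SAM output is available, the optional tags are the main clue. The following sketch parses the tags of the first record posted above and cross-checks the MD string against the CIGAR. `parse_tags` and `md_summary` are hypothetical helpers; the XT meanings are the ones listed in the bwa manual, where `M` stands for "Mate-sw" (the read was placed by Smith-Waterman alignment next to its mate). That such an alignment explains the soft clipping and the high mismatch counts is a hypothesis, not something the records prove.

```python
import re

# XT values as listed in the bwa manual.
XT_MEANING = {"U": "Unique", "R": "Repeat", "N": "N", "M": "Mate-sw"}


def parse_tags(fields):
    """Turn ['XT:A:M', 'NM:i:5', ...] into {'XT': 'M', 'NM': 5, ...}."""
    tags = {}
    for field in fields:
        name, typ, value = field.split(":", 2)
        tags[name] = int(value) if typ == "i" else value
    return tags


def md_summary(md):
    """Return (mismatches, reference_length) for an MD string.

    Assumption: the alignment has no deletions ('^' entries), which holds
    for the posted records because XO and XG are both 0.
    """
    if "^" in md:
        raise ValueError("deletions are not handled in this sketch")
    matched = sum(int(n) for n in re.findall(r"\d+", md))
    mismatched = len(re.findall(r"[A-Za-z]", md))
    return mismatched, matched + mismatched


fields = "XT:A:M NM:i:5 SM:i:10 AM:i:10 XM:i:5 XO:i:0 XG:i:0 MD:Z:9A2A39G5T2C10"
tags = parse_tags(fields.split())

print(XT_MEANING[tags["XT"]])   # Mate-sw
print(md_summary(tags["MD"]))   # (5, 72)
```

The MD string accounts for 5 mismatches over 72 reference bases, which agrees with XM:i:5 and the 72M part of the CIGAR; the soft-clipped bases are not counted.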


          • nilshomer
            Nils Homer
            • Nov 2008
            • 1283

            #6
            You will have to play with it yourself and read the paper. Unfortunately, it is beyond the scope of this forum for me to try to debug and explain it to you. Can you get the command line?


            • qixiaofei
              Junior Member
              • Sep 2011
              • 4

              #7
              I'm sorry for replying so late.
              The command line is as follows:
              bwa aln -n 3 -o 1 -e 15 -i 5 -l 32 -t 4
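
For reference, a small sketch that splits the posted command into its options and attaches the meaning given in the bwa manual for `bwa aln`. The helper `parse_aln_options` is hypothetical, and the command is taken as posted, without the reference and read file arguments.

```python
import shlex

# Option meanings paraphrased from the bwa manual for `bwa aln`.
ALN_OPTIONS = {
    "-n": "maximum edit distance (integer) or missing-alignment fraction (float)",
    "-o": "maximum number of gap opens",
    "-e": "maximum number of gap extensions",
    "-i": "disallow an indel within this many bp of the read ends",
    "-l": "seed length (the first INT bases of the read)",
    "-t": "number of threads",
}


def parse_aln_options(command):
    """Return {option: numeric value} for a `bwa aln` command line."""
    tokens = shlex.split(command)
    if tokens[:2] != ["bwa", "aln"]:
        raise ValueError("expected a 'bwa aln' command")
    rest = tokens[2:]
    options = {}
    i = 0
    # Consume '-x value' pairs; stop at the first positional argument.
    while i + 1 < len(rest) and rest[i].startswith("-"):
        value = rest[i + 1]
        options[rest[i]] = float(value) if "." in value else int(value)
        i += 2
    return options


options = parse_aln_options("bwa aln -n 3 -o 1 -e 15 -i 5 -l 32 -t 4")
for flag, value in options.items():
    print(f"{flag} {value}: {ALN_OPTIONS.get(flag, 'not described in this sketch')}")
```

With `-n 3`, aln itself should accept at most 3 differences per read, while the posted records report NM:i:5 and NM:i:6. That is consistent with those records having been produced by a later step than aln, as their XT:A:M tag suggests, though the thread does not confirm it.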
