Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • pietro2580
    replied
    Thanks for the suggestion!
    Tried to re-run the whole pipeline, starting from the BWA alignment (including the faidx indexing of the reference sequence), but no luck.
    Moreover, mpileup - bcf - varfilter is able to predict around 81% (1891 out of 2321) of the present indels correctly, and for around 15% it's able to predict the correct "region" (+- 5 nt), but does not report the correct indel.

    Seems that for these 15% indels, there are other indels reporterd in the neighborhood of the "reported" indel in the total mpileup-output (before running varFilter), and one of these other indels (not reported) seem to be the true one.
    A few lines of the complete mpileup-output is given below, the reported deletion is -GT (fourth line) while the correct one is -TT (last line). The wrong one might be selected because it has more reads supporting it (40 + 32 versus 38 + 30).

    Code:
    NC_000913       84533   .       G       .       209     .       DP=109;VDB=0.0392;AF1=0;AC1=0;DP4=52,55,0,2;MQ=48;FQ=-282;PV4=0.5,1,0.075,1     PL      0
    NC_000913       84533   .       GCT     G       54.5    .       INDEL;DP=109;VDB=0.0808;AF1=1;AC1=2;DP4=1,5,21,14;MQ=57;FQ=-63.5;PV4=0.08,0.27,1,1      GT:PL:GQ        1/1:95,29,0:55
    NC_000913       84534   .       C       .       209     .       DP=104;VDB=0.0318;AF1=0;AC1=0;DP4=50,52,1,1;MQ=48;FQ=-282;PV4=1,1,0.081,0.076   PL      0
    NC_000913       84534   .       CTGT    CT      214     .       INDEL;DP=104;VDB=0.0777;AF1=1;AC1=2;DP4=0,0,40,32;MQ=52;FQ=-252 GT:PL:GQ        1/1:255,217,0:99
    NC_000913       84535   .       T       .       207     .       DP=96;VDB=0.0413;AF1=0;AC1=0;DP4=48,47,0,1;MQ=47;FQ=-282;PV4=1,0,0.17,0.15      PL      0
    NC_000913       84536   .       G       .       205     .       DP=96;VDB=0.0639;AF1=0;AC1=0;DP4=47,45,0,0;MQ=47;FQ=-282        PL      0
    NC_000913       84536   .       GTT     G       214     .       INDEL;DP=96;VDB=0.0777;AF1=1;AC1=2;DP4=0,0,38,30;MQ=52;FQ=-240  GT:PL:GQ        1/1:255,205,0:99

    Leave a comment:


  • swbarnes2
    replied
    On possibility; I'm wondering if all your software is on the same page with regard to what the reference sequence actually is.

    You know that samtools mpileup uses an index for your fasta, and if you changed your fasta, and didn't remake this index, that might explain why the software is confused about what the sequence exactly is there.

    So, try remaking that index with samtools faidx, then rerun samtools mpileup.

    Leave a comment:


  • SAMtools mpileup prediction - few nucleotides off

    Hi all,

    I am using samtools - mpileup (version 0.1.18) to predict SNP and indels in bacterial genomes. When applying this pipeline to artificially generated Illumina data (using dwgsim), it seems to me that mpilup is able to predict the region where indels appear, but is reporting in some cases the wrong indel. In total around 2000 indels are reported, and for ~ 300 of them, a behavior like below is observed.

    E.g. on position 3.211.144, dwgsim artificially introduced a deletion of 3 nucleotides (-GAG) in the reference sequence.

    However, mpileup is giving on position 3211140 as ref TGCCG and as alternative TG, which would result in a deletion of CCG:
    ref_id 3211140 . TGCCG TG 214 . INDEL;DP=97;VDB=0.0710;AF1=1;AC1=2;DP4=0,0,24,28;MQ=39;FQ=-192 GT:PL:GQ 1/1:255,157,0:99

    However, when looking at the alignment of the genes in the BAM file (with tview), this does not seem to be correct (see attached).

    It seems to me that samtools is predicting the indel correctly, but is rather making an error in reporting, by reporting an indel with the same size a few nucleotides upstream of the real indel location. Or am I missing something?

    Thank you in advance!

    Pieter
    Attached Files

Latest Articles

Collapse

  • seqadmin
    Investigating the Gut Microbiome Through Diet and Spatial Biology
    by seqadmin




    The human gut contains trillions of microorganisms that impact digestion, immune functions, and overall health1. Despite major breakthroughs, we’re only beginning to understand the full extent of the microbiome’s influence on health and disease. Advances in next-generation sequencing and spatial biology have opened new windows into this complex environment, yet many questions remain. This article highlights two recent studies exploring how diet influences microbial...
    02-24-2025, 06:31 AM

ad_right_rmr

Collapse

News

Collapse

Topics Statistics Last Post
Started by seqadmin, 03-03-2025, 01:15 PM
0 responses
160 views
0 likes
Last Post seqadmin  
Started by seqadmin, 02-28-2025, 12:58 PM
0 responses
247 views
0 likes
Last Post seqadmin  
Started by seqadmin, 02-24-2025, 02:48 PM
0 responses
619 views
0 likes
Last Post seqadmin  
Started by seqadmin, 02-21-2025, 02:46 PM
0 responses
265 views
0 likes
Last Post seqadmin  
Working...
X