![]() |
|
![]() |
||||
Thread | Thread Starter | Forum | Replies | Last Post |
bam file and vcf wildly disagree | Mark | Bioinformatics | 0 | 05-25-2016 06:02 PM |
Best tool to view sequence alignment? | cacti | Bioinformatics | 1 | 07-31-2015 04:01 PM |
View column header in VCF file | Claudius | Bioinformatics | 0 | 07-09-2015 08:38 AM |
How to view only the variants that are present in multiple VCF files? | ronton | Bioinformatics | 2 | 10-15-2014 04:18 PM |
MEGAN view alignment problems | OllyBolly | Bioinformatics | 2 | 07-08-2013 12:29 AM |
![]() |
|
Thread Tools |
![]() |
#1 |
Member
Location: Raleigh, NC Join Date: Nov 2008
Posts: 51
|
![]()
I have a pacbio CCS fastq file that I have aligned to a small (5037 bp) reference which is part of a sequenced BAC clone. I have aligned this data with multiple aligners (minimap2 and smalt), called variants with multiple variant callers (bbmap, freebayes, and gatk) and viewed the bam file with multiple viewers (igv and tablet). All produce similar results. I will focus here on minimap2 and callvariants (bbmap), as the problem likely stems from the variant caller and seqanswers is the preferred avenue for support for bbmap.
Here is what I did: Code:
minimap2 -t 5 -ax asm20 reference.fa sample.ccs.fastq > out.sam samtools view -O BAM -o out.bam out.sam && samtools sort -o out.sort.bam out.bam && samtools index out.sort.bam callvariants.sh ploidy=1 in=out.sort.bam ref=reference.fa vcf=out.vcf Code:
1 of 3311 variants passed primary filters (0.0302%). Type Count Rate AD Depth AF Score Qual Substitutions: 0 0.0% 0.0 0.0 0.000 0.0 0.0 Deletions: 1 100.0% 619.0 1000.0 0.619 23.0 34.0 Insertions: 0 0.0% 0.0 0.0 0.000 0.0 0.0 Variation Rate: 1/5037 Homozygous: 1 100.0% Code:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT out.sort reference 1401 . Tc T 23.66 PASS SN=0;STA=1401;STO=1402;TYP=DEL;R1P=338;R1M=281;R2P=0;R2M=0;AD=619;DP=1000;MCOV=-1;PPC=0;AF=0.6190;RAF=0.6190;LS=2996497;MQS=37140;MQM=60;BQS=21455;BQM=41;EDS=956749;EDM=4467;IDS=617766;IDM=999;NVC=0;FLG=0;CED=601;HMP=6;SB=0.9997 GT:DP:AD:AF:RAF:NVC:FLG:SB:SC:PF 1:1000:619:0.6190:0.6190:0:0:0.9997:23.66:PASS Now the strange part. When I view the alignment (out.sort.bam) in IGV (or Tablet), I see at most 1-2% indels at our near this site while the rest of the sequence is largely free of indels. Why is there this large discrepancy between what is observed in an alignment viewer and what is called by the variant caller (i.e. there are many more variant reads called than are observed on the viewer? Thank you for your help Last edited by Mark; 09-28-2020 at 12:37 PM. Reason: Added [code] tags |
![]() |
![]() |
![]() |
#2 |
Member
Location: Raleigh, NC Join Date: Nov 2008
Posts: 51
|
![]()
Turns out the problem was, not surprisingly, in IGV. By default IGV will not show one base pair deletions for what it perceives as PacBio sequence. I had unchecked the "hide indels" box in the 'alignment' preferences but there is a equivalent checkbox in the ' Third Gen' preferences I had overlooked. Uncheck it and viola.
|
![]() |
![]() |
![]() |
Tags |
bbmap, freebayes, igv, minimap2, smalt |
Thread Tools | |
|
|