SEQanswers

Old 03-06-2012, 12:21 AM   #21
gtyrelle
Member
 
Location: the Netherlands

Join Date: Feb 2011
Posts: 16

The paper linked in my previous comment is a good starting point for understanding our assembly approach. To repeat: our assembly pipeline is not implemented in cgatools, and cgatools does not call variants. The assembly pipeline is developed and run in-house on sequenced genomes, and the results (small variant calls, CNVs, SVs, MEIs) are provided to customers. In other words, you don't need to "call variants" from CG data using open source software, because the variant calls are made for customers as part of the CG service data deliverables.

Currently there is no open source software that will provide variant calls of equivalent quality from CG data. My advice would be to find the var and masterVar files (containing SNPs, indels and substitutions) for the CG genome assemblies you are working with and start from there.
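For readers who want to start from the var or masterVar files, here is a minimal sketch of a reader. It assumes the commonly described layout of these files: metadata lines starting with '#', one column header line starting with '>', then tab-separated rows. The function name `read_cg_table`, the column names and the toy row are illustrative only; check the header line of your own file for the real columns.

```python
import io

def read_cg_table(handle):
    """Yield one dict per data row of a Complete Genomics style TSV.

    Assumed layout (not verified against any particular file version):
    '#' metadata lines, then a column header line beginning with '>',
    then tab-separated data rows.
    """
    columns = None
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and metadata
        if line.startswith(">"):
            columns = line[1:].split("\t")  # column names follow the '>'
            continue
        if columns is None:
            raise ValueError("data row found before the '>' column header")
        yield dict(zip(columns, line.split("\t")))

# Toy input with illustrative column names and a made-up row.
toy = (
    "#EXAMPLE_KEY\texample value\n"
    "\n"
    ">locus\tchromosome\tbegin\tend\tvarType\n"
    "1\tchr14\t100\t101\tsnp\n"
)
rows = list(read_cg_table(io.StringIO(toy)))
print(rows[0]["chromosome"], rows[0]["varType"])
```

For a real file, pass an open file handle instead of the `io.StringIO` object.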
__________________
Bioinformatics Applications, Europe
Lifetech Inc. http://www.lifetech.com/
Old 03-06-2012, 01:23 AM   #22
ritzriya
Member
 
Location: Canada

Join Date: Jun 2010
Posts: 49

Thank you gtyrelle for the clear guidance about the CG data.
Old 03-06-2012, 03:25 AM   #23
ritzriya
Member
 
Location: Canada

Join Date: Jun 2010
Posts: 49

Hi gtyrelle,

I understand that variant calls are part of the CG deliverables. However, I also noticed that the SAM file produced by the evidence2sam command contains records where the same read appears at more than one location, such as:

Quote:
GS27657-FS3-L02-8:1660459 179 chr14 19089795 16 12M1I1P4I6M6N5M1I4M = 19090178 383 CCTAATTCTTATTTTTATTTTTTTATTTATTTT 9::::656877887;<<<:::<;6-47737783 RG:Z:NA19238-L2-200-37-ASM-chr14 GC:Z:3S2G28S GS:Z:AAAA GQ:Z:::4-
GS27657-FS3-L02-8:1660459 115 chr14 19090178 16 10M5N23M = 19089795 -383 TTCATGAGAGGGTCCACTATTTTTCCCTTGTTA .08877587857;*;1<<9778877871;;::7 RG:Z:NA19238-L2-200-37-ASM-chr14 GC:Z:28S2G3S GS:Z:TGTG GQ:Z:77;;
Here, the read 'GS27657-FS3-L02-8:1660459' appears twice on the same chromosome (chr14), at two nearby locations. Is there a way to get only unique mappings from evidence2sam, i.e. each read mapped to exactly one location in the genome?
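A quick look at the FLAG column (the second field) may help with this question. The sketch below uses only the flag bit definitions from the SAM specification; `decode_flag` is a hypothetical helper name. For the two quoted records, flags 179 and 115 decode as the second and first read of one pair, and neither is marked secondary, which suggests the two lines are mates of a pair rather than one read mapped twice.

```python
# Standard SAM FLAG bits, as defined in the SAM specification.
FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}

def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]

# The two FLAG values from the quoted records.
for flag in (179, 115):
    print(flag, decode_flag(flag))
```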

I ask because this causes a problem in downstream analysis with GATK. GATK claims to handle CG data and find polymorphisms in it, but it produced the following error on more than one test dataset I have tried:

Quote:
ERROR MESSAGE: SAM/BAM file SAMFileReader{CGA_test/originalSAM/output_sorted.bam} is malformed: Adjacent I/D events in read GS27657-FS3-L02-8:1660459
Please let me know whether you think this is feasible. I only want to understand how the output of a public tool differs from the calls made by CG themselves. I hope my concern is clear. Thanks in advance.
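The error message names adjacent I/D events. In the first quoted record the CIGAR contains `1I1P4I`, i.e. two insertions separated only by a padding operation. The sketch below flags such records. It is not GATK's actual check: treating padding as invisible is an assumption, and `parse_cigar` and `has_adjacent_indels` are hypothetical helper names.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def parse_cigar(cigar):
    """Split a CIGAR string into (length, op) tuples."""
    if cigar == "*":
        return []  # CIGAR unavailable
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Guard against characters the regex silently skipped.
    if "".join(f"{n}{op}" for n, op in ops) != cigar:
        raise ValueError(f"unparseable CIGAR: {cigar!r}")
    return ops

def has_adjacent_indels(cigar, ignore_padding=True):
    """True if two I/D operations are neighbours in the CIGAR.

    With ignore_padding=True (an assumption about what the downstream
    tool does), P operations are dropped before comparing neighbours.
    """
    ops = [op for _, op in parse_cigar(cigar)]
    if ignore_padding:
        ops = [op for op in ops if op != "P"]
    return any(a in "ID" and b in "ID" for a, b in zip(ops, ops[1:]))

# CIGAR strings taken from the two quoted records.
print(has_adjacent_indels("12M1I1P4I6M6N5M1I4M"))  # True
print(has_adjacent_indels("10M5N23M"))             # False
```

Running this over the sixth column of every alignment line would show how many records are affected before deciding whether to filter or rewrite them.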
Old 03-07-2012, 12:45 AM   #24
gtyrelle
Member
 
Location: the Netherlands

Join Date: Feb 2011
Posts: 16

Hi ritzriya,

I was going to ask if you could repost this question in the Complete Genomics forum but I see you have already done that, thank you! I'll post my reply to the new thread.

Greg
Old 03-12-2012, 04:24 AM   #25
nora
Member
 
Location: Germany

Join Date: Dec 2009
Posts: 11

Quote:
Originally Posted by ritzriya View Post
I converted the SAM file using the awk command mentioned above. But for downstream analysis I have to convert it into a BAM file using samtools view. When I do the conversion, I get the following error, nora:
How can I resolve this?
Sorry ritzriya, I forgot to mention that you need to save the header of your file to a separate file and then concatenate the two (e.g. with the cat command) after awk filtering.
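The same idea, keeping the header while filtering only the alignment lines, can be sketched in a single pass. The awk filter itself is not reproduced in this part of the thread, so `no_padding` below is a hypothetical stand-in predicate and the toy records (including the `LN` value) are made up for illustration.

```python
def filter_sam_keep_header(lines, keep):
    """Yield header lines unchanged, plus alignment lines where keep(fields) is true."""
    for line in lines:
        if line.startswith("@"):
            yield line  # SAM header lines start with '@'
        else:
            fields = line.rstrip("\n").split("\t")
            if keep(fields):
                yield line

def no_padding(fields):
    # Hypothetical stand-in for the awk filter: drop records whose
    # CIGAR (6th column) contains a padding operation.
    return "P" not in fields[5]

# Toy SAM content, made up for illustration.
toy_sam = [
    "@HD\tVN:1.0\tSO:coordinate\n",
    "@SQ\tSN:chr14\tLN:1000\n",
    "r1\t0\tchr14\t100\t16\t10M\t*\t0\t0\tACGTACGTAC\t::::::::::\n",
    "r2\t0\tchr14\t200\t16\t4M1I1P1I4M\t*\t0\t0\tACGTACGTAC\t::::::::::\n",
]
filtered = list(filter_sam_keep_header(toy_sam, no_padding))
print("".join(filtered), end="")
```

Because the header lines pass through untouched, the output can be handed straight to samtools view for BAM conversion without a separate concatenation step.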
Old 03-12-2012, 04:50 AM   #26
ritzriya
Member
 
Location: Canada

Join Date: Jun 2010
Posts: 49

Quote:
Originally Posted by nora View Post
Sorry ritzriya, I forgot to mention that you need to save the header of your file to a separate file and then concatenate the two (e.g. with the cat command) after awk filtering.
Oops, thank you. I saw that the new SAM file had no header. Now I am able to create the BAM file successfully. Thanks again, nora.

Although the SAM to BAM conversion now succeeds, I am back to square one! I am getting the same error that I started this thread with:

Quote:
$./samtools mpileup -uf hg1to24.fa CGA/output_sorted.bam | bcftools/bcftools view -vcg - > mpileup_view.bcf
[mpileup] 1 samples in 1 input files
<mpileup> Set max per-file depth to 8000
samtools: bam_pileup.c:112: resolve_cigar2: Assertion `s->k < c->n_cigar' failed.
[afs] 0:13637.821 1:1240.324 2:329.855
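The assertion compares an index against the number of CIGAR operations, so it appears to fire while samtools walks a read's CIGAR. Whether the padded CIGARs discussed earlier are the cause is not established in this thread. One inexpensive thing to rule out is a mismatch between CIGAR and SEQ: the SAM specification requires the summed lengths of the M, I, S, = and X operations to equal the length of SEQ. A minimal sketch, with hypothetical helper names:

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
QUERY_OPS = set("MIS=X")  # operations that consume bases of SEQ (SAM specification)

def cigar_query_length(cigar):
    """Number of SEQ bases the CIGAR accounts for."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in QUERY_OPS)

def inconsistent_records(lines):
    """Yield (qname, cigar, seq_len) where CIGAR and SEQ lengths disagree."""
    for line in lines:
        if line.startswith("@"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        qname, cigar, seq = fields[0], fields[5], fields[9]
        if cigar == "*" or seq == "*":
            continue  # nothing to compare
        if cigar_query_length(cigar) != len(seq):
            yield qname, cigar, len(seq)

# The two CIGAR/SEQ pairs quoted earlier in the thread are consistent:
print(cigar_query_length("12M1I1P4I6M6N5M1I4M"),
      len("CCTAATTCTTATTTTTATTTTTTTATTTATTTT"))
print(cigar_query_length("10M5N23M"),
      len("TTCATGAGAGGGTCCACTATTTTTCCCTTGTTA"))
```

If this reports no records for the whole file, the lengths agree and the cause lies elsewhere, for example in how the samtools version in use handles particular CIGAR operations.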

Last edited by ritzriya; 03-12-2012 at 05:06 AM. Reason: Error not resolved!
Old 10-31-2012, 09:42 AM   #27
Richard Finney
Senior Member
 
Location: bethesda

Join Date: Feb 2009
Posts: 700

Why not just fix the CGA pipeline so that the BAM files are generated correctly?
Old 02-15-2013, 09:57 AM   #28
chemic91
Junior Member
 
Location: Pennsylvania

Join Date: Feb 2013
Posts: 1

Quote:
Originally Posted by ritzriya View Post
Oops, thank you. I saw that the new SAM file had no header. Now I am able to create the BAM file successfully. Thanks again, nora.

Although the SAM to BAM conversion now succeeds, I am back to square one! I am getting the same error that I started this thread with:
ritzriya - Did you ever figure out a workaround for this? I made the above changes and am also getting the same error again.

Last edited by chemic91; 02-15-2013 at 10:04 AM.

Tags
cigar, complete genomics, mpileup, samtools

All times are GMT -8.