SEQanswers

Old 06-29-2011, 09:35 PM   #1
Michael.James.Clark
Senior Member
 
Location: Palo Alto

Join Date: Apr 2009
Posts: 213
Nature Methods AOP: CREST maps somatic structural variation in cancer genomes...

http://www.nature.com/nmeth/journal/...meth.1628.html

Yet another SV detection program. Of course, this one claims to be the best.

I think this may be best used in conjunction with algorithms that rely on discordant paired reads, like my own BreakWay or BreakDancer, with other approaches like BreakSeq, and with Pindel. They do seem to make a case that the best results come from looking for events that are identified by multiple methods.

I also want to point out that some breakpoints have been identified to the nucleotide level in next-gen data in the past. I think the statement about it in the paper is a bit misleading, actually, but perhaps it's true for the couple of examples they give. Still, it's not the first time it's been done, nor is CREST the only program capable of it (though it may be the best at it, and they make a strong case for that).

Regardless, an interesting paper and yet another tool to implement in the murky world of SV detection!
__________________
Mendelian Disorder: A blogshare of random useful information for general public consumption. [Blog]
Breakway: A Program to Identify Structural Variations in Genomic Data [Website] [Forum Post]
Projects: U87MG whole genome sequence [Website] [Paper]
Old 03-08-2012, 08:37 AM   #2
aquinom85
Research Bioinformaticist
 
Location: Boston

Join Date: Dec 2011
Posts: 19

Have you used it on your own data at all?

After almost a full day of installing and getting all the dependencies running, I was able to use their sample files, but when I tried to generate the inputs with extractSClip.pl I came back the next morning to a nice
[bam_header_read] EOF marker is absent. The input is probably truncated.
[bam_header_read] invalid BAM binary header (this is not a BAM file).

but the header looked fine to me:
aquinom@ubuntu:~/Crest$ samtools view -H ~/.gvfs/workorders\ on\ kcifs03/0135/seqdata/SS6002602/Assembly/genome/bam/SS6002602.bam
@HD VN:1.0 SO:coordinate
@PG ID:CASAVA VN:CASAVA-1.8.0a19 CL:/illumina/development/casava/CASAVA-1.8.0a19_NMNM_BAMFIX/bin/configureBuild.pl --targets all bam --inSampleDir=../Aligned/B0205ACXX/Sample_SS6002602 --outDir=/isilon/RUO/Projects/Knome/SS6002602/Assembly --samtoolsRefFile=/isilon/Genomes/FASTA_UCSC/HumanNCBI37_UCSC/HumanNCBI37_UCSC_XY.fa --jobsLimit=40 --variantsPrintUsedAlleleCounts --variantsWriteRealigned --sortKeepAllReads --bamChangeChromLabels=OFF --sgeQueue=all.q --tempDir=/state/partition1
@SQ SN:chr1 LN:249250621
@SQ SN:chr2 LN:243199373
@SQ SN:chr3 LN:198022430
@SQ SN:chr4 LN:191154276
@SQ SN:chr5 LN:180915260
@SQ SN:chr6 LN:171115067
@SQ SN:chr7 LN:159138663
@SQ SN:chrX LN:155270560
@SQ SN:chr8 LN:146364022
@SQ SN:chr9 LN:141213431
@SQ SN:chr10 LN:135534747
@SQ SN:chr11 LN:135006516
@SQ SN:chr12 LN:133851895
@SQ SN:chr13 LN:115169878
@SQ SN:chr14 LN:107349540
@SQ SN:chr15 LN:102531392
@SQ SN:chr16 LN:90354753
@SQ SN:chr17 LN:81195210
@SQ SN:chr18 LN:78077248
@SQ SN:chr20 LN:63025520
@SQ SN:chrY LN:59373566
@SQ SN:chr19 LN:59128983
@SQ SN:chr22 LN:51304566
@SQ SN:chr21 LN:48129895
@SQ SN:chrM LN:16571
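
A note on the error above: `samtools view -H` only reads the start of the file, so a clean header does not rule out a truncated tail. The "EOF marker is absent" message refers to the 28-byte empty BGZF block that the SAM/BAM specification puts at the end of a complete BAM. A minimal sketch for checking it directly, assuming the file is readable as a regular file from wherever it is mounted (the function name `has_bgzf_eof` is made up for this example):

```python
import os

# The 28-byte empty BGZF block defined by the SAM/BAM spec as the EOF marker.
BGZF_EOF = bytes.fromhex(
    "1f8b0804" "00000000" "00ff0600" "42430200"
    "1b000300" "00000000" "00000000"
)

def has_bgzf_eof(path):
    """Return True if the file ends with the BGZF EOF marker block."""
    size = os.path.getsize(path)
    if size < len(BGZF_EOF):
        return False
    with open(path, "rb") as handle:
        # Jump to the last 28 bytes and compare them with the marker.
        handle.seek(-len(BGZF_EOF), os.SEEK_END)
        return handle.read() == BGZF_EOF
```

If this returns False for a BAM on a network share, copying the file locally and checking again would help separate a truncated file from a flaky mount.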

Has anyone had any luck with this?
Old 03-08-2012, 10:31 AM   #3
ben.weisburd
Junior Member
 
Location: San Francisco

Join Date: Oct 2010
Posts: 9

I've run CREST successfully before, and the only difference I can see is that my bam also had a read group:
@RG ID:1 PL:illumina PU:1 LB:1 SM:Sample_11p

Not sure if that's the issue, but...
picard AddOrReplaceReadGroups is a good tool for adding this.
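
Before adding a read group, it may be worth confirming that one is really missing. A rough sketch, assuming you have the header as text (for example the output of `samtools view -H`) and that it is tab-delimited as in a real SAM header; the helper name `read_groups` is made up for this example:

```python
def read_groups(header_text):
    """Return a list of dicts, one per @RG line in SAM header text."""
    groups = []
    for line in header_text.splitlines():
        fields = line.split("\t")
        if fields[0] != "@RG":
            continue
        # Each remaining field looks like TAG:value, e.g. SM:Sample_11p.
        groups.append(dict(f.split(":", 1) for f in fields[1:] if ":" in f))
    return groups
```

An empty list for the header posted above would match what it shows: @HD, @PG and @SQ lines but no @RG.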
Old 03-08-2012, 11:45 AM   #4
aquinom85
Research Bioinformaticist
 
Location: Boston

Join Date: Dec 2011
Posts: 19

Thanks, I'll try that if I get the error again. Did it take you a very long time to run the ./extractSClip.pl on your bams?
Old 03-08-2012, 12:35 PM   #5
ben.weisburd
Junior Member
 
Location: San Francisco

Join Date: Oct 2010
Posts: 9

No, not long - about 1 hr for a 7Gb bam.
Old 03-08-2012, 02:38 PM   #6
aquinom85
Research Bioinformaticist
 
Location: Boston

Join Date: Dec 2011
Posts: 19

Heh, WGS gives 180-200G bams, so I guess ~24h, but if I run in parallel maybe ~6h on a quad core. Alas, I'm using bams that were aligned with CASAVA, so I don't think they have any soft-clipped reads... are those only available if a genome is aligned with bwa sampe?
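
The back-of-envelope estimate can be written out. This is only a sketch: it assumes runtime scales linearly with BAM size at the quoted rate of about 1 hr per 7 GB, and that splitting the work across processes scales perfectly, neither of which is guaranteed (exome and whole-genome BAMs may also differ in how many soft-clipped reads they contain). The function name and parameters are made up for this example.

```python
def estimated_hours(bam_gb, gb_per_hour=7.0, workers=1):
    """Naive linear runtime estimate for a BAM of the given size in GB."""
    return bam_gb / gb_per_hour / workers

# Single process, then four parallel workers, for the 180-200 GB range.
for size in (180, 200):
    print(size, round(estimated_hours(size), 1),
          round(estimated_hours(size, workers=4), 1))
```

Under those assumptions a 180-200 GB BAM comes out at roughly 26 to 29 hours in one process, or about 6.4 to 7.1 hours with four workers, close to the guess above.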
Old 03-08-2012, 05:21 PM   #7
ben.weisburd
Junior Member
 
Location: San Francisco

Join Date: Oct 2010
Posts: 9

Yeah I was using bwa sampe on whole exome samples. Not sure about CASAVA and soft clipping.
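
Whether a given aligner produced soft clips can be checked from the BAM itself rather than guessed: soft-clipped bases show up as `S` operations in the CIGAR string (column 6 of a SAM record). A minimal sketch, assuming you feed it SAM text lines, for example a sample of `samtools view` output; the function names are made up for this example:

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def soft_clip_lengths(cigar):
    """Return (leading, trailing) soft-clipped lengths for a CIGAR string."""
    if cigar == "*":
        return (0, 0)
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Hard clips (H) only occur at the very ends, outside any soft clip,
    # so dropping them exposes an S operation at either end.
    core = [x for x in ops if x[1] != "H"]
    lead = core[0][0] if core and core[0][1] == "S" else 0
    trail = core[-1][0] if len(core) > 1 and core[-1][1] == "S" else 0
    return (lead, trail)

def count_soft_clipped(sam_lines):
    """Count alignment records whose CIGAR contains a soft clip."""
    total = 0
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # not a full alignment record
        if any(soft_clip_lengths(fields[5])):
            total += 1
    return total
```

If the count is zero over a few million reads, an approach that depends on soft-clipped reads presumably has nothing to work with on that BAM.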
Old 03-13-2012, 12:40 PM   #8
aquinom85
Research Bioinformaticist
 
Location: Boston

Join Date: Dec 2011
Posts: 19

Is there a readme that explains how to read the alignment created by bam2html? I couldn't find one on their website or within the README that comes with the software.
Old 04-10-2012, 10:02 PM   #9
ben.weisburd
Junior Member
 
Location: San Francisco

Join Date: Oct 2010
Posts: 9

Has anyone written scripts to convert CREST output to a format that can be visualized in Circos or another SV viewer?
What tools are people using to visualize the SVs?
Thanks
-Ben
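
One possible starting point for the Circos conversion, offered as a sketch under assumptions rather than a tested converter. It assumes the CREST predSV output is tab-delimited with left chromosome, left position, left strand and left soft-clip count in columns 1-4, the same four values for the right side in columns 5-8, and the SV type in column 9; check that against the README for your version. It also assumes Circos link lines of the form `chrA start end chrB start end` and chromosome IDs like `hs1`, as in the human karyotype files shipped with Circos. The function name and the `chrom_prefix` parameter are made up for this example.

```python
def crest_to_circos(lines, chrom_prefix="hs", keep_types=None):
    """Turn CREST-style SV rows into six-column Circos link lines."""
    links = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # too short to be an SV row under the assumed layout
        left_chr, left_pos = fields[0], fields[1]
        right_chr, right_pos = fields[4], fields[5]
        sv_type = fields[8]
        if not (left_pos.isdigit() and right_pos.isdigit()):
            continue  # header or malformed row
        if keep_types is not None and sv_type not in keep_types:
            continue
        # Drop a leading "chr" so names line up with hs1, hs2, ... IDs.
        a = left_chr[3:] if left_chr.startswith("chr") else left_chr
        b = right_chr[3:] if right_chr.startswith("chr") else right_chr
        links.append(" ".join([
            chrom_prefix + a, left_pos, left_pos,
            chrom_prefix + b, right_pos, right_pos,
        ]))
    return links
```

Each breakpoint is written as a one-base interval; passing `keep_types` lets you write one link file per SV type and color them separately in the Circos configuration.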

Last edited by ben.weisburd; 04-11-2012 at 01:51 AM.
Old 05-18-2012, 03:37 PM   #10
bw.
Member
 
Location: San Francisco, CA

Join Date: Mar 2012
Posts: 21

I'm running CREST on exome seq samples.
On 8 out of the approx. 50 samples, the tool hangs after partially running through the step where it prints:

Output is in /tmp/486391.1.all.q/6IGDfMyBGM/tiw652Qy6Q.fa.clip.fa.psl
21 38520525 - 1
Output is in /tmp/486391.1.all.q/6IGDfMyBGM/oZqu2DckND.fa.clip.fa.psl
GL000211.1 156726 + 1
Output is in /tmp/486391.1.all.q/6IGDfMyBGM/m9Sp2Shk_H.fa.clip.fa.psl
7 2768145 - 3
Output is in /tmp/486391.1.all.q/6IGDfMyBGM/_eFzYi5xLz.fa.cap.contigs.clip.fa.psl

...

After that it doesn't produce any more output and doesn't terminate.
It doesn't ever get to the part where it says "SV filter starting...."

This makes the tool unusable for these samples, and I can't see what differentiates these samples from my other samples where CREST completes normally and outputs the table of structural variants.
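
One way to narrow this down might be to pull the last site printed before the hang out of each stalled run's log and compare across the 8 samples. A rough sketch, assuming lines such as `7 2768145 - 3` are chromosome, position, strand and a read count (that reading of the fields is a guess from the output above, not something documented here); the function name is made up for this example:

```python
import re

# Matches lines like "21 38520525 - 1" or "GL000211.1 156726 + 1".
SITE_LINE = re.compile(r"^(\S+)\s+(\d+)\s+([+-])\s+(\d+)\s*$")

def last_site(log_lines):
    """Return the last (chrom, pos, strand, count) seen in the log, or None."""
    last = None
    for line in log_lines:
        match = SITE_LINE.match(line)
        if match:
            last = (match.group(1), int(match.group(2)),
                    match.group(3), int(match.group(4)))
    return last
```

If the stalled samples all stop near the same kind of region, inspecting the reads around that position would be the next thing to try.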

Tags
cancer, structural variations, wgs
