SEQanswers

Old 06-12-2012, 08:18 AM   #1
ctsa
Junior Member
 
Location: Southern California

Join Date: Jan 2011
Posts: 6
Default Strelka: Somatic small-variant calling workflow for matched tumor-normal samples

Hello All,

Strelka is a new workflow for calling SNVs and small indels from sequencing data of matched tumor-normal samples. It is designed to detect somatic variants at the lower frequencies typically encountered in tumors due to sample impurity or subclonal variation. The workflow is also efficient enough for the whole-genome sequencing case, requiring ~1 core-hour per 2x of combined tumor-normal coverage.
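
As a rough illustration of the runtime figure above, here is a minimal sketch that turns "~1 core-hour per 2x combined coverage" into an estimate. The function name and the 30x/60x example depths are hypothetical, and actual runtime will depend on hardware and data.

```python
def estimated_core_hours(normal_coverage: float, tumor_coverage: float) -> float:
    """Rough estimate based on ~1 core-hour per 2x of combined coverage."""
    combined = normal_coverage + tumor_coverage
    return combined / 2.0


# Hypothetical example: 30x normal plus 60x tumor is 90x combined coverage.
print(estimated_core_hours(30, 60))
```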

More information/source code available here:

https://sites.google.com/site/strelk...cvariantcaller

We appreciate any feedback on how these methods can be improved.

Best Regards,

-Chris Saunders
Old 06-12-2012, 04:08 PM   #2
nilshomer
Nils Homer
 
 
Location: Boston, MA, USA

Join Date: Nov 2008
Posts: 1,285
Default

Looks like the license is fairly restrictive. Any chance of moving this to an open source license?
Old 06-13-2012, 01:35 AM   #3
Jane M
Senior Member
 
Location: Paris

Join Date: Aug 2011
Posts: 237
Default

Hello,

I could be interested in the tool you suggest, since it addresses my problem, but as nilshomer said, there seem to be restrictions with the license, and I am not able to download the sources...

Jane
Old 06-13-2012, 08:35 AM   #4
ctsa
Junior Member
 
Location: Southern California

Join Date: Jan 2011
Posts: 6
Default

Hi Nils and Jane --

Thanks for highlighting this issue. I will take a look today to see what our options are wrt the source license.

-Chris
Old 06-15-2012, 07:32 AM   #5
ctsa
Junior Member
 
Location: Southern California

Join Date: Jan 2011
Posts: 6
Default

I've gotten additional feedback about the source download link on some web browsers. The source download URL is:

ftp://[email protected]

Note that no password is required. In Firefox it looks like a password prompt comes up anyway -- you can leave the password field blank and just hit "OK" to enter the FTP site.
Old 06-15-2012, 07:51 AM   #6
nilshomer
Nils Homer
 
 
Location: Boston, MA, USA

Join Date: Nov 2008
Posts: 1,285
Default

Quote:
Originally Posted by ctsa
I've gotten additional feedback about the source download link on some web browsers. The source download URL is:

ftp://[email protected]

Note that no password is required. In firefox it looks like a password prompt comes up anyway -- you can leave the password field blank and just hit "Ok" to enter the ftp site.

I am able to download the source code fine, but it is the license to which I do not agree.
Old 06-15-2012, 08:01 AM   #7
ctsa
Junior Member
 
Location: Southern California

Join Date: Jan 2011
Posts: 6
Default

Hi Nils --

Sorry if there's a misunderstanding; the additional FTP advice is in response to a separate conversation. As I replied above, I'm working on the license issue.

-Chris
Old 06-15-2012, 08:04 AM   #8
nilshomer
Nils Homer
 
 
Location: Boston, MA, USA

Join Date: Nov 2008
Posts: 1,285
Default

Sorry, thanks for looking into it!
Old 07-18-2012, 09:48 PM   #9
pravee1216
Member
 
Location: India

Join Date: Aug 2010
Posts: 35
Default platform independent?

Good news!!

A couple of questions:

a) Does Strelka support alignment files from Roche 454/FLX sequencing reads, or is it designed mainly for Illumina data?

b) How does it handle calls in homopolymer regions, especially for 454/MiSeq/IoT platform data? Has this been tested?

Thanks in advance

Raj
Old 07-30-2012, 01:06 AM   #10
genomicist
Member
 
Location: Sweden

Join Date: Jan 2011
Posts: 12
Default Why are there two "format" fields in the output?

I wonder why there are two "format" fields in the output (the last two columns of the output file) of this type: DP:FDP:SDP:SUBDP:AU:CU:GU:TU. I have been looking for an explanation, but all in vain.
Old 07-30-2012, 01:13 AM   #11
genomicist
Member
 
Location: Sweden

Join Date: Jan 2011
Posts: 12
Default

I also wonder which optional "extraStrelkaArguments" can be specified in the configuration file. Is there a list?

Specifically, is it possible to filter the calls on variant allele frequency? My "passed" SNV list contains lots of calls that are supported by only 1 or 2 reads with the alternative base, alongside a couple of hundred reads with the reference base. These are presumably sequencing errors.
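
Whether Strelka itself can filter on allele frequency is not answered in this thread, but the calls can be post-filtered. The following is a minimal sketch, not Strelka's own method. It assumes that the last two VCF columns are the NORMAL and TUMOR sample values sharing the one FORMAT string, and that AU/CU/GU/TU each hold "tier1,tier2" counts for that base; both assumptions should be checked against the VCF header. The function names, the example values, and the 5% cutoff are hypothetical.

```python
def snv_tier1_counts(ref: str, alt: str, fmt: str, sample: str) -> tuple[int, int]:
    """Return (ref_count, alt_count) from one sample column of an SNV record.

    Assumes each of AU/CU/GU/TU is "tier1,tier2" and uses the tier1 value.
    """
    fields = dict(zip(fmt.split(":"), sample.split(":")))
    ref_count = int(fields[ref + "U"].split(",")[0])
    alt_count = int(fields[alt + "U"].split(",")[0])
    return ref_count, alt_count


def allele_frequency(ref_count: int, alt_count: int) -> float:
    """Fraction of ref+alt reads that support the alternative base."""
    total = ref_count + alt_count
    return alt_count / total if total else 0.0


# Made-up tumor column for a G>A call: 2 alt reads among ~210.
fmt = "DP:FDP:SDP:SUBDP:AU:CU:GU:TU"
tumor = "210:0:0:0:2,2:0,0:208,209:0,0"

ref_count, alt_count = snv_tier1_counts("G", "A", fmt, tumor)
vaf = allele_frequency(ref_count, alt_count)
keep = vaf >= 0.05  # hypothetical cutoff, tune for your data
print(ref_count, alt_count, round(vaf, 4), keep)
```

A call like the one above, with 2 supporting reads out of 210, falls below the cutoff and would be dropped.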
Old 09-26-2012, 12:40 PM   #12
lethalfang
Member
 
Location: San Francisco, CA

Join Date: Aug 2011
Posts: 90
Default

I'm trying to use Strelka on some sequencing data we got from a SOLiD 5500, with the BAM file aligned by LifeScope 2.5.
The LifeScope-produced BAM file seems to be incompatible with Strelka.
Does anyone know of a way to convert the BAM into something Strelka accepts?

Thanks in advance.

Well, it seems I was just missing an index .bam.bai file, which I created using samtools index aln.bam.
It is running now. Let's see how it goes.
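
For anyone hitting the same problem, here is a minimal sketch of a pre-flight check that looks for a BAM index before launching a run. It only checks that a file exists under the two common naming conventions (aln.bam.bai, as written by samtools index, and aln.bai); it does not validate the index. The function name is hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_bam_index(bam_path) -> Optional[Path]:
    """Return the path of an existing index for bam_path, or None if absent."""
    bam = Path(bam_path)
    candidates = [Path(str(bam) + ".bai"), bam.with_suffix(".bai")]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


# Demonstration with empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    bam = Path(tmp) / "aln.bam"
    bam.touch()
    missing = find_bam_index(bam)      # no index yet
    Path(str(bam) + ".bai").touch()    # stands in for `samtools index aln.bam`
    found = find_bam_index(bam)

print(missing, found.name if found else None)
```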

Last edited by lethalfang; 09-26-2012 at 01:13 PM. Reason: Problem may be solved.
Old 01-09-2013, 08:07 AM   #13
malachig
Senior Member
 
Location: WashU

Join Date: Aug 2010
Posts: 117
Default

It looks like nilshomer's initial question was never addressed. We have the same concern with using this software:

"Looks like the license is fairly restrictive. Any chance of moving this to an open source license?"

If this could be moved to open source, that would make it easier to deploy in pipelines/platforms that are themselves open source projects...
Old 05-07-2013, 07:54 AM   #14
ctsa
Junior Member
 
Location: Southern California

Join Date: Jan 2011
Posts: 6
Default

Looks like I'm not getting emails for this thread. I'll try to briefly cover the existing questions but encourage you to re-post any current issue to the strelka mailing list here:

https://groups.google.com/forum/#!forum/strelka-discuss

- License:

Strelka has recently been moved to the Illumina Open Source Software License (v1). Details are on GitHub here:

http://cloud.github.com/downloads/se...te_1_Final.pdf


- Incompatible BAMs:

All known BAM restrictions are described in the FAQ here:

https://sites.google.com/site/strelk...-the-workflow-


- Format Fields:

All format fields are described in the VCF header, as well as on the website here:

https://sites.google.com/site/strelk...variant-output
Old 12-12-2014, 06:06 AM   #15
Jane M
Senior Member
 
Location: Paris

Join Date: Aug 2011
Posts: 237
Default

Hello,

I have been using Strelka 1.0.14 on WGS data for a few days.
Strelka ran without problems on my samples.
To further filter my list of variants, I need the coverage of the reference and "variant" alleles in both the normal and tumor samples.

As I read on the Strelka discussion list, the number of reads supporting the "indel" (= variant) allele is given by the TIR column (preferably its first field).
My problem is finding the number of reads supporting the reference allele.
Maybe I missed something, but I tried DP, TAR, DP-TOR, TAR+TOR,... without success.
In some cases, DP seems OK. In other cases, TAR+TOR looks OK.
This information is crucial for my analysis, and I am stuck on this detail...

Could you please tell me how to compute precisely the number of reads supporting the reference allele?

Thank you in advance,
Jane

PS: I just posted this question on the Strelka mailing list, but it may be more visible here.
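
One way to approach the question above, as a minimal sketch rather than an authoritative answer: it assumes TAR counts reads strongly supporting the non-indel (usually reference) allele and TIR counts reads strongly supporting the indel, each stored as "tier1,tier2", with TOR holding the remaining ambiguous reads. These meanings should be confirmed against the VCF header. The function name and the example values are made up.

```python
def indel_tier1_counts(fmt: str, sample: str) -> tuple[int, int]:
    """Return (ref_count, indel_count) from one sample column of an indel record.

    Assumes TAR and TIR are "tier1,tier2" pairs and uses the tier1 value.
    """
    fields = dict(zip(fmt.split(":"), sample.split(":")))
    ref_count = int(fields["TAR"].split(",")[0])
    indel_count = int(fields["TIR"].split(",")[0])
    return ref_count, indel_count


# Made-up tumor column: 45 reference-supporting and 9 indel-supporting reads.
fmt = "DP:DP2:TAR:TIR:TOR:DP50:FDP50:SUBDP50"
tumor = "60:60:45,46:9,10:6,6:58.2:0.5:0.0"

ref_count, indel_count = indel_tier1_counts(fmt, tumor)
indel_fraction = indel_count / (ref_count + indel_count)
print(ref_count, indel_count, round(indel_fraction, 3))
```

Under these assumptions, DP and TAR+TOR would both overcount the reference allele, since they include reads that support neither allele clearly.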
Old 12-15-2014, 01:38 AM   #16
Jane M
Senior Member
 
Location: Paris

Join Date: Aug 2011
Posts: 237
Default

For those who are interested, there is an answer here: https://groups.google.com/forum/#!to...ss/g_Muy5wVjbY
Tags
cancer genomics, cancer ngs, indel calling, variant calling
