SEQanswers

Old 02-20-2012, 03:41 PM   #1
lewewoo
Member
 
Location: Moon

Join Date: Apr 2011
Posts: 60
SNP calling and allele-specific expression by RNA-seq

Is there a solid, reliable pipeline for this application?
I would like to use samtools pileup followed by some statistical steps, but I would appreciate pointers to a well-established pipeline. Any help is highly appreciated!

P.S. 1) I am also interested in complete switches in which allele of a SNP is expressed;
2) does anyone have experience to share for the mouse transcriptome?
3) what is the minimum number of reads? 30 million or 40 million?
4) must potential PCR duplicates be removed?
Thanks a lot!
Old 03-26-2012, 01:52 PM   #2
shoegame2001
Member
 
Location: California

Join Date: Dec 2010
Posts: 21

I don't know of any established pipeline for calling SNPs from RNA-seq data. GATK claims that it can do it (even though it is designed for genomic data), but for me it results in a lot of false SNPs. However, I am also dealing with another complicating factor of not having a reference genome and doing a de novo transcriptome assembly to generate the reference to which to map and discover variants--so the results might be better for you. If anyone knows of software that is better for SNP discovery in transcriptome data, I would be happy to learn about it.
Old 03-26-2012, 02:15 PM   #3
lewewoo
Member
 
Location: Moon

Join Date: Apr 2011
Posts: 60
HugeSeq: a good pipeline

Maybe my thread is silly, but nobody has helped me out so far. I can easily run samtools mpileup or GATK, but what I really want is some information on parameter settings and statistics, and so far I have received no comments or suggestions.
A recent paper in NBT combined and compared the samtools and GATK pipelines and produced a combined pipeline called HugeSeq, pretty cool! They also ran some data through it for comparison, which is very helpful. I could not run it because I could not get it to work on Mac OS, but I got some ideas from reading their scripts, and my experience may be helpful for other people here.
http://hugeseq.hugolam.com/
Old 03-27-2012, 05:09 AM   #4
KaiYe
Senior Member
 
Location: amsterdam

Join Date: Jun 2009
Posts: 133

The Passion RNA-seq pipeline is able to call SNPs and compute allele-specific expression.
https://trac.nbic.nl/passion/
Old 05-09-2012, 06:08 AM   #5
giorgifm
Member
 
Location: Columbia University Medical Center

Join Date: Aug 2011
Posts: 35

Forgive my destructive criticism, but HugeSeq seems designed for genomic SNPs, while Passion is a splice site detector.
There seem to be tools out there specifically for allele-specific expression, such as AlleleSeq.

I'm gonna test this in the next few days and will keep you updated.

Last edited by giorgifm; 05-09-2012 at 06:09 AM. Reason: typo
Old 05-30-2012, 09:12 AM   #6
edge
Senior Member
 
Location: China

Join Date: Sep 2009
Posts: 199

Does anybody have software or tools to suggest for SNP calling from Illumina paired-end RNA-seq reads? A human reference transcriptome is available.

I have tested a few different mappers and run the output through mpileup.
It seems that each mapper gives a different SNP-calling result.
Thanks for any advice.
Old 05-31-2012, 06:55 AM   #7
giorgifm
Member
 
Location: Columbia University Medical Center

Join Date: Aug 2011
Posts: 35

Dear edge,

we use VarScan in pileup2snp mode for your issue. It's a good approach, as the output can then be parsed for allele-specific counting. However, the software doesn't directly provide information on the SNP phase, which needs to be inferred beforehand, somehow.
Old 06-08-2012, 09:58 AM   #8
gumbos
Junior Member
 
Location: San Francisco

Join Date: Feb 2011
Posts: 6

How did this go? I am looking for something to do allele-specific expression analysis, specifically to look for nonsense-mediated decay. AlleleSeq looks promising. One problem is that I am working with zebrafish, so the 1000 Genomes SNP database is useless to me.

Quote:
Originally Posted by giorgifm View Post
Forgive my destructive criticism, but HugeSeq seems designed for genomic SNPs, while Passion is a splice site detector.
There seem to be tools out there specifically for allele-specific expression, such as AlleleSeq.

I'm gonna test this in the next few days, I will keep you updated.
Old 06-09-2012, 05:03 AM   #9
edge
Senior Member
 
Location: China

Join Date: Sep 2009
Posts: 199

Hi giorgifm,

Do you have any ideas for quality control of SNP results?
Currently I'm using bowtie, bwa and gsnap to generate SAM output and then samtools mpileup to generate SNP calls.

However, they all produce totally different results.
This worries me, and I am not sure how to determine which SNP result is the best.

Thanks for any advice.
I just want to get high-quality SNP calls from my transcriptome data set.
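
One simple way to look at disagreement between aligners is to count how many call sets support each site and keep only the sites that several aligners agree on. The Python sketch below is only an illustration of that idea, not an established QC procedure. It assumes each call has already been reduced to a `(chrom, pos, ref, alt)` tuple; the function name `consensus_calls`, the `min_support` parameter and the example calls are all hypothetical.

```python
from collections import Counter


def consensus_calls(calls_by_aligner, min_support=2):
    """Return the sites called by at least `min_support` aligners.

    calls_by_aligner maps an aligner name to an iterable of
    (chrom, pos, ref, alt) tuples (an assumed, simplified representation).
    """
    support = Counter()
    for calls in calls_by_aligner.values():
        # set() makes sure one aligner cannot vote twice for the same site
        support.update(set(calls))
    return {site for site, n in support.items() if n >= min_support}


# Made-up example calls, for illustration only
example = {
    "bowtie": [("contig1", 10, "A", "G"), ("contig1", 20, "C", "T")],
    "bwa": [("contig1", 10, "A", "G")],
    "gsnap": [("contig1", 10, "A", "G"), ("contig1", 30, "G", "A")],
}
shared = consensus_calls(example, min_support=2)
```

Sites seen by only one aligner are not necessarily wrong, but they are reasonable candidates for closer inspection (for example near splice junctions or in low-coverage regions).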
Old 06-11-2012, 06:19 AM   #10
nika
Junior Member
 
Location: Spain

Join Date: Jun 2012
Posts: 2

Hi,

I'd also like to hear how it went.
To me it seems like a pipeline that's a bit tricky to customize...
A few tips and tricks would surely help!


Quote:
Originally Posted by gumbos View Post
How did this go? I am looking for something to do allele-specific expression analysis, specifically to look for nonsense-mediated decay. AlleleSeq looks promising. One problem is that I am working with zebrafish, so the 1000 Genomes SNP database is useless to me.
Old 11-25-2012, 10:25 PM   #11
edge
Senior Member
 
Location: China

Join Date: Sep 2009
Posts: 199

Hi giorgifm,

Would you mind sharing more about how you call SNPs with VarScan?
If possible, I hope you can explain it in more detail.
Thanks.
Old 11-25-2012, 10:37 PM   #12
giorgifm
Member
 
Location: Columbia University Medical Center

Join Date: Aug 2011
Posts: 35

Dear edge,

sure. The pileup2snp function of VarScan generates calls that effectively count the alleles at each position (given some parameters such as minimum coverage, minimum minor allele frequency, minimum minor allele count, etc.) http://varscan.sourceforge.net/using...2.3_pileup2snp
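
To make "counting alleles at each position" concrete, here is a minimal Python sketch. It is not VarScan's implementation, only an illustration that assumes the usual samtools pileup read-bases encoding (`.` and `,` for reference matches, `^` plus one character for a read start, `$` for a read end, `+`/`-` followed by a length for indels). The function names `count_alleles` and `passes_filters` and the threshold values are hypothetical and simply mirror the kinds of parameters listed above.

```python
from collections import Counter


def count_alleles(ref_base, read_bases):
    """Count A/C/G/T observations in one pileup read-bases string."""
    counts = Counter()
    i, n = 0, len(read_bases)
    while i < n:
        ch = read_bases[i]
        if ch == "^":
            i += 2  # read-start marker plus its mapping-quality character
        elif ch in "+-":
            j = i + 1
            while j < n and read_bases[j].isdigit():
                j += 1
            # skip the sign, the length digits and the inserted/deleted bases
            i = j + int(read_bases[i + 1:j])
        elif ch in ".,":
            counts[ref_base.upper()] += 1  # match on forward/reverse strand
            i += 1
        elif ch.upper() in "ACGT":
            counts[ch.upper()] += 1  # mismatch on either strand
            i += 1
        else:
            i += 1  # '$', '*', '<', '>', 'N' and similar carry no base call
    return counts


def passes_filters(counts, ref_base, min_coverage=8, min_alt_reads=2,
                   min_alt_freq=0.2):
    """Apply illustrative coverage / minor-allele thresholds (assumed values)."""
    depth = sum(counts.values())
    alt = [c for base, c in counts.items() if base != ref_base.upper()]
    if depth < min_coverage or not alt:
        return False
    top_alt = max(alt)
    return top_alt >= min_alt_reads and top_alt / depth >= min_alt_freq


counts = count_alleles("A", "..,,^Ftt$.+2AGc-1g*")
```

For the example string this gives five reference (A) observations, two T and one C; the per-allele counts are what would then be carried forward into allele-specific counting.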

These counts can and must be parsed in an effective way, inferring the phase of the SNPs (in order to count them together). Phasing can be done in a Mendelian way if you have both parents' sequences, or directly if you have the two haplotypes.
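
As a rough illustration of Mendelian phasing at a single site, the Python sketch below assigns the two alleles of a heterozygous offspring to the maternal and paternal side when the parental genotypes allow only one assignment. It assumes unphased two-letter genotype strings such as "AG"; the function name `phase_by_parents` is hypothetical, and real data would also need handling of genotyping errors and missing calls.

```python
def phase_by_parents(child, mother, father):
    """Return (maternal_allele, paternal_allele), or None if unresolved.

    Each genotype is an unphased two-character string, e.g. "AG".
    """
    a, b = child
    if a == b:
        return None  # homozygous sites are not informative for allelic counts
    # Try both possible parent-of-origin assignments
    candidates = [
        (mat, pat)
        for mat, pat in ((a, b), (b, a))
        if mat in mother and pat in father
    ]
    # Exactly one consistent assignment means the phase is resolved;
    # zero means a Mendelian inconsistency, two means it is ambiguous.
    return candidates[0] if len(candidates) == 1 else None


resolved = phase_by_parents("AG", "AA", "AG")
ambiguous = phase_by_parents("AG", "AG", "AG")
```

When all three individuals are heterozygous for the same alleles, as in the second call, the site cannot be phased this way and returns None.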

Then, as said in the beginning of this post, a simple binomial test can be calculated. There are, however, several sample-specific issues which you must take into account. For example, SNP calls in UTRs may or may not be ignored, since UTRs are less covered than coding regions in some samples.
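
A sketch of such a binomial test in plain Python follows. It assumes the null hypothesis that both alleles are expressed equally (expected reference fraction 0.5) and computes an exact two-sided p-value by summing the probabilities of all outcomes no more likely than the observed one. The function name `allelic_imbalance_pvalue` is hypothetical, and this ignores issues such as reference mapping bias and overdispersion.

```python
from math import exp, lgamma, log


def allelic_imbalance_pvalue(ref_count, alt_count, expected_ref_fraction=0.5):
    """Exact two-sided binomial p-value for allelic imbalance at one SNP.

    expected_ref_fraction must lie strictly between 0 and 1.
    """
    n = ref_count + alt_count
    p = expected_ref_fraction

    def pmf(i):
        # Work in log space so that high-coverage sites do not overflow
        log_prob = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                    + i * log(p) + (n - i) * log(1 - p))
        return exp(log_prob)

    probs = [pmf(i) for i in range(n + 1)]
    observed = probs[ref_count]
    # Small relative tolerance so symmetric outcomes count as equally likely
    tail = sum(x for x in probs if x <= observed * (1 + 1e-7))
    return min(1.0, tail)


p_value = allelic_imbalance_pvalue(ref_count=2, alt_count=8)
```

With many SNPs tested, the resulting p-values would normally also need a multiple-testing correction.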
Old 11-25-2012, 10:50 PM   #13
edge
Senior Member
 
Location: China

Join Date: Sep 2009
Posts: 199

Hi giorgifm,

Thanks for your advice.

Below are the details of my data set and the problem I am facing right now.
I have 2 sets of Illumina paired-end RNA-seq data:
1. normal tissue culture of Plant A;
2. infected tissue culture of Plant A (caused by a severe disease);

Do I need to remove duplicates or do any realignment around indels on my transcriptome data set before SNP calling?

How do you prepare the input file for VarScan?
Below is what I have in mind (but I am not sure whether it is a proper or correct method):
1. map the paired-end data back to the reference transcriptome with an aligner (such as bowtie, bwa, etc.);
2. use samtools mpileup to call SNPs;
3. run VarScan somatic;

My main purpose in doing transcriptome SNP calling is to identify how the severe disease affects the plant's tissue.
I am not sure whether SNP calling on a transcriptome data set will give me any hints.

As far as I know, genome SNP calling involves removing PCR duplicates and local realignment around indels before SNP calling.
VarScan works fine for both genome and transcriptome data.

I am just not sure which approach I can try in order to achieve my aim (linking the SNP calling results to the effects of the disease in Plant A).
Thanks for any advice.

Last edited by edge; 11-25-2012 at 10:53 PM.
All times are GMT -8. The time now is 10:32 PM.

