
  • SNP calling and allele-specific expression by RNA-seq

    Is there a solid, reliable pipeline for this application?
    I would like to use SAMtools pileup followed by some statistical steps, but I would prefer to rely on a well-established pipeline. Any help is highly appreciated!

    P.S. 1) I am also interested in a total switch of allele expression at a SNP;
    2) does anyone have experience to share with the mouse transcriptome?
    3) what is the minimum number of reads? 30 million or 40 million?
    4) must potential PCR duplicates be removed?
    Thanks a lot!

  • #2
    I don't know of any established pipeline for calling SNPs from RNA-seq data. GATK claims that it can do it (even though it is designed for genomic data), but for me it produces a lot of false SNPs. However, I am also dealing with another complicating factor: I have no reference genome, so I build a de novo transcriptome assembly and use it as the reference for mapping and variant discovery. The results might therefore be better for you. If anyone knows of software that is better for SNP discovery in transcriptome data, I would be happy to learn about it.

    • #3
      HugeSeq: a good pipeline

      Maybe my thread is silly, but nobody has helped me out so far. I can easily run SAMtools mpileup or GATK, but what I really want is some information on parameter settings and statistics, and so far I have received no comments or suggestions.
      A recent paper in NBT combines and compares the SAMtools and GATK pipelines, and the authors built a combined pipeline called HugeSeq, pretty cool! They also ran some data for comparison, which is very helpful. I could not run it because it did not work on Mac OS in my hands, but I got some ideas from reading their scripts, and my experience may be helpful for other people here.

      • #4
        The Passion RNA-seq pipeline is able to call SNPs and compute allele-specific expression.

        • #5
          Forgive my destructive criticism, but HugeSeq seems designed for genomic SNPs, while Passion is a splice site detector.
          There seem to be tools out there specifically for allele-specific expression, like AlleleSeq.

          I'm gonna test this in the next few days, I will keep you updated.
          Last edited by giorgifm; 05-09-2012, 06:09 AM. Reason: typo

          • #6
            Does anybody have software or tools to suggest for SNP calling from Illumina paired-end RNA-seq reads? A human reference transcriptome is available.

            I have tested a few different mappers and run the results through mpileup.
            It seems that each mapper gives a different SNP-calling result.
            Thanks for any advice to resolve my doubts.

            • #7
              Dear edge,

              we use VarScan in pileup2snp mode for this. It's a good approach, as the output can then be parsed for allele-specific counting. However, the software doesn't directly provide information on the SNP phase, which needs to be inferred beforehand by some other means.

              • #8
                How did this go? I am looking into doing allele-specific expression analysis, specifically looking for nonsense-mediated decay. AlleleSeq looks promising. One problem is that I am working with zebrafish, so the 1000 Genomes SNP database is useless to me.

                Originally posted by giorgifm
                Forgive my destructive criticism, but HugeSeq seems designed for genomic SNPs, while Passion is a splice site detector.
                There seem to be tools out there specifically for allele-specific expression, like AlleleSeq.

                I'm gonna test this in the next few days, I will keep you updated.

                • #9
                  Hi giorgifm,

                  Do you have any ideas for quality control of SNP results?
                  Currently I'm using bowtie, bwa and gsnap to generate SAM output and then samtools mpileup to generate the SNP results.

                  However, they all give totally different results.
                  This worries me, and I am not sure how to determine which SNP result is the best.

                  Thanks for any advice.
                  I just want to get high-quality SNP calls for my transcriptome data set.
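
One way to get a first handle on how much the mappers disagree is to compare their call sets directly. The following is only a minimal sketch, not an established QC method: it assumes each mapper's SNP calls have already been parsed into a set of (contig, position, alternate allele) tuples, and the function names and toy data are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two sets; defined as 1.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def compare_call_sets(calls):
    """Compare SNP call sets from several mappers.

    calls: dict mapping a mapper name to a set of
           (contig, position, alt_allele) tuples.
    Returns the calls shared by every mapper and the pairwise Jaccard indices.
    """
    consensus = set.intersection(*calls.values())
    pairwise = {
        (m1, m2): jaccard(calls[m1], calls[m2])
        for m1, m2 in combinations(sorted(calls), 2)
    }
    return consensus, pairwise


# Toy, made-up call sets purely for illustration (not real results).
example_calls = {
    "bowtie": {("tx1", 100, "A"), ("tx1", 250, "G"), ("tx2", 40, "T")},
    "bwa": {("tx1", 100, "A"), ("tx2", 40, "T"), ("tx3", 7, "C")},
    "gsnap": {("tx1", 100, "A"), ("tx1", 250, "G"), ("tx2", 40, "T"), ("tx3", 7, "C")},
}
consensus, pairwise = compare_call_sets(example_calls)
```

Calls found by every mapper are a conservative starting set; calls unique to a single mapper are the ones worth inspecting by eye first.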

                  • #10
                    Hi,

                    I'd also like to hear how it went.
                    To me it seems like a pipeline that's a bit tricky to customize...
                    A few tips and tricks would surely help!


                    Originally posted by gumbos
                    How did this go? I am looking into doing allele-specific expression analysis, specifically looking for nonsense-mediated decay. AlleleSeq looks promising. One problem is that I am working with zebrafish, so the 1000 Genomes SNP database is useless to me.

                    • #11
                      Hi giorgifm,

                      Would you mind sharing more about how you call SNPs using VarScan?
                      If you can, I hope you can explain it in more detail.
                      Thanks.

                      • #12
                        Dear edge,

                        sure. The pileup2snp function of VarScan generates calls that effectively count the alleles at each position (given some parameters such as minimum coverage, minimum minor allele frequency, minimum minor allele count, etc.): http://varscan.sourceforge.net/using...2.3_pileup2snp
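
To make the role of those parameters concrete, here is a minimal sketch of that kind of count-based filtering. It is not VarScan's implementation: the `SiteCounts` record, the function name, and the default thresholds are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class SiteCounts:
    """Read counts for the two alleles observed at one position."""
    contig: str
    position: int
    ref_reads: int
    var_reads: int


def passes_filters(site, min_coverage=8, min_minor_count=2, min_minor_freq=0.1):
    """Apply coverage, minor allele count, and minor allele frequency thresholds.

    The default values are placeholders, not recommended settings.
    """
    coverage = site.ref_reads + site.var_reads
    if coverage < min_coverage:
        return False
    minor = min(site.ref_reads, site.var_reads)
    return minor >= min_minor_count and minor / coverage >= min_minor_freq


# Made-up sites for illustration.
sites = [
    SiteCounts("tx1", 100, 12, 9),   # well covered, both alleles supported
    SiteCounts("tx1", 200, 3, 2),    # coverage too low
    SiteCounts("tx2", 50, 40, 1),    # minor allele seen only once
    SiteCounts("tx2", 90, 30, 3),    # minor allele frequency below 10%
]
kept = [s for s in sites if passes_filters(s)]
```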

                        These counts can and must be parsed in an effective way, inferring the phase of the SNPs (in order to count them together). Phasing can be done in a Mendelian way if you have both parents' sequences, or directly if you have the two haplotypes.
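
As a rough illustration of the Mendelian idea, the sketch below assigns each allele of a heterozygous site to the mother or the father when the parental genotypes make that unambiguous, then sums read counts per parental haplotype. The function names, the genotype representation (pairs of allele letters), and the toy data are assumptions made for this example only.

```python
def phase_by_parents(child_alleles, mother_gt, father_gt):
    """Return (maternal_allele, paternal_allele) for a heterozygous site.

    Returns None when the site is homozygous, when both assignments are
    possible (e.g. both parents heterozygous), or when neither is possible.
    """
    a, b = child_alleles
    if a == b:
        return None
    for maternal, paternal in ((a, b), (b, a)):
        this_way = maternal in mother_gt and paternal in father_gt
        other_way = paternal in mother_gt and maternal in father_gt
        if this_way and not other_way:
            return maternal, paternal
    return None


def haplotype_counts(sites):
    """Sum allele read counts per parental haplotype over informative sites.

    sites: iterable of (child_alleles, mother_gt, father_gt, counts), where
           counts maps each allele to its read count in the RNA-seq sample.
    """
    maternal_total = paternal_total = 0
    for child, mother, father, counts in sites:
        phased = phase_by_parents(child, mother, father)
        if phased is None:
            continue  # uninformative site, skipped
        maternal_total += counts[phased[0]]
        paternal_total += counts[phased[1]]
    return maternal_total, paternal_total


# Made-up sites within one gene, for illustration only.
gene_sites = [
    (("A", "G"), ("A", "A"), ("G", "G"), {"A": 30, "G": 10}),
    (("C", "T"), ("C", "T"), ("C", "T"), {"C": 5, "T": 5}),    # uninformative
    (("T", "C"), ("C", "C"), ("C", "T"), {"T": 4, "C": 16}),
]
maternal, paternal = haplotype_counts(gene_sites)
```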

                        Then, as said in the beginning of this post, a simple binomial test can be calculated. There are, however, several sample-specific issues that you must take into account. For example, SNP calls in UTRs may or may not need to be ignored, since UTRs are less well covered than coding regions in some samples.
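
For reference, a minimal sketch of such a binomial test using only the Python standard library is shown below. It assumes a null hypothesis of equal expression of the two alleles (p = 0.5) and uses the common two-sided definition that sums the probabilities of all outcomes no more likely than the observed one; the function name and the example counts are hypothetical.

```python
from math import exp, lgamma, log


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial test p-value.

    k: reads supporting one allele, n: total reads over both alleles,
    p: expected fraction under the null (0.5 means no allelic imbalance).
    """
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    if not 0 <= k <= n:
        raise ValueError("k must be between 0 and n")

    def log_pmf(i):
        # Log of the binomial probability mass, computed in log space so that
        # large read counts do not overflow or underflow.
        return (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                + i * log(p) + (n - i) * log(1 - p))

    observed = log_pmf(k)
    tolerance = 1e-9  # treat numerically tied outcomes as equally likely
    total = sum(exp(log_pmf(i)) for i in range(n + 1)
                if log_pmf(i) <= observed + tolerance)
    return min(1.0, total)


# Made-up counts: 46 reads from one haplotype, 14 from the other.
p_value = binomial_two_sided_p(46, 60)
```

When many SNPs or genes are tested this way, the resulting p-values would normally need a multiple-testing correction before being interpreted.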

                        • #13
                          Hi giorgifm,

                          Thanks for your advice.

                          Below are the details of my data set and the problem I am facing right now:
                          I have 2 sets of Illumina paired-end RNA-seq data, as follows:
                          1. normal tissue culture of Plant A;
                          2. infected tissue culture of Plant A (caused by a severe disease);

                          Do I need to remove duplicates or do any realignment around indels on my transcriptome data set before SNP calling?

                          How do you prepare the input file for VarScan?
                          Below is what I have in mind (but I am not sure whether it is a proper or correct method):
                          1. map the paired-end data back to the reference transcriptome with an aligner (such as bowtie, bwa, etc.);
                          2. use samtools mpileup to call SNPs;
                          3. run VarScan somatic;
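
The three steps above could be laid out roughly as follows. This is only a sketch under assumptions: it picks bwa as the aligner, all file names are hypothetical, the reference is assumed to be indexed already with `bwa index`, and the script only prints the command lines rather than running them. Whether duplicate removal or indel realignment should be added is exactly the open question in this post, so neither is included.

```python
def mapping_and_pileup_commands(reference, sample, reads_1, reads_2):
    """Return shell command strings for one sample; nothing is executed."""
    bam = f"{sample}.sorted.bam"
    pileup = f"{sample}.pileup"
    commands = [
        # Step 1: map paired-end reads to the reference transcriptome and sort.
        f"bwa mem {reference} {reads_1} {reads_2} | samtools sort -o {bam} -",
        # Step 2: produce a pileup file for the variant caller.
        f"samtools mpileup -f {reference} {bam} > {pileup}",
    ]
    return commands, pileup


def varscan_somatic_command(normal_pileup, infected_pileup, out_prefix,
                            jar="VarScan.jar"):
    """Step 3: compare the two samples, with the normal culture as baseline."""
    return f"java -jar {jar} somatic {normal_pileup} {infected_pileup} {out_prefix}"


reference = "plantA_transcriptome.fa"  # hypothetical file name
normal_cmds, normal_pileup = mapping_and_pileup_commands(
    reference, "normal", "normal_1.fq", "normal_2.fq")
infected_cmds, infected_pileup = mapping_and_pileup_commands(
    reference, "infected", "infected_1.fq", "infected_2.fq")
all_commands = normal_cmds + infected_cmds + [
    varscan_somatic_command(normal_pileup, infected_pileup,
                            "plantA_infected_vs_normal")
]
for command in all_commands:
    print(command)
```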

                          My main purpose in doing transcriptome SNP calling is to identify how the severe disease affects the plant's tissue.
                          I am not sure whether SNP calling on a transcriptome data set will give me any hints or not.

                          As far as I know, genomic SNP calling involves removing PCR duplicates and local realignment around indels before SNP calling.
                          VarScan works fine for both genome and transcriptome data.

                          I am just not sure which approach I can try in order to achieve my aim (linking the SNP calling results to the effects of the disease in Plant A).
                          Thanks for any advice.
                          Last edited by edge; 11-25-2012, 11:53 PM.
