  • SNP calling from RNAseq data

    Hello NGS fellows,

    I am a newbie here and would highly appreciate your advice about one particular experimental design.

    We have data from an RNAseq experiment that was originally designed to assess differential expression. The details of the experiment are as follows:

    2 modalities of the phenotype

    Each phenotype is represented by 4 samples. 1 sample = 60 individuals pooled together at the stage of RNA isolation.

    Molecule – polyadenylated mRNA

    Sequencing chemistry – Illumina paired-end, read length 2 × 100 bp

    My question is whether it is valid to use this RNAseq data to call SNPs. From a literature search, I found that most people calling SNPs from RNAseq use 40-1000 samples (= individuals), but they designed the RNAseq experiment for GWAS from the start. I see that this analysis cannot be applied to my data (at least because in my case individual flies were pooled without barcoding, 60 flies per sample). However, can I still call SNPs and upload the list to a database as a list of potential targets for GWAS, with, for example, an estimate of their functional impact on protein structure? Will they be “true” SNPs, or does our experimental design make even this step invalid?

    I found this paper https://www.ncbi.nlm.nih.gov/pubmed/27458203 where the authors used 2 phenotypes, each represented by 2 samples, which is almost like our experiment, but I still have doubts.

  • #2
    I think it would be easy to find exonic SNPs that are shared by all or most of the individuals and in expressed genes. It would be difficult to say much about them - for example, even if 100% of reads indicate a SNP that does not mean it's in 100% of the individuals, and if 0% of reads indicate a SNP, that does not mean it's absent in the population. But in general, you should be able to discover approximately which SNPs are present in the population and to what extent. Using simulated data may help determine how accurate this is.
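
    A minimal sketch of the kind of simulation mentioned above, assuming a single biallelic site, diploid individuals, and per-individual expression drawn from a log-normal distribution. The function names, the log-normal model, and all default parameters are illustrative assumptions, not something taken from this thread or from any particular tool. It compares the alternate-allele fraction seen in the reads of one pool with the true allele fraction among the pooled individuals.

```python
import random


def simulate_pool(n_individuals=60, allele_freq=0.1, depth=200,
                  expr_sigma=1.0, rng=None):
    """Simulate one pooled sample at a single biallelic site.

    Returns (true_alt_fraction, observed_alt_read_fraction).
    """
    rng = rng or random.Random()
    # Diploid genotypes: number of alt alleles (0, 1 or 2) per individual.
    genotypes = [sum(rng.random() < allele_freq for _ in range(2))
                 for _ in range(n_individuals)]
    # Expression of the gene in each individual (assumed log-normal;
    # expr_sigma=0 means every individual contributes equally).
    expression = [rng.lognormvariate(0.0, expr_sigma)
                  for _ in range(n_individuals)]
    # Fraction of alt transcripts in the pooled RNA, weighted by expression.
    alt_rna = sum(e * g / 2 for e, g in zip(expression, genotypes))
    alt_rna_fraction = alt_rna / sum(expression)
    # Draw reads from the pool at the given depth.
    alt_reads = sum(rng.random() < alt_rna_fraction for _ in range(depth))
    true_fraction = sum(genotypes) / (2 * n_individuals)
    return true_fraction, alt_reads / depth


def mean_abs_error(expr_sigma, replicates=300, seed=1, **kwargs):
    """Average |observed - true| alt fraction over many simulated pools."""
    rng = random.Random(seed)
    errors = []
    for _ in range(replicates):
        true_f, obs_f = simulate_pool(expr_sigma=expr_sigma, rng=rng, **kwargs)
        errors.append(abs(obs_f - true_f))
    return sum(errors) / len(errors)


if __name__ == "__main__":
    for sigma in (0.0, 0.5, 1.5):
        print(f"expression sigma={sigma}: mean abs error = "
              f"{mean_abs_error(sigma):.3f}")
```

    With expr_sigma=0 the only error is read sampling; as the spread in expression grows, a few highly expressing individuals dominate the pool and the read fraction drifts away from the true allele fraction, which is the effect described above. Allele-specific expression, sequencing errors, and mapping bias are not modelled here.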

    • #3
      Brian Bushnell, we have already found those exonic SNPs and annotated them, but, as you say, the question is the interpretation.


      Originally posted by Brian Bushnell:
      "But in general, you should be able to discover approximately which SNPs are present in the population and to what extent. Using simulated data may help determine how accurate this is."

      But even if simulation confirms the accuracy, could the data be published in a database, or would they be rejected as unreliable?

      • #4
        Pooling is always problematic, whether or not simulation is done, and whether it is done before or after.

        I always try to dissuade experimentalists from pooling. There are so many biases in the data anyway. Pools are rarely if ever clean - i.e. derived from one phenotype - so a range of biological biases are there as well. Also, expression is highly divergent between individuals.

        I am no GWAS expert, but I would advise against advertising this as GWAS. Perhaps follow-up tests, such as Sanger sequencing of PCR amplicons from individual (non-pooled) samples for the most important identified regions, might clarify whether this is a true phenomenon or an artifact of the experimental design?

        • #5
          colindaven, thank you! It seems my doubts were justified.
