  • RNA-seq for Variant calling

    Hi

    I've read a few papers on this topic, such as:

    Next-generation RNA sequencing (RNA-seq) maps and analyzes transcriptomes and generates data on sequence variation in expressed genes. There are few reported studies on analysis strategies to maximize the yield of quality RNA-seq SNP data. We evaluated the performance of different SNP-calling methods following alignment to both genome and transcriptome by applying them to RNA-seq data from a HapMap lymphoblastoid cell line sample and comparing results with sequence variation data from 1000 Genomes. We determined that the best method to achieve high specificity and sensitivity, and greatest number of SNP calls, is to remove duplicate sequence reads after alignment to the genome and to call SNPs using SAMtools. The accuracy of SNP calls is dependent on sequence coverage available. In terms of specificity, 89% of RNA-seq SNPs calls were true variants where coverage is >10X. In terms of sensitivity, at >10X coverage 92% of all expected SNPs in expressed exons could be detected. Overall, the results indicate that RNA-seq SNP data are a very useful by-product of sequence-based transcriptome analysis. If RNA-seq is applied to disease tissue samples and assuming that genes carrying mutations relevant to disease biology are being expressed, a very high proportion of these mutations can be detected.


    I'm surprised by the results. For example, when comparing coding sites against WES, RNA-seq finds only about 33% of the SNPs. Compared with WGS, the percentage rises to 45%, which still seems quite low to me.

    The specificity of these methods is not bad, but the sensitivity is not so good (it depends on coverage).
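For anyone who wants to reproduce this kind of comparison on their own call sets, here is a minimal sketch. It assumes each variant has already been reduced to a (chrom, pos, ref, alt) tuple, for instance parsed from two VCF files; the function name `compare_calls` and the toy variants are made up for illustration. Note that the "specificity" quoted in the abstract (the fraction of RNA-seq calls that are true variants) corresponds to what is computed as precision here.

```python
def compare_calls(rna_calls, truth_calls):
    """Return (sensitivity, precision) of RNA-seq calls against a truth set.

    Both arguments are iterables of hashable variant keys, e.g.
    (chrom, pos, ref, alt) tuples. This is an illustrative helper,
    not part of any variant-calling package.
    """
    rna, truth = set(rna_calls), set(truth_calls)
    shared = rna & truth
    # Sensitivity: fraction of expected variants that RNA-seq recovered.
    sensitivity = len(shared) / len(truth) if truth else float("nan")
    # Precision: fraction of RNA-seq calls that are in the truth set.
    precision = len(shared) / len(rna) if rna else float("nan")
    return sensitivity, precision


# Toy data (invented): 4 expected variants, 3 RNA-seq calls, 2 in common.
truth = [("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"),
         ("chr2", 75, "G", "A"), ("chr3", 10, "T", "C")]
rna = [("chr1", 100, "A", "G"), ("chr2", 75, "G", "A"),
       ("chr5", 999, "A", "T")]

sens, prec = compare_calls(rna, truth)
print(f"sensitivity={sens:.2f} precision={prec:.2f}")
```

In practice the truth set would be restricted to expressed exons above a coverage threshold before computing sensitivity, as the abstract describes.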

    I'm new to this, but these look like poor results to me: I would expect the percentage of SNPs found to be higher, at least in expressed regions. I'm also surprised that most papers settle for coverage of 10x or even 3x to call a variant, when with RNA-seq coverage shouldn't be a big deal.

    I understand that some aligners can work with splice junctions, so this should not be a problem for variant calling with RNA-seq data.

    I don't know if anyone can give me some clues or some more info about this. I'm just wondering about this.

    Is there any paper where the same samples are compared using WGS, WES and RNA-seq?
    Thanks
    Last edited by runnerBio88; 12-29-2015, 07:37 AM.

  • #2
    Have you looked into things like allele-specific expression, which could affect the number of common SNPs found between DNA-seq and RNA-seq? I'm not sure whether that alone could account for the entire divergence, but it's worth investigating.



    • #3
      RNA-Seq is less sensitive for variant detection because transcript levels vary widely, whereas WGS and (for the most part) WES produce relatively even coverage. Some fraction of genes will not be expressed in your sample (and are therefore undetectable), while others are expressed at such low levels that the read depth is insufficient. For example, obtaining 10X coverage of a 1 kbp transcript present at one copy per million would require 200M 50 bp reads. Most RNA-Seq datasets are an order of magnitude smaller.
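The arithmetic behind the 200M figure can be sketched as follows. This is a back-of-the-envelope calculation, not a library function: the name `reads_required` and its parameters are hypothetical, and it assumes reads are sampled in proportion to transcript copy number (one read in every `one_in_n` lands on the transcript of interest), ignoring length bias, mappability, and duplicates.

```python
import math


def reads_required(target_coverage, transcript_length_bp, one_in_n, read_length_bp):
    """Estimate total reads needed for one transcript to reach a target coverage.

    Assumes (simplification) that 1 out of every `one_in_n` sequenced reads
    originates from the transcript of interest.
    """
    # Bases that must land on the transcript to reach the target depth.
    bases_needed = target_coverage * transcript_length_bp
    # Reads that must land on the transcript itself.
    reads_on_transcript = math.ceil(bases_needed / read_length_bp)
    # Scale up by the rarity of the transcript in the library.
    return reads_on_transcript * one_in_n


# The example from the post: 10X over 1 kbp, one-in-a-million, 50 bp reads.
total = reads_required(10, 1000, 1_000_000, 50)
print(f"{total:,} reads")  # prints "200,000,000 reads"
```

Only 200 reads need to hit the transcript itself; the factor of one million comes entirely from its low abundance, which is why deeper sequencing helps so little for rare transcripts.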

      Note that there are methods for normalizing RNA-Seq libraries, but the vast majority of experiments are designed to detect differences in transcript levels (which would be obviated by normalization).
      Last edited by HESmith; 12-30-2015, 08:09 AM.



      • #4
        Generally speaking, WES (whole-exome sequencing) is preferred for identifying mutations, and it is even better if you have matched patient samples so that whole-exome sequencing and RNA-seq can be integrated to increase mutation-detection performance. RNA-Seq alone is not a typical approach for detecting mutations, mainly because of the intrinsic complexity of the transcriptome (e.g., splicing, high/low gene expression levels).
