  • StringTie - isoforms not overlapping

    Hi,
    I am using StringTie for transcriptome reconstruction and identification of new isoforms.
    While exploring the "[file]-transcripts.gtf" output file from StringTie, I found something that intrigued me. The example below shows three isoforms assigned to the same gene ("STRG.14686"); the last one is present in the reference annotation. However, their coordinates do not overlap: the first two isoforms end at 394180 bp and 389422 bp, respectively, while the third starts at 396257 bp.


    scaffold_96 StringTie transcript 383404 394180 1000 + . gene_id "STRG.14686"; transcript_id "STRG.14686.1"; cov "72.829010"; FPKM "6.506798"; TPM "8.058224";
    scaffold_96 StringTie transcript 383404 389422 1000 + . gene_id "STRG.14686"; transcript_id "STRG.14686.2"; cov "61.675678"; FPKM "5.510321"; TPM "6.824155";
    scaffold_96 StringTie transcript 396257 398001 1000 + . gene_id "STRG.14686"; transcript_id "STRG.14686.3"; reference_id "scaffold_96.g39603.t1"; ref_gene_id "scaffold_96.g39603"; cov "2963.938721"; FPKM "264.808624"; TPM "327.947357";
    Why is StringTie "clustering" these isoforms in the same gene?
    Last edited by pbarros; 04-26-2017, 03:22 AM.

  • #2
    first of all let me say that I agree with your interpretation. the third transcript does not overlap the first two, yet stringtie has given all three the same 'gene_id' value.
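
a quick way to confirm this across a whole file is to flag transcript pairs that share a 'gene_id' but whose coordinates do not overlap. the sketch below is only an illustration: the function names are made up, the three records are the ones from the post (trimmed to the attributes needed), and it assumes standard tab-separated GTF with 1-based inclusive coordinates.

```python
import re
from itertools import combinations

# The three transcript records from the post, written as tab-separated GTF
# (attributes trimmed to the two this check needs).
GTF_LINES = [
    'scaffold_96\tStringTie\ttranscript\t383404\t394180\t1000\t+\t.\t'
    'gene_id "STRG.14686"; transcript_id "STRG.14686.1";',
    'scaffold_96\tStringTie\ttranscript\t383404\t389422\t1000\t+\t.\t'
    'gene_id "STRG.14686"; transcript_id "STRG.14686.2";',
    'scaffold_96\tStringTie\ttranscript\t396257\t398001\t1000\t+\t.\t'
    'gene_id "STRG.14686"; transcript_id "STRG.14686.3";',
]


def parse_transcript(line):
    """Return (gene_id, transcript_id, seqname, start, end) for one GTF line."""
    fields = line.rstrip("\n").split("\t")
    attrs = dict(re.findall(r'(\S+) "([^"]*)"', fields[8]))
    return (attrs["gene_id"], attrs["transcript_id"],
            fields[0], int(fields[3]), int(fields[4]))


def overlaps(a_start, a_end, b_start, b_end):
    """True if two 1-based inclusive intervals share at least one base."""
    return a_start <= b_end and b_start <= a_end


def non_overlapping_pairs(lines):
    """List (transcript_a, transcript_b, gap_bp) for same-gene transcripts
    on the same sequence whose spans do not overlap."""
    records = [parse_transcript(line) for line in lines]
    pairs = []
    for a, b in combinations(records, 2):
        same_gene = a[0] == b[0] and a[2] == b[2]
        if same_gene and not overlaps(a[3], a[4], b[3], b[4]):
            # Number of bases strictly between the two spans.
            gap = max(a[3], b[3]) - min(a[4], b[4]) - 1
            pairs.append((a[1], b[1], gap))
    return pairs


for tx_a, tx_b, gap in non_overlapping_pairs(GTF_LINES):
    print(f"{tx_a} and {tx_b} share a gene_id but are {gap} bp apart")
```

this only compares transcript spans, not exons, so it says nothing about why the transcripts were grouped. it just makes the gap explicit.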

    the only thing that comes to mind is that the assembly from stringtie, or cufflinks for that matter, is an attempted explanation, and often a simplification, of the alignment data. you may learn more about this by looking at the alignments in an alignment browser such as UCSC or IGV.

    while i doubt this, stringtie could be very, very smart and have assembled the third isoform from reads that multimapped between it and the first two transcripts. that would imply they are all the same gene, but one that is repeated in more than one position.
    /* Shawn Driscoll, Gene Expression Laboratory, Pfaff
    Salk Institute for Biological Studies, La Jolla, CA, USA */


    • #3
      thank you for the input sdriscoll ... maybe I was overthinking this

      cheers,
      pedro
