  • MATS missed a few obviously differentially spliced exons.

    Hi,

    This is a difficult question and specific to MATS, but if there is anyone familiar with MATS, I would be grateful for the help.
    I've already emailed the authors, but I haven't yet received an answer.

    I've been running MATS to identify differentially spliced exons.
    Overall, I'm satisfied with the results. I've confirmed with IGV some of the differentially spliced exons.
    However, MATS missed a few differentially spliced exons in the Macf1 gene that are quite evident to me in IGV.

    The annotation file used was the most recent (version 77) Ensembl GTF file for M. musculus.
    I've attached an IGV screenshot and a Sashimi plot generated with IGV, both showing the differentially spliced exons that MATS did not identify.

    I also grepped for Macf1 in the ASEvents folder, which is supposed to contain all possible alternative splicing events, and none of the exons returned correspond to the differentially spliced exons identified with IGV.
    So it appears that MATS did not even test these exons.
    Could this be a problem with how MATS parses the Ensembl GTF file?
    The differentially spliced exons do appear in the GTF file, as shown both in the IGV screenshots and in the attached text file with the grep results for Macf1 on the Ensembl GTF file.

    Code:
    [username@lg-1r17-n04 ASEvents]$ basename `pwd`
    ASEvents
    [username@lg-1r17-n04 ASEvents]$ grep Macf1 *
    fromGTF.A3SS.txt:1289	"ENSMUSG00000028649"	"Macf1"	4	-	123368820	123368835	123368820	123368832	123369804	123369968
    fromGTF.A5SS.txt:799	"ENSMUSG00000028649"	"Macf1"	4	-	123438466	123438666	123438472	123438666	123434699	123435195
    fromGTF.AFE.txt:5765	"ENSMUSG00000028649"	"Macf1"	4	-	123683969	123684360	123564462	123564694	123554303	123554365
    fromGTF.AFE.txt:5766	"ENSMUSG00000028649"	"Macf1"	4	-	123664465	123664752	123564462	123564694	123554303	123554365
    fromGTF.AFE.txt:5767	"ENSMUSG00000028649"	"Macf1"	4	-	123683969	123684360	123664465	123664752	123554303	123554365
    fromGTF.SE.txt:5013	"ENSMUSG00000028649"	"Macf1"	4	-	123364056	123364074	123360909	123361027	123365272	123365347
    fromGTF.SE.txt:5014	"ENSMUSG00000028649"	"Macf1"	4	-	123397093	123397420	123395912	123395997	123397747	123397907
    fromGTF.SE.txt:5015	"ENSMUSG00000028649"	"Macf1"	4	-	123354478	123354580	123351821	123351951	123355055	123355267
    fromGTF.SE.txt:5016	"ENSMUSG00000028649"	"Macf1"	4	-	123441602	123441665	123440549	123440791	123444834	123444952
    fromGTF.SE.txt:5017	"ENSMUSG00000028649"	"Macf1"	4	-	123368820	123368832	123367874	123368037	123369804	123369968
    fromGTF.SE.txt:5018	"ENSMUSG00000028649"	"Macf1"	4	-	123368820	123368835	123367874	123368037	123369804	123369968
    fromGTF.SE.txt:5019	"ENSMUSG00000028649"	"Macf1"	4	-	123386548	123386557	123385454	123385686	123387227	123387431
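    Looking an exon up by coordinates is less error-prone than reading the grep output by eye. The sketch below is only an illustration and is not part of MATS: the function names are made up, and it assumes that the six numbers after the strand are three (start, end) pairs whose starts are 0-based (GTF start minus 1). Two of the rows above are pasted in to keep it self-contained; in practice the rows would be read from the fromGTF.*.txt files.

```python
def parse_event_row(line):
    """Parse one fromGTF.*.txt row (with or without grep's 'file:' prefix)."""
    fields = line.split()
    coords = [int(value) for value in fields[5:11]]
    # Group the six coordinates into three (start, end) exon pairs.
    exons = list(zip(coords[0::2], coords[1::2]))
    return {"id": fields[0], "gene": fields[2].strip('"'), "exons": exons}


def find_gtf_exon(gtf_start, gtf_end, rows):
    """Return the ids of rows containing an exon given in 1-based GTF coordinates."""
    key = (gtf_start - 1, gtf_end)  # assumption: starts in the MATS files are 0-based
    return [row["id"] for row in rows if key in row["exons"]]


# Two rows copied from the grep output above (whitespace-separated).
SAMPLE = '''fromGTF.SE.txt:5017 "ENSMUSG00000028649" "Macf1" 4 - 123368820 123368832 123367874 123368037 123369804 123369968
fromGTF.SE.txt:5019 "ENSMUSG00000028649" "Macf1" 4 - 123386548 123386557 123385454 123385686 123387227 123387431'''

rows = [parse_event_row(line) for line in SAMPLE.splitlines()]
hits = find_gtf_exon(123367875, 123368037, rows)
print(hits)
```

    An empty list for an exon that is clearly present in the GTF would support the suspicion that the event was never generated, rather than tested and rejected.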
    Thank you for your help.
    Last edited by blancha; 10-29-2014, 07:04 AM.

  • #2
    Apparently, the file produced by grepping for Macf1 in the Ensembl (version 77) GTF file for M. musculus was too big to attach.

    Here are just the first lines, the lines with one of the exons that was not identified as differentially spliced, and the last few lines.

    Code:
    4	ensembl_havana	gene	123349633	123684360	.	-	.	gene_id "ENSMUSG00000028649"; gene_version "12"; gene_name "Macf1"; gene_source "ensembl_havana"; gene_biotype "protein_coding";
    4	ensembl_havana	transcript	123349633	123369931	.	-	.	gene_id "ENSMUSG00000028649"; gene_version "12"; transcript_id "ENSMUST00000123765"; transcript_version "2"; gene_name "Macf1"; gene_source "ensembl_havana"; gene_biotype "protein_coding"; transcript_name "Macf1-003"; transcript_source "ensembl_havana"; transcript_biotype "protein_coding"; tag "cds_start_NF"; tag "mRNA_start_NF"; tss_id "TSS44415"; p_id "P29600";
    4	ensembl_havana	exon	123369805	123369931	.	-	.	gene_id "ENSMUSG00000028649"; gene_version "12"; transcript_id "ENSMUST00000123765"; transcript_version "2"; exon_number "1"; gene_name "Macf1"; gene_source "ensembl_havana"; gene_biotype "protein_coding"; transcript_name "Macf1-003"; transcript_source "ensembl_havana"; transcript_biotype "protein_coding"; exon_id "ENSMUSE00001259025"; exon_version "1"; tag "cds_start_NF"; tag "mRNA_start_NF"; tss_id "TSS44415"; p_id "P29600";
    4	ensembl_havana	CDS	123369805	123369931	.	-	1	gene_id "ENSMUSG00000028649"; gene_version "12"; transcript_id "ENSMUST00000123765"; transcript_version "2"; exon_number "1"; gene_name "Macf1"; gene_source "ensembl_havana"; gene_biotype "protein_coding"; transcript_name "Macf1-003"; transcript_source "ensembl_havana"; transcript_biotype "protein_coding"; protein_id "ENSMUSP00000119600"; protein_version "1"; tag "cds_start_NF"; tag "mRNA_start_NF"; tss_id "TSS44415"; p_id "P29600";
    4	ensembl_havana	exon	123367875	123368037	.	-	.	gene_id "ENSMUSG00000028649"; gene_version "12"; transcript_id "ENSMUST00000123765"; transcript_version "2"; exon_number "2"; gene_name "Macf1"; gene_source "ensembl_havana"; gene_biotype "protein_coding"; transcript_name "Macf1-003"; transcript_source "ensembl_havana"; transcript_biotype "protein_coding"; exon_id "ENSMUSE00001262976"; exon_version "1"; tag "cds_start_NF"; tag "mRNA_start_NF"; tss_id "TSS44415"; p_id "P29600";
    4	ensembl_havana	CDS	123367875	123368037	.	-	0	gene_id "ENSMUSG00000028649"; gene_version "12"; transcript_id "ENSMUST00000123765"; transcript_version "2"; exon_number "2"; gene_name "Macf1"; gene_source "ensembl_havana"; gene_biotype "protein_coding"; transcript_name "Macf1-003"; transcript_source "ensembl_havana"; transcript_biotype "protein_coding"; protein_id "ENSMUSP00000119600"; protein_version "1"; tag "cds_start_NF"; tag "mRNA_start_NF"; tss_id "TSS44415"; p_id "P29600";
    ...
    4	ensembl	exon	123480187	123480322	.	-	.	gene_id "ENSMUSG00000028649"; gene_version "12"; transcript_id "ENSMUST00000097897"; transcript_version "5"; exon_number "35"; gene_name "Macf1"; gene_source "ensembl_havana"; gene_biotype "protein_coding"; transcript_name "Macf1-202"; transcript_source "ensembl"; transcript_biotype "protein_coding"; tag "CCDS"; ccds_id "CCDS57295"; exon_id "ENSMUSE00001046125"; exon_version "1"; tss_id "TSS44416"; p_id "P29601";
    4	ensembl	CDS	123480187	123480322	.	-	1	gene_id "ENSMUSG00000028649"; gene_version "12"; transcript_id "ENSMUST00000097897"; transcript_version "5"; exon_number "35"; gene_name "Macf1"; gene_source "ensembl_havana"; gene_biotype "protein_coding"; transcript_name "Macf1-202"; transcript_source "ensembl"; transcript_biotype "protein_coding"; tag "CCDS"; ccds_id "CCDS57295"; protein_id "ENSMUSP00000095507"; protein_version "4"; tss_id "TSS44416"; p_id "P29601";
    4	ensembl	exon	123471007	123476337	.	-	.	gene_id "ENSMUSG00000028649"; gene_version "12"; transcript_id "ENSMUST00000097897"; transcript_version "5"; exon_number "36"; gene_name "Macf1"; gene_source "ensembl_havana"; gene_biotype "protein_coding"; transcript_name "Macf1-202"; transcript_source "ensembl"; transcript_biotype "protein_coding"; tag "CCDS"; ccds_id "CCDS57295"; exon_id "ENSMUSE00000599570"; exon_version "2"; tss_id "TSS44416"; p_id "P29601";
    4	ensembl	CDS	123471007	123476337	.	-	0	gene_id "ENSMUSG00000028649"; gene_version "12"; transcript_id "ENSMUST00000097897"; transcript_version "5"; exon_number "36"; gene_name "Macf1"; gene_source "ensembl_havana"; gene_biotype "protein_coding"; transcript_name "Macf1-202"; transcript_source "ensembl"; transcript_biotype "protein_coding"; tag "CCDS"; ccds_id "CCDS57295"; protein_id "ENSMUSP00000095507"; protein_version "4"; tss_id "TSS44416"; p_id "P29601";
    ...
    4	havana	exon	123457787	123458011	.	-	.	gene_id "ENSMUSG00000028649"; gene_version "12"; transcript_id "ENSMUST00000147030"; transcript_version "1"; exon_number "38"; gene_name "Macf1"; gene_source "ensembl_havana"; gene_biotype "protein_coding"; transcript_name "Macf1-005"; transcript_source "havana"; transcript_biotype "protein_coding"; exon_id "ENSMUSE00001013689"; exon_version "1"; tag "cds_end_NF"; tag "mRNA_end_NF"; tss_id "TSS44426"; p_id "P29599";
    4	havana	CDS	123457787	123458011	.	-	0	gene_id "ENSMUSG00000028649"; gene_version "12"; transcript_id "ENSMUST00000147030"; transcript_version "1"; exon_number "38"; gene_name "Macf1"; gene_source "ensembl_havana"; gene_biotype "protein_coding"; transcript_name "Macf1-005"; transcript_source "havana"; transcript_biotype "protein_coding"; protein_id "ENSMUSP00000123246"; protein_version "1"; tag "cds_end_NF"; tag "mRNA_end_NF"; tss_id "TSS44426"; p_id "P29599";
    4	havana	exon	123456349	123456749	.	-	.	gene_id "ENSMUSG00000028649"; gene_version "12"; transcript_id "ENSMUST00000147030"; transcript_version "1"; exon_number "39"; gene_name "Macf1"; gene_source "ensembl_havana"; gene_biotype "protein_coding"; transcript_name "Macf1-005"; transcript_source "havana"; transcript_biotype "protein_coding"; exon_id "ENSMUSE00000795608"; exon_version "1"; tag "cds_end_NF"; tag "mRNA_end_NF"; tss_id "TSS44426"; p_id "P29599";
    4	havana	CDS	123456349	123456749	.	-	0	gene_id "ENSMUSG00000028649"; gene_version "12"; transcript_id "ENSMUST00000147030"; transcript_version "1"; exon_number "39"; gene_name "Macf1"; gene_source "ensembl_havana"; gene_biotype "protein_coding"; transcript_name "Macf1-005"; transcript_source "havana"; transcript_biotype "protein_coding"; protein_id "ENSMUSP00000123246"; protein_version "1"; tag "cds_end_NF"; tag "mRNA_end_NF"; tss_id "TSS44426"; p_id "P29599";
    4	havana	UTR	123664560	123664752	.	-	.	gene_id "ENSMUSG00000028649"; gene_version "12"; transcript_id "ENSMUST00000147030"; transcript_version "1"; gene_name "Macf1"; gene_source "ensembl_havana"; gene_biotype "protein_coding"; transcript_name "Macf1-005"; transcript_source "havana"; transcript_biotype "protein_coding"; tag "cds_end_NF"; tag "mRNA_end_NF"; tss_id "TSS44426"; p_id "P29599";
    4	ensembl_havana	transcript	123538880	123564694	.	-	.	gene_id "ENSMUSG00000028649"; gene_version "12"; transcript_id "ENSMUST00000146000"; transcript_version "1"; gene_name "Macf1"; gene_source "ensembl_havana"; gene_biotype "protein_coding"; transcript_name "Macf1-006"; transcript_source "ensembl_havana"; transcript_biotype "processed_transcript"; tss_id "TSS44427";
    4	ensembl_havana	exon	123564463	123564694	.	-	.	gene_id "ENSMUSG00000028649"; gene_version "12"; transcript_id "ENSMUST00000146000"; transcript_version "1"; exon_number "1"; gene_name "Macf1"; gene_source "ensembl_havana"; gene_biotype "protein_coding"; transcript_name "Macf1-006"; transcript_source "ensembl_havana"; transcript_biotype "processed_transcript"; exon_id "ENSMUSE00000778819"; exon_version "1"; tss_id "TSS44427";
    4	ensembl_havana	exon	123554304	123554365	.	-	.	gene_id "ENSMUSG00000028649"; gene_version "12"; transcript_id "ENSMUST00000146000"; transcript_version "1"; exon_number "2"; gene_name "Macf1"; gene_source "ensembl_havana"; gene_biotype "protein_coding"; transcript_name "Macf1-006"; transcript_source "ensembl_havana"; transcript_biotype "processed_transcript"; exon_id "ENSMUSE00001304267"; exon_version "1"; tss_id "TSS44427";
    4	ensembl_havana	exon	123544742	123544831	.	-	.	gene_id "ENSMUSG00000028649"; gene_version "12"; transcript_id "ENSMUST00000146000"; transcript_version "1"; exon_number "3"; gene_name "Macf1"; gene_source "ensembl_havana"; gene_biotype "protein_coding"; transcript_name "Macf1-006"; transcript_source "ensembl_havana"; transcript_biotype "processed_transcript"; exon_id "ENSMUSE00001212477"; exon_version "1"; tss_id "TSS44427";
    4	ensembl_havana	exon	123538880	123539873	.	-	.	gene_id "ENSMUSG00000028649"; gene_version "12"; transcript_id "ENSMUST00000146000"; transcript_version "1"; exon_number "4"; gene_name "Macf1"; gene_source "ensembl_havana"; gene_biotype "protein_coding"; transcript_name "Macf1-006"; transcript_source "ensembl_havana"; transcript_biotype "processed_transcript"; exon_id "ENSMUSE00000800532"; exon_version "1"; tss_id "TSS44427";
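    To compare what the GTF contains with what ends up in ASEvents, the exon lines can be grouped by transcript and shifted to 0-based starts. This is a rough sketch that assumes standard tab-separated GTF columns. The function name is hypothetical, and the sample lines are copies of lines above with most attributes trimmed for brevity.

```python
import re
from collections import defaultdict

TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def exons_by_transcript(gtf_lines):
    """Map transcript_id -> sorted list of (0-based start, end) exon tuples."""
    exons = defaultdict(list)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue  # ignore gene, transcript, CDS and UTR records
        match = TRANSCRIPT_ID.search(fields[8])
        if match:
            # GTF starts are 1-based; subtract 1 to match the ASEvents files.
            exons[match.group(1)].append((int(fields[3]) - 1, int(fields[4])))
    return {tid: sorted(pairs) for tid, pairs in exons.items()}


# Trimmed copies of three lines from the excerpt above.
SAMPLE_GTF = [
    '4\tensembl_havana\texon\t123369805\t123369931\t.\t-\t.\tgene_id "ENSMUSG00000028649"; transcript_id "ENSMUST00000123765"; exon_number "1";',
    '4\tensembl_havana\tCDS\t123369805\t123369931\t.\t-\t1\tgene_id "ENSMUSG00000028649"; transcript_id "ENSMUST00000123765"; exon_number "1";',
    '4\tensembl_havana\texon\t123367875\t123368037\t.\t-\t.\tgene_id "ENSMUSG00000028649"; transcript_id "ENSMUST00000123765"; exon_number "2";',
]

result = exons_by_transcript(SAMPLE_GTF)
print(result)
```

    Run over the full grep output, this gives one exon list per Macf1 transcript, which makes it easier to see which exons are present in some transcripts and absent from others.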
    The question is probably far too specific for anyone on this forum to answer, but this is a Hail Mary.

    I normally use DEXSeq, but there are no replicates in this experiment.
    I tried MISO, but I had a very hard time producing the correct GFF format from the latest (version 77) Ensembl GTF file, so I gave up.

    MATS is simple to run, and the results I verified are good, except for this annoying hiccup.
