09-28-2011, 03:18 PM   #1
Junior Member
Location: Michigan

Join Date: Mar 2011
Posts: 7
Transcript mapping stats from Cuffcompare - Does this look right to you?

Hi all,

I am working on some RNAseq data (single-end reads, 36 bp, from an Illumina instrument) from a prostate cancer cell line. All I have for this is a FASTA file of all the reads.

I aligned the reads with TopHat, assembled them with Cufflinks, and then ran Cuffcompare to look at the quality of the transcriptome reconstruction. This is the profile of transfrags I got.

Category      No. of transfrags    % of total
Match                      1533          1.73
Novel                      3561          4.02
Contained                 24080         27.18
Repeat                        0          0
Intronic                  10115         11.42
Polymerase                 1889          2.13
Intergenic                28752         32.46
Overlap on                14340         16.19
Total                     88580        100
[I just grepped the tmap file to count the number of rows with each class code.]
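For anyone who wants to reproduce this kind of tally without grepping each code separately, here is a minimal sketch in Python. It assumes the .tmap file is tab-separated and has a header row with a column named `class_code`; the function name `class_code_profile` is made up for illustration and is not part of Cuffcompare.

```python
import csv
from collections import Counter


def class_code_profile(lines):
    """Tally transfrags per class code from the lines of a .tmap file.

    Assumption: tab-separated input whose header row names a
    `class_code` column. Returns (code, count, percent) tuples,
    most frequent code first.
    """
    reader = csv.DictReader(lines, delimiter="\t")
    counts = Counter(row["class_code"] for row in reader)
    total = sum(counts.values())
    return [
        (code, n, round(100.0 * n / total, 2))
        for code, n in counts.most_common()
    ]
```

An open file handle can be passed straight in, e.g. `with open(path) as fh: profile = class_code_profile(fh)`, where `path` points at whichever .tmap file Cuffcompare wrote. Counting on the exact column value also avoids a pitfall of plain grep, which can match a class-code letter that happens to appear elsewhere on the line.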

I am new to RNAseq data, so I have no idea what to expect. But I find it surprising that only 1.73% of the total transfrags matched a known transcript, and that over 32% mapped to intergenic regions, even accounting for the fact that it is a cancer cell line and some amount of change is to be expected.

I was hoping someone with experience could take a look at this and give their opinion. Are these kinds of numbers common? Or does this mean the data I got has some problems?

Also, in general, are there any standard quality assurance steps I can use to check RNAseq data?

I would greatly appreciate any help I can get on this.

avi
