SEQanswers

12-17-2009, 08:09 AM   #1
sjm
Member
 
Location: St Louis, MO

Join Date: Nov 2009
Posts: 27
Tophat/Cufflinks/RNASeq short-read aligners and pseudogenes

I've had good success with Tophat and Cufflinks for analysis of RNASeq and computation of expression changes between samples - thanks to all the authors! I'm pretty new to this game (<6 months experience with NGS, aligners and UNIX) but these have been helpful programs.

I have an unresolved issue, though: how do these algorithms assign sequence reads/bundles that map equally well to near-identical regions of the genome (the mouse genome in this case)? This question may apply to all similar alignment software. As an example, the following transcripts (Ensembl nomenclature):

ENSMUST00000022634 (chromosome 14) - the 'authentic' gene transcript
ENSMUST00000101577 (chromosome 12) - a pseudogene, whose transcript may or may not be translated

are ~1 kb transcripts that are virtually identical (just a few mismatched nucleotides between them). The cufflinks transcripts.gtf, .tmap and .tracking files show that different RPKM values are being generated for each of these two transcripts, and the CUFF.xxx bundles assigned to them are numbered differently.

I'm curious whether reads/bundles that match both loci equally well are being randomly assigned to either the chr14 or the chr12 transcript, or whether they are being assigned to both loci.

Thanks for any insight.
01-20-2010, 12:09 AM   #2
lmf_bill
Member
 
Location: New Haven

Join Date: Jul 2008
Posts: 36

I think your question relates to reads with multiple hits, i.e. a single read that aligns equally well to more than one location. It is a hard problem. Some recent papers published in Bioinformatics have proposed approaches to resolve it; you could check them.
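
To see how many reads are affected, one option is to look at the alignment file directly. Below is a minimal Python sketch, not a description of what Tophat or Cufflinks does internally. It assumes the aligner's SAM output carries the optional NH tag (number of reported alignments for the read, as defined in the SAM specification); if the tag is absent, the function simply reports nothing. The function name and the example records are made up for illustration.

```python
def reads_with_multiple_hits(sam_lines):
    """Return {read_name: NH} for mapped reads whose NH tag is greater than 1.

    Assumption: each alignment record carries an optional 'NH:i:<n>' tag.
    Records without the tag are ignored.
    """
    multi = {}
    for line in sam_lines:
        # Skip blank lines and SAM header lines (they start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & 0x4:  # 0x4 = read is unmapped
            continue
        # Optional tags start after the 11 mandatory SAM columns.
        for tag in fields[11:]:
            if tag.startswith("NH:i:"):
                hits = int(tag[len("NH:i:"):])
                if hits > 1:
                    multi[fields[0]] = hits
                break
    return multi


# Made-up records: read1 is reported at two loci, read2 at one.
EXAMPLE_SAM = [
    "@HD\tVN:1.0",
    "read1\t0\tchr14\t100\t3\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:2",
    "read1\t256\tchr12\t500\t3\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:2",
    "read2\t0\tchr14\t200\t50\t4M\t*\t0\t0\tTTGA\tIIII\tNH:i:1",
]

if __name__ == "__main__":
    print(reads_with_multiple_hits(EXAMPLE_SAM))
```

On a real file you could pass an open file handle instead of the list, e.g. `reads_with_multiple_hits(open("accepted_hits.sam"))`, where the file name is only a placeholder. A gene/pseudogene pair like the one described above would show up as read names reported on both chr14 and chr12.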
01-20-2010, 05:09 AM   #3
sjm
Member
 
Location: St Louis, MO

Join Date: Nov 2009
Posts: 27

Thanks for the advice - I'll check recent Bioinformatics issues.

I would still be interested to hear from anyone who has insight into how Tophat/Cufflinks (specifically) deals with this problem... although I realize only the coders might know this.
All times are GMT -8.