SEQanswers

01-24-2012, 05:42 AM   #1
Noa
Member
 
Location: Haifa, Israel

Join Date: Jun 2011
Posts: 62
Cufflinks merging more than one transcript on bacterial genomes

I am running TopHat and Cufflinks on a bacterial genome using Galaxy.

For TopHat, I set the minimal distance between introns to 15 bp and the maximum intron size to 1500 bp. On visual inspection the result looks decent: few splice junctions are identified (I do not expect many introns in my genome), although a few are false ones that seem to connect two different genes. This is the first thing I would like help with: is it worth simply reducing the maximum intron size to nothing? What is the accepted consensus when using TopHat on bacterial genomes?
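One way to look at the suspicious junctions outside the browser is to read the junctions file directly. The following is a minimal sketch, assuming the standard TopHat `junctions.bed` layout (a 12-column BED line with two blocks, where the intron is the gap between the blocks and the score column holds the number of supporting alignments). The function names and the thresholds are hypothetical and chosen only for illustration.

```python
from typing import Iterable, List, NamedTuple


class Junction(NamedTuple):
    chrom: str
    intron_start: int  # 0-based, inclusive
    intron_end: int    # 0-based, exclusive
    strand: str
    reads: int         # BED score column: alignments spanning the junction


def parse_junction(bed_line: str) -> Junction:
    """Parse one 12-column BED line describing a two-block junction."""
    f = bed_line.rstrip("\n").split("\t")
    chrom_start = int(f[1])
    # Block lists may carry a trailing comma, so strip it before splitting.
    block_sizes = [int(x) for x in f[10].strip(",").split(",")]
    block_starts = [int(x) for x in f[11].strip(",").split(",")]
    return Junction(
        chrom=f[0],
        intron_start=chrom_start + block_sizes[0],   # end of the first block
        intron_end=chrom_start + block_starts[1],    # start of the second block
        strand=f[5],
        reads=int(f[4]),
    )


def intron_length(j: Junction) -> int:
    return j.intron_end - j.intron_start


def filter_junctions(lines: Iterable[str], max_intron_bp: int,
                     min_reads: int) -> List[Junction]:
    """Keep junctions that are short enough and supported by enough reads."""
    kept = []
    for line in lines:
        # Skip the "track" header, comments, and empty lines.
        if not line.strip() or line.startswith(("track", "#")):
            continue
        j = parse_junction(line)
        if intron_length(j) <= max_intron_bp and j.reads >= min_reads:
            kept.append(j)
    return kept
```

Calling `filter_junctions(open("junctions.bed"), max_intron_bp=1500, min_reads=2)` would list the junctions that survive both cutoffs. Anything dropped is a candidate false junction to check against the annotation.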

When I look at the second TopHat output file, the accepted hits, all hits align nicely with known genes. However, when I run Cufflinks I run into the following issues. When I use a reference genome, I get, in addition to the known transcripts, a number of very long transcripts spanning very large genomic regions. Also, two genes that lie very near each other but run in opposite directions (which you can see beautifully in the TopHat accepted hits alignments, with different colors for each strand) are merged into a single CUFF identifier. Is there any way I can address this? Am I missing parameters that I need to change because I am working on a bacterial genome?
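To count how often this merging happens, the assembled transcripts can be compared against the known gene annotation. The sketch below is illustrative only: it assumes the transcript and gene coordinates have already been read from the GTF files into simple records, and every name in it (`Feature`, `opposite_strand_merges`, the example identifiers) is hypothetical. It flags only transcripts that overlap annotated genes on both strands, since a bacterial transcript that covers several same-strand genes may be a genuine operon.

```python
from typing import Dict, Iterable, List, NamedTuple


class Feature(NamedTuple):
    name: str
    start: int   # 1-based, inclusive (GTF convention)
    end: int     # 1-based, inclusive
    strand: str  # "+", "-", or "." when unknown


def overlaps(a: Feature, b: Feature) -> bool:
    """True when two closed intervals share at least one base."""
    return a.start <= b.end and b.start <= a.end


def opposite_strand_merges(transcripts: Iterable[Feature],
                           genes: List[Feature]) -> Dict[str, List[str]]:
    """Map each transcript that overlaps genes on both strands to those genes."""
    flagged = {}
    for t in transcripts:
        hits = [g for g in genes if overlaps(t, g)]
        strands = {g.strand for g in hits}
        if "+" in strands and "-" in strands:
            flagged[t.name] = [g.name for g in hits]
    return flagged
```

This sketch assumes a single chromosome or replicon. With several, the overlap test would also need to compare sequence names.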

Many thanks

Noa
01-24-2012, 07:27 AM   #2
polyatail
Member
 
Location: New York, NY

Join Date: Dec 2010
Posts: 25

I've never tried these tools on bacterial RNA-seq, but it's my understanding that they were designed with eukaryotes in mind. TopHat, for example, aligns reads across splice junctions, which are presumably absent in prokaryotes. Cufflinks assembles multiple splice isoforms, which won't be present in a prokaryote.

Without RNA-seq, finding new genes in a bacterial genome can be accomplished using the RAST or IMG/ER annotation services. Perhaps you can convert this annotation to a GTF, align your reads with bowtie, then use Cufflinks (or even a short script) to generate RPKM for each gene?
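The "short script" route could look roughly like the sketch below. It assumes per-gene read counts and gene lengths are already available as dictionaries (for example, from counting bowtie alignments against the GTF), and it uses the usual RPKM definition: reads per kilobase of gene per million mapped reads. The function names are hypothetical, and using the sum of the per-gene counts as the library size is an assumption of this sketch. The total number of mapped reads could be passed in instead.

```python
from typing import Dict


def rpkm(read_count: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of gene per million mapped reads."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    # count / (length / 1e3) / (total / 1e6) == count * 1e9 / (length * total)
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def rpkm_table(counts: Dict[str, int], lengths: Dict[str, int]) -> Dict[str, float]:
    """Compute RPKM for every gene, using the summed counts as library size."""
    total = sum(counts.values())
    return {gene: rpkm(n, lengths[gene], total) for gene, n in counts.items()}
```

For instance, `rpkm(100, 1000, 1_000_000)` gives 100.0: one hundred reads on a 1 kb gene in a library of one million mapped reads.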

It sounds like you have a stranded bacterial RNA-seq dataset. Out of curiosity, could you elaborate a bit on how the RNA and library were prepared?
01-24-2012, 12:33 PM   #3
Noa
Member
 
Location: Haifa, Israel

Join Date: Jun 2011
Posts: 62

Hi, thanks for the reply.
Maybe I should have given more of an intro: I am trying to develop a pipeline in Galaxy that lets non-bioinformaticists do very basic analyses of RNA-Seq data (FPKM comparison of different experimental conditions, etc.). I am a wet-lab biologist myself, so I am trying to stay away from Linux. I know the Tuxedo suite is eukaryote-oriented, but I was hoping to use it since Galaxy is so user-friendly. If I understand correctly, TopHat aligns the reads with bowtie first in any case, and Cufflinks will give me the FPKMs.

The dataset I am using is actually just a test set from a friend. It is in fact stranded, and was created by polyA tailing of the bacterial RNA, followed by fragmentation, treatment with Antarctic phosphatase and PNK, adapter ligation, and then RT and PCR. The protocol also works beautifully on eukaryotes in my hands.
Tags
bacteria, cufflinks, galaxy, rna-seq, tophat

All times are GMT -8.