SEQanswers

Old 09-24-2014, 04:45 AM   #1
illinu
Member
 
Location: US

Join Date: Jul 2013
Posts: 55
Preparing reference transcriptome: to or not to megablast

A quick question here. I am constructing a reference transcriptome for DE out of 16 transcriptomes (8 control + 8 treatment) and my strategy is as follows:
1. create a database with transcriptome 1 (random)
2. blast transcriptome 2 to 1
3. start creating the reference transcriptome = transcriptome 1 + unmatched transcripts from transcriptome 2
4. create a database with the reference transcriptome
5. blast transcriptome 3 to ref transcr.
6. as in point 3 and so on

At the end I will have a reference transcriptome with all the transcripts present in the 16 transcriptomes represented once.
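The incremental strategy above can be sketched in Python. This is a minimal sketch, not a tested pipeline: it assumes each BLAST run was saved in tabular format (`-outfmt 6`, where the first column is the query ID), and the function names `matched_queries` and `extend_reference` as well as the toy IDs and sequences are hypothetical.

```python
def matched_queries(blast_tabular_text):
    """Return the set of query IDs that have at least one hit.

    Assumes BLAST tabular output (-outfmt 6): tab-separated, query ID first.
    """
    hits = set()
    for line in blast_tabular_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        hits.add(line.split("\t")[0])
    return hits


def extend_reference(reference, new_transcripts, blast_tabular_text):
    """Add to the reference every new transcript that had no BLAST hit.

    reference, new_transcripts: dicts mapping transcript ID -> sequence.
    Returns a new dict; the inputs are left unchanged.
    """
    matched = matched_queries(blast_tabular_text)
    extended = dict(reference)
    for name, seq in new_transcripts.items():
        if name not in matched:
            extended[name] = seq
    return extended


# Toy example: t2_a hit something in the reference, t2_b did not.
reference = {"t1_a": "ACGTACGT", "t1_b": "GGGGCCCC"}
transcriptome_2 = {"t2_a": "ACGTACGA", "t2_b": "TTTTAAAA"}
blast_output = "t2_a\tt1_a\t87.5\t8\t1\t0\t1\t8\t1\t8\t0.001\t12.0\n"

reference = extend_reference(reference, transcriptome_2, blast_output)
print(sorted(reference))
```

Treating "any hit" as a match is the loosest possible criterion; a real run would probably also filter on identity and alignment coverage before deciding that a transcript is already represented.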

My question is whether or not to use -task megablast.
If I use megablast, the reference transcriptome comes out almost double in size, so I suspect it contains some redundancy.

What are your thoughts...
Thank you
Old 09-24-2014, 01:03 PM   #2
bastianwur
Member
 
Location: Germany/Netherlands

Join Date: Feb 2014
Posts: 98

If your computing resources allow it: cross-assemble the transcriptome.
That means you pool all the RNA-seq samples together and assemble them as one.
That doesn't produce many mis-assemblies, and makes dealing with the reference a lot easier. (I'd give a reference here, but I think the paper has only just been submitted.)

Else: Megablast will be too strict for that purpose, I think. I'd rather use normal blastn instead.
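For context: in BLAST+, the `blastn` program runs the `megablast` task by default (word size 28), while `-task blastn` uses a shorter word size (11) and therefore finds more divergent matches. With the stricter task, fewer transcripts find a match, so more of them get added as "unmatched", which would explain a larger reference. Below is a hedged sketch of how the two kinds of run could be set up from Python. The file names, the database name, and the e-value cutoff are placeholders, not values from this thread, and the helper names are hypothetical.

```python
import subprocess


def makeblastdb_cmd(fasta, db_name):
    """Command to build a nucleotide BLAST database from a FASTA file."""
    return ["makeblastdb", "-in", fasta, "-dbtype", "nucl", "-out", db_name]


def blastn_cmd(query, db_name, out_path, task="blastn", evalue="1e-5"):
    """Command for a nucleotide search with tabular output (-outfmt 6).

    task: "blastn" (more sensitive) or "megablast" (stricter, faster).
    The e-value cutoff is an arbitrary placeholder.
    """
    if task not in ("blastn", "megablast", "dc-megablast"):
        raise ValueError(f"unsupported task: {task}")
    return [
        "blastn", "-task", task,
        "-query", query, "-db", db_name,
        "-evalue", evalue, "-outfmt", "6", "-out", out_path,
    ]


def run(cmd):
    """Run a command; requires BLAST+ on PATH. Raises on a non-zero exit."""
    subprocess.run(cmd, check=True)


# Printing instead of running, so the sketch works without BLAST+ installed.
print(" ".join(makeblastdb_cmd("reference.fasta", "reference_db")))
print(" ".join(blastn_cmd("transcriptome_2.fasta", "reference_db", "hits.tsv")))
```

Switching `task` between `"blastn"` and `"megablast"` on the same input and counting the unmatched transcripts would show directly how much of the size difference comes from the task choice.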

Last edited by bastianwur; 09-24-2014 at 01:13 PM.
Old 09-25-2014, 12:09 AM   #3
illinu

Quote:
Else: Megablast will be too strict for that purpose, I think. I'd rather use normal blastn instead.
This is exactly what I thought, too strict. So I did blastn finally.

I didn't think about pooling all samples because the 8 control + 8 treatment are not all the same genotype but 4 different ones and among them 2 different ecotypes. Wouldn't cross assembly in this case create falsely larger transcripts? Or chimeric genes?
Old 09-25-2014, 05:30 AM   #4
westerman
Rick Westerman
 
Location: Purdue University, Indiana, USA

Join Date: Jun 2008
Posts: 1,104

For combining datasets you could use Brian Bushnell's 'dedupe.sh' program.
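To illustrate the general idea of deduplicating pooled assemblies, here is a toy Python sketch. It is not dedupe.sh's actual algorithm or interface; it only drops exact duplicates and sequences fully contained in a longer one (on either strand), and the function names and sequences are made up for illustration.

```python
def revcomp(seq):
    """Reverse complement of an uppercase DNA sequence (N is kept as N)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq))


def drop_duplicates_and_contained(records):
    """Keep only sequences not contained in an already kept, longer one.

    records: dict mapping transcript ID -> sequence.
    Sequences are visited longest first (ties broken by ID), so a shorter
    sequence is dropped if it, or its reverse complement, occurs inside a
    kept sequence. Exact matching only: no mismatches are tolerated.
    """
    kept = {}
    ordered = sorted(records.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    for name, seq in ordered:
        seq = seq.upper()
        rc = revcomp(seq)
        if any(seq in other or rc in other for other in kept.values()):
            continue
        kept[name] = seq
    return kept


pooled = {
    "a": "ACGTACGTAA",
    "b": "ACGTACGTAA",  # exact duplicate of a
    "c": "CGTACG",      # contained in a
    "d": "TTTTGGGG",    # unrelated
}
print(sorted(drop_duplicates_and_contained(pooled)))
```

Because matching here is exact, allelic variants from different genotypes would all be kept; a dedicated tool with an identity threshold would be needed to collapse those.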

Since you do have different genotypes, I agree that pooling all samples and re-assembling the transcriptome would cause too many false transcripts.

But it sounds like you have already solved your problem. I would be concerned about commutative effects. In other words is starting with sample 1 and getting 1 + 2 + 3 the same as starting with sample 3 and getting 3 + 1 + 2?
Old 09-25-2014, 06:51 AM   #5
illinu

Quote:
Originally Posted by westerman
I would be concerned about commutative effects. In other words is starting with sample 1 and getting 1 + 2 + 3 the same as starting with sample 3 and getting 3 + 1 + 2?
Good point... I don't know. I have been looking at how other people do DE with independently assembled transcriptomes, no reference genome, and different genotypes, and the range of approaches is so wide that I am getting confused. It seems few researchers have adopted my strategy; in fact, I haven't found a single reference for it. I see people annotating the de novo assemblies independently, then mapping their reads to the annotated transcriptomes, and then comparing the lists. But this seems to cause the problem of the same transcript being annotated differently.
My plan was to generate this reference transcriptome (one that suits my data set), annotate it, and then map the reads of each genotype independently to the reference. I think with this approach I will be able to compare the same genes side by side, without having to deal with the same transcript possibly being annotated differently in two lists (or 8).

I will think about the commutative effect. How can I test this?
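One way to test for an order effect is to build the reference under several input orders and compare the results. The sketch below does this on toy data. It is only an illustration under loud assumptions: `is_match` is a crude k-mer stand-in for a BLAST hit, not BLAST itself, and all names, sequences, and thresholds are invented.

```python
from itertools import permutations


def kmers(seq, k=4):
    """Set of all k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def is_match(query, subject, k=4, min_shared=0.5):
    """Toy stand-in for a BLAST hit (an assumption, not real BLAST):
    true if at least min_shared of the query's k-mers occur in the subject."""
    q = kmers(query, k)
    return bool(q) and len(q & kmers(subject, k)) / len(q) >= min_shared


def build_reference(transcriptomes, order):
    """Incremental merge: add each sample's transcripts that match nothing
    in the reference as it stood before that sample was processed."""
    reference = {}
    for idx in order:
        current = list(reference.values())
        for name, seq in transcriptomes[idx].items():
            if not any(is_match(seq, ref) for ref in current):
                reference[name] = seq
    return reference


# Sample 1 holds a fragment of the transcript found in sample 0.
transcriptomes = [
    {"long": "ACGTTGCAAGGCTTAACCGGTA"},
    {"short": "ACGTTGCA"},
]

results = {
    order: frozenset(build_reference(transcriptomes, order))
    for order in permutations(range(len(transcriptomes)))
}
for order, names in results.items():
    print(order, sorted(names))

order_matters = len(set(results.values())) > 1
print("order matters:", order_matters)
```

In this toy case the short fragment is absorbed when the long transcript comes first, but both are kept when the fragment comes first, so the result depends on the order. With 16 real assemblies, trying every permutation is not feasible, but repeating the merge with a handful of shuffled orders and comparing transcript counts and total length should give a rough idea of how large the effect is.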
Old 09-25-2014, 07:10 AM   #6
bastianwur

Quote:
Originally Posted by illinu
This is exactly what I thought, too strict. So I did blastn finally.

I didn't think about pooling all samples because the 8 control + 8 treatment are not all the same genotype but 4 different ones and among them 2 different ecotypes. Wouldn't cross assembly in this case create falsely larger transcripts? Or chimeric genes?
AFAIK not in a considerable amount, but then again that has only been tested for prokaryotes (I'd assume higher error rate for eukaryotes), and not sure with what data you're dealing with.
Old 09-25-2014, 07:19 AM   #7
illinu

Quote:
Originally Posted by bastianwur
not sure with what data you're dealing with.
It is a eukaryote, diploid and heterozygous... there could be alternative splicing due to the treatment (?)... not sure about pooling
Tags
megablast, reference transcriptome, rnaseq dge
