SEQanswers

Old 10-26-2014, 04:21 PM   #1
freestile
Member
 
Location: Chile

Join Date: Aug 2014
Posts: 11
Genome assembly and RNA-seq mapping

Hi everyone, this is my first post here, although I have been following the forum for many months. Does anyone have experience with genome assembly followed by mapping RNA-seq data to the assembly?
I used Velvet, Celera and CLC to make de novo assemblies of a relatively difficult bacterium, and I get relatively good statistics with Velvet and Celera (checked against published draft genomes). I then checked the assemblies by mapping RNA-seq data from the same organism, and I get low mapping percentages (near 50%), but with the CLC assembly this goes up to 90%. I tried to optimise the CLC assembly, but if I reduce the data or make the trimming more stringent, the assembly quality decreases.

I use 250 bp PE reads from the MiSeq platform.

Best regards!
Cristian.
Old 10-31-2014, 06:36 AM   #2
SylvainL
Senior Member
 
Location: Geneva

Join Date: Feb 2012
Posts: 174
Default

Hi,

For bacteria, I was using Edena (http://www.genomic.ch/edena.php) with good results.

Maybe you could give it a try...

Last edited by SylvainL; 10-31-2014 at 06:38 AM.
Old 11-03-2014, 01:36 PM   #3
freestile
Member
 
Location: Chile

Join Date: Aug 2014
Posts: 11
Default

Thanks for the reply. I will try it. I obtain a high RNA-seq read mapping rate with the A5 pipeline, but I can't improve the genome assembly statistics.
Old 11-03-2014, 05:27 PM   #4
freestile
Member
 
Location: Chile

Join Date: Aug 2014
Posts: 11
Default

I tried Edena, but I can't get my reads to the same length. If you can help me, I would appreciate it.
Old 11-04-2014, 02:03 AM   #5
SylvainL
Senior Member
 
Location: Geneva

Join Date: Feb 2012
Posts: 174
Default

Hi,

What do you mean by not getting the same length for your reads? In my case, when the Q-scores were bad at the end of the reads, I had to trim them... You can easily do this step with the FASTX-Toolkit (fastx_trimmer in this case) or seqtk (trimfq)...

Let me know at which step you are stuck...
Old 11-04-2014, 02:35 AM   #6
freestile
Member
 
Location: Chile

Join Date: Aug 2014
Posts: 11
Default

The error appears when Edena starts:

Rapid file(s) examination... 158 220
[err] All reads within a file must be the same length.

I did the pre-processing with BBMap. Maybe the problem is the paired-end data?
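
One way to confirm what Edena is complaining about is to tally the read lengths in each FASTQ file after pre-processing. Below is a minimal Python sketch, assuming plain four-line FASTQ records (no wrapped sequences). The function names are illustrative and are not part of Edena or BBMap, and the demo reads are made up.

```python
from collections import Counter
from itertools import islice


def fastq_records(lines):
    """Yield (header, sequence, quality) from four-line FASTQ records."""
    it = iter(lines)
    while True:
        chunk = [line.rstrip("\n") for line in islice(it, 4)]
        if len(chunk) < 4:
            # End of input (or a truncated final record): stop.
            return
        header, seq, _plus, qual = chunk
        yield header, seq, qual


def length_histogram(lines):
    """Count how many reads have each sequence length."""
    return Counter(len(seq) for _, seq, _ in fastq_records(lines))


# Made-up demo input; in practice pass an open file handle instead.
demo = [
    "@read1\n", "ACGTACGT\n", "+\n", "IIIIIIII\n",
    "@read2\n", "ACGTAC\n", "+\n", "IIIIII\n",
    "@read3\n", "ACGTACGT\n", "+\n", "IIIIIIII\n",
]
histogram = length_histogram(demo)
for length, count in sorted(histogram.items()):
    print(length, count)
```

If the histogram has more than one entry, the file contains reads of mixed length, which is the condition the error message describes.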
Old 11-05-2014, 04:35 PM   #7
fahmida
Member
 
Location: Australia

Join Date: Aug 2010
Posts: 54
Default

I am also having the same error "[err] All reads within a file must be the same length". I am not sure I understand the rationale behind this. Once the raw reads e.g. 100 or 150nt length are trimmed for quality, adapters etc., read length becomes variable.
Old 11-05-2014, 05:23 PM   #8
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,574
Default

For bacterial assemblies SPAdes (http://bioinf.spbau.ru/spades) should be in your list of programs to try.
Old 11-05-2014, 06:12 PM   #9
freestile
Member
 
Location: Chile

Join Date: Aug 2014
Posts: 11
Default

I tried SPAdes, and I really can't obtain a good assembly with this specific data set (I tried it with data from other bacteria and got good results). This is the main reason why I am trying other assemblers.
Old 11-05-2014, 11:39 PM   #10
SylvainL
Senior Member
 
Location: Geneva

Join Date: Feb 2012
Posts: 174
Default

Quote:
Originally Posted by fahmida View Post
I am also having the same error "[err] All reads within a file must be the same length". I am not sure I understand the rationale behind this. Once the raw reads e.g. 100 or 150nt length are trimmed for quality, adapters etc., read length becomes variable.
In this case, I would advise you to look at the minimum length of your reads and trim all of them to this minimum length or, depending on your FastQC report, I would go without adapter trimming... It is really worth trying Edena. As an example, for a total de novo assembly of Staphylococcus aureus, I got 12 contigs (which stopped because of the rRNA operons).
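
The trimming suggestion above can be sketched in Python as follows. This is only an illustration under stated assumptions: reads are already parsed into (header, sequence, quality) tuples, the function name is hypothetical, and the demo reads are made up. Reads shorter than the chosen length are dropped, so for paired-end data the two files would need to be re-synchronised afterwards, and trimming everything to the absolute minimum may throw away a lot of sequence if a few reads are very short.

```python
def trim_to_length(records, length):
    """Yield reads cut to exactly `length` bases; drop reads that are shorter."""
    for header, seq, qual in records:
        if len(seq) >= length:
            # Trim sequence and quality string together so they stay in sync.
            yield header, seq[:length], qual[:length]


# Made-up demo reads as (header, sequence, quality) tuples.
reads = [
    ("@r1", "ACGTACGT", "IIIIIIII"),
    ("@r2", "ACGTAC", "IIIIII"),
    ("@r3", "ACG", "III"),
]

# Shortest read in the set; trimming to this value keeps every read.
shortest = min(len(seq) for _, seq, _ in reads)

# Choosing a longer target keeps more bases per read but drops short reads.
uniform = list(trim_to_length(reads, 6))
for header, seq, qual in uniform:
    print(header, seq, qual)
```

Picking the target length from the read-length distribution, rather than from the single shortest read, is a trade-off between keeping reads and keeping bases.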

s.

Last edited by SylvainL; 11-05-2014 at 11:41 PM.
Old 11-09-2014, 12:25 PM   #11
freestile
Member
 
Location: Chile

Join Date: Aug 2014
Posts: 11
Default

Thanks for your help. I finally ran Edena, but I can't improve the RNA-seq mapping (~50%) compared with CLC, A5 and SPAdes (~90%). I think the next step is to choose an assembly with relatively good statistics and at least ~90% RNA-seq mapping.

Regards.
Old 11-10-2014, 02:32 AM   #12
SylvainL
Senior Member
 
Location: Geneva

Join Date: Feb 2012
Posts: 174
Default

When you say your RNA-seq mapping is low, do you mean that only 50% of your RNA-seq reads map to your assembly? If yes, did you try a BLAST search on some unmapped reads to see if they really come from your bacterium (or a close relative)?
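
As a rough illustration of this check, the Python sketch below computes the mapped fraction from SAM records and collects a few unmapped reads that could be pasted into BLAST. It assumes standard tab-separated SAM lines and uses the FLAG bits from the SAM specification (0x4 unmapped, 0x100 secondary, 0x800 supplementary). The function name, the `limit` parameter and the demo records are made up for the example.

```python
def mapping_summary(sam_lines, limit=100):
    """Return (total, mapped, unmapped_sample) over primary SAM records."""
    total = mapped = 0
    unmapped_sample = []
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        if flag & (0x100 | 0x800):
            continue  # skip secondary/supplementary so each read counts once
        total += 1
        if flag & 0x4:
            if len(unmapped_sample) < limit:
                unmapped_sample.append((fields[0], fields[9]))
        else:
            mapped += 1
    return total, mapped, unmapped_sample


# Made-up demo records: two mapped, one unmapped, one secondary alignment.
demo_sam = [
    "@SQ\tSN:contig1\tLN:1000\n",
    "r1\t0\tcontig1\t10\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
    "r2\t16\tcontig1\t50\t60\t4M\t*\t0\t0\tTTGA\tIIII\n",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tGGGG\tIIII\n",
    "r1\t256\tcontig1\t300\t0\t4M\t*\t0\t0\tACGT\tIIII\n",
]
total, mapped, sample = mapping_summary(demo_sam)
percent_mapped = 100.0 * mapped / total
print(f"{mapped}/{total} primary reads mapped ({percent_mapped:.1f}%)")

# Write the unmapped sample as FASTA text for a BLAST search.
fasta = "".join(f">{name}\n{seq}\n" for name, seq in sample)
print(fasta, end="")
```

Counting only primary records keeps multi-mapping reads from inflating the total, which matters when comparing percentages between assemblies.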

Another question: after mapping the RNA-seq reads to your assembly, do you have some regions which are not covered at all?
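
A possible way to answer this is to scan per-base depth and report runs of zero coverage. The sketch below assumes the depth is available as (contig, 1-based position, depth) rows that include zero-depth positions, for example parsed from the three-column output of `samtools depth -a`. The function name, the `min_depth` parameter and the demo rows are hypothetical.

```python
def uncovered_regions(depth_rows, min_depth=1):
    """Return (contig, start, end) runs where depth is below min_depth."""
    regions = []
    current = None  # [contig, start, end] of the run being extended
    for contig, pos, depth in depth_rows:
        if depth < min_depth:
            if current and current[0] == contig and current[2] == pos - 1:
                current[2] = pos  # extend the current run by one base
            else:
                if current:
                    regions.append(tuple(current))
                current = [contig, pos, pos]  # start a new run
        elif current:
            regions.append(tuple(current))
            current = None
    if current:
        regions.append(tuple(current))
    return regions


# Made-up demo depth rows for two contigs.
rows = [
    ("c1", 1, 5), ("c1", 2, 0), ("c1", 3, 0), ("c1", 4, 7), ("c1", 5, 0),
    ("c2", 1, 0), ("c2", 2, 3),
]
gaps = uncovered_regions(rows)
for contig, start, end in gaps:
    print(contig, start, end)
```

Some uncovered regions are expected in any assembly, since not every gene is expressed under one condition, so the interesting comparison is how the gaps differ between assemblies.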

Were these RNA-seq libraries prepared with rRNA depletion? Do you have rRNA operons in your assembly?
Old 11-11-2014, 09:25 AM   #13
freestile
Member
 
Location: Chile

Join Date: Aug 2014
Posts: 11
Default

Yes, only 50%. I ran BLAST on some unmapped reads and they correspond to bacterial genes.

I still have to check the second question.

And yes, I did rRNA depletion during library preparation.

Regards!

Tags
assembly, genome, mapping, rna-seq, velvet
