SEQanswers

09-30-2019, 11:29 AM   #1
jpummil
Member
 
Location: Fayetteville, AR

Join Date: Apr 2014
Posts: 85
Using Pilon after MaSuRCA Assembly

So, there are lots of good threads on using Pilon, but my question is specifically about using Pilon after a MaSuRCA assembly. The MaSuRCA documentation specifically states that the raw, unmodified reads should be used without any sort of manipulation:

"IMPORTANT! Do not use third party tools to pre-process the Illumina data before providing it to MaSuRCA, unless you are absolutely sure you know exactly what the preprocessing tool does. Do not do any trimming, cleaning or error correction. This will likely deteriorate the assembly."

Again, no problem for the assembly portion. But when I move to the Pilon polishing step(s), do I want to use the raw reads as they are for polishing? It seems that, depending on how Pilon interprets that data, it might introduce some of the low-quality aspects back into the assembly. Or does mapping the raw reads to the assembly to create the .bam resolve any such possibilities?
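For reference, the mapping-then-polishing step being asked about can be sketched as below. This is only a minimal sketch that builds the shell commands for one Pilon round, not a recommended pipeline. The file names, the thread count, the `-Xmx16G` heap size, and the location of `pilon.jar` are all assumptions for illustration; the helper `polishing_commands` is hypothetical. Whichever reads are chosen (raw or trimmed) go in as `reads_1` and `reads_2`.

```python
import shlex


def polishing_commands(assembly, reads_1, reads_2, prefix="round1",
                       threads=8, pilon_jar="pilon.jar"):
    """Return the shell commands for one round of Pilon polishing.

    All paths are placeholders (assumptions); nothing is executed here.
    """
    asm = shlex.quote(assembly)
    r1 = shlex.quote(reads_1)
    r2 = shlex.quote(reads_2)
    bam = shlex.quote(f"{prefix}.sorted.bam")
    out = shlex.quote(f"{prefix}.pilon")
    jar = shlex.quote(pilon_jar)
    return [
        # Index the assembly so BWA can map against it.
        f"bwa index {asm}",
        # Map the paired reads and write a coordinate-sorted BAM.
        f"bwa mem -t {threads} {asm} {r1} {r2} | "
        f"samtools sort -@ {threads} -o {bam} -",
        # Pilon needs the BAM to be indexed.
        f"samtools index {bam}",
        # Polish; --frags is Pilon's input for paired-end fragment BAMs.
        f"java -Xmx16G -jar {jar} --genome {asm} --frags {bam} "
        f"--output {out} --changes",
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for cmd in polishing_commands("assembly.fasta",
                                  "reads_R1.fastq.gz",
                                  "reads_R2.fastq.gz"):
        print(cmd)
```

Printing the commands rather than running them makes it easy to review each step, or to swap in a different read set and compare the polished outputs.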
10-02-2019, 02:53 AM   #2
Gopo
Member
 
Location: Louisiana

Join Date: Nov 2013
Posts: 38

Personally, I would do quality and adapter trimming (and maybe error correction) of the raw Illumina reads, since from what I understand MaSuRCA pretty much does this internally, and then use the quality- and adapter-trimmed reads for polishing with Pilon.
10-02-2019, 06:23 AM   #3
jpummil
Member
 
Location: Fayetteville, AR

Join Date: Apr 2014
Posts: 85

I'm leaning that way for sure. I dug around a bit in the MaSuRCA documentation and located the "corrected" PE data created during assembly:

pe.cor.fa - error corrected PE reads. The ordering of the reads is arbitrary, but the pairs are guaranteed to appear together. No quality scores

So, the file looks more like merged PE data and is now a .fasta, though I don't know whether Pilon uses the quality scores or not...

I figure it's worth a try anyway. Maybe I'll compare it with the raw PE data and see which looks better in a QUAST analysis.

Last edited by jpummil; 10-02-2019 at 06:27 AM.
10-03-2019, 09:09 AM   #4
jpummil
Member
 
Location: Fayetteville, AR

Join Date: Apr 2014
Posts: 85

Haha! Well, THAT didn't work! BWA did OK with the indexing, etc., but Pilon quickly bailed:

Suppressed: java.lang.IllegalStateException: Inappropriate call if not paired read

I'll go back and manually quality-trim and error-correct the raw files.
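The wording of that exception suggests Pilon was handed alignments that are not flagged as paired, which is what you would get if the single pe.cor.fa file was mapped as single-end reads. That is an inference from the error text, not something confirmed in this thread. A quick way to check is to look at the SAM FLAG field (column 2), where bit 0x1 marks a read as paired. The sketch below is a minimal example under that assumption; the function name `paired_fraction` and the toy records are made up for illustration.

```python
def paired_fraction(sam_lines):
    """Return the fraction of alignment records whose FLAG has bit 0x1 set.

    `sam_lines` is any iterable of SAM text lines, for example the output
    of `samtools view -h aln.bam`. Header lines (starting with '@') and
    blank lines are skipped.
    """
    total = 0
    paired = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])  # column 2 of SAM is FLAG
        total += 1
        if flag & 0x1:  # 0x1 = template has multiple segments (paired)
            paired += 1
    return paired / total if total else 0.0


if __name__ == "__main__":
    # Toy records (made up): two paired (flags 99, 147), two unpaired (0, 16).
    toy = [
        "@HD\tVN:1.6\tSO:coordinate",
        "r1\t99\tctg1\t10\t60\t5M\t=\t50\t45\tACGTA\t*",
        "r1\t147\tctg1\t50\t60\t5M\t=\t10\t-45\tTACGT\t*",
        "r2\t0\tctg1\t70\t60\t5M\t*\t0\t0\tGGGTA\t*",
        "r3\t16\tctg1\t90\t60\t5M\t*\t0\t0\tCCATG\t*",
    ]
    print(paired_fraction(toy))
```

If the fraction comes out as zero, the BAM holds single-end alignments. In that case either remap the reads as proper pairs before passing the BAM to `--frags`, or pass it through Pilon's separate `--unpaired` input instead.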
All times are GMT -8.

