SEQanswers

Old 01-14-2010, 03:45 AM   #1
coldturkey
Member
 
Location: Brisbane

Join Date: Nov 2008
Posts: 51
Default Using solexa to correct 454 homopolymer errors

Hello All,

I am currently attempting to resolve homopolymer and other errors in my 454 assembly using Solexa data.
If anyone has suggestions for the best tools and/or pipelines to use for this, I would greatly appreciate the input.

regards

Brian
Old 01-14-2010, 05:16 AM   #2
nickloman
Senior Member
 
Location: Birmingham, UK

Join Date: Jul 2009
Posts: 356
Default

One tip is to simply use a short-read aligner (MAQ, Bowtie, BWA, Novoalign etc.) to align the Solexa reads against your 454 assembly. How helpful this is depends on the aligner settings, but you should be able to find SNPs or indels around incorrectly called homopolymers. Or just view the alignment in regions where you have homopolymers of a certain length. This relies on paired-end Solexa data or Solexa fragments of perhaps 50 or 75 bases; I found it didn't work well with fragment libraries of 36 bases.
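
To decide which regions of the alignment to look at, it can help to list the homopolymer runs in each contig first. The following is only a minimal sketch: the function name and the default minimum length of 5 are assumptions made for illustration, not something taken from any of the aligners mentioned.

```python
import re


def homopolymer_runs(seq, min_len=5):
    """Yield (start, end, base) for every run of a single base of at least min_len.

    Coordinates are 1-based and inclusive. The sequence is upper-cased first,
    so soft-masked (lowercase) bases are treated like any other base.
    """
    # One base followed by at least (min_len - 1) copies of the same base.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    for match in pattern.finditer(seq.upper()):
        yield match.start() + 1, match.end(), match.group(1)


if __name__ == "__main__":
    # Hypothetical toy contig, just to show the output format.
    for start, end, base in homopolymer_runs("ACGTTTTTTACAAAAAG"):
        print("%d-%d\t%s x %d" % (start, end, base, end - start + 1))
```

The reported coordinates can then be used to jump to those positions in whatever alignment viewer you prefer.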

Alternatively you could try a hybrid assembly in MIRA.

Let us know how you get on.
Old 01-19-2010, 08:54 PM   #3
Torst
Senior Member
 
Location: The University of Melbourne, AUSTRALIA

Join Date: Apr 2008
Posts: 275
Default

Quote:
Originally Posted by coldturkey View Post
I am currently attempting to resolve homopolymer and other errors in my 454 assembly using Solexa data.
If anyone has suggestions for the best tools and/or pipelines to use for this, I would greatly appreciate the input.
Nesoni can do this for you. Feed it your 454 contigs as the "reference" and the Illumina reads as the "reads". Run "nesoni shrimp" then "nesoni consensus". The working folder will contain "reference_consensus.fa" or similar which is effectively the "corrected" 454 contigs.

Download site: http://www.vicbioinformatics.com/nesoni.shtml
Old 08-12-2010, 01:28 PM   #4
glacerda
Member
 
Location: Brazil

Join Date: Aug 2008
Posts: 27
Default

removed...

Last edited by glacerda; 08-12-2010 at 01:42 PM. Reason: forgot to quote
Old 08-12-2010, 01:43 PM   #5
glacerda
Member
 
Location: Brazil

Join Date: Aug 2008
Posts: 27
Default nesoni consensus error

Quote:
Originally Posted by Torst View Post
Nesoni can do this for you. Feed it your 454 contigs as the "reference" and the Illumina reads as the "reads". Run "nesoni shrimp" then "nesoni consensus". The working folder will contain "reference_consensus.fa" or similar which is effectively the "corrected" 454 contigs.

Download site: http://www.vicbioinformatics.com/nesoni.shtml

Hi Torst, this tool looks very interesting.
I have tried to run "nesoni consensus" and it prints the following error messages. Do you know what is happening? It looks like a module called statistics is missing from the nesoni package. Do you know how to solve that? Thank you very much.

Traceback (most recent call last):
  File "/usr/bin/nesoni", line 15, in <module>
    sys.exit(nesoni.main(sys.argv[1:]))
  File "/usr/lib/python2.6/site-packages/nesoni/__init__.py", line 155, in main
    plot
  File "/usr/lib/python2.6/site-packages/nesoni/grace.py", line 137, in execute
    commands[args[start]](args[start+1:end])
  File "/usr/lib/python2.6/site-packages/nesoni/__init__.py", line 80, in consensus
    grace.load('consensus').main(args)
  File "/usr/lib/python2.6/site-packages/nesoni/grace.py", line 22, in load
    m = __import__(module_name, globals())
  File "/usr/lib64/python2.6/site-packages/pyximport/pyximport.py", line 328, in load_module
    self.pyxbuild_dir)
  File "/usr/lib64/python2.6/site-packages/pyximport/pyximport.py", line 181, in load_module
    mod = imp.load_dynamic(name, so_path)
  File "consensus.pyx", line 11, in init nesoni.consensus (/root/.pyxbld/temp.linux-x86_64-2.6/pyrex/consensus.c:24117)
    from nesoni import io, grace, shrimp, statistics
ImportError: Building module failed: ["AttributeError: 'module' object has no attribute 'statistics'\n"]
Old 08-12-2010, 09:54 PM   #6
pfh
Junior Member
 
Location: Melbourne

Join Date: May 2008
Posts: 7
Default

Ah. Oops. distutils didn't automatically update the MANIFEST for some reason.

Try this: http://bioinformatics.net.au/nesoni-0.35.tar.gz


Note that "nesoni shrimp" and "nesoni consensus" use SHRiMP 1.0 (rmapper-ls). To use SHRiMP 2.0 (gmapper-ls), use "nesoni samshrimp" and "nesoni samconsensus" -- but this is very new code, not thoroughly tested.

Your corrected contigs will be in a file called consensus_masked.fa -- this file defaults back to the reference sequence (in lowercase) if a consensus can't be called from the aligned reads.
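
Since the fallback positions are written in lowercase, the share of lowercase bases per contig gives a rough idea of how much of each contig could not be called from the reads. This is a minimal sketch under that assumption; the small FASTA reader and the function names are illustrative and are not part of nesoni. If your reference was already soft-masked in lowercase, the numbers would not mean the same thing.

```python
def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def lowercase_fraction(seq):
    """Fraction of positions that are lowercase (reference fallback)."""
    if not seq:
        return 0.0
    return sum(1 for base in seq if base.islower()) / float(len(seq))


if __name__ == "__main__":
    # consensus_masked.fa is the file name given above; adjust the path
    # to wherever your nesoni working directory is.
    with open("consensus_masked.fa") as handle:
        for name, seq in read_fasta(handle):
            print("%s\t%d bp\t%.1f%% uncalled" % (name, len(seq), 100 * lowercase_fraction(seq)))
```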

Last edited by pfh; 08-12-2010 at 09:57 PM.
Old 08-15-2010, 04:01 PM   #7
glacerda
Member
 
Location: Brazil

Join Date: Aug 2008
Posts: 27
Default nesoni samconsensus

Quote:
Originally Posted by pfh View Post
Ah. Oops. distutils didn't automatically update the MANIFEST for some reason.

Try this: http://bioinformatics.net.au/nesoni-0.35.tar.gz


Note that "nesoni shrimp" and "nesoni consensus" use SHRiMP 1.0 (rmapper-ls). To use SHRiMP 2.0 (gmapper-ls), use "nesoni samshrimp" and "nesoni samconsensus" -- but this is very new code, not thoroughly tested.

Your corrected contigs will be in a file called consensus_masked.fa -- this file defaults back to the reference sequence (in lowercase) if a consensus can't be called from the aligned reads.
Hi pfh, thank you very much for your help and for making this software package available to the community. Version 0.35 solved the error I pointed out, but there is another error in samconsensus. It occurs after all the *.userplot files have been printed. The error message follows.

Traceback (most recent call last):
  File "/usr/bin/nesoni", line 15, in <module>
    sys.exit(nesoni.main(sys.argv[1:]))
  File "/usr/lib/python2.6/site-packages/nesoni/__init__.py", line 177, in main
    recombination
  File "/usr/lib/python2.6/site-packages/nesoni/grace.py", line 160, in execute
    commands[args[start]](args[start+1:end])
  File "/usr/lib/python2.6/site-packages/nesoni/__init__.py", line 127, in samconsensus
    grace.load('sam').consensus_main(args, True)
  File "/usr/lib/python2.6/site-packages/nesoni/sam.py", line 1122, in consensus_main
    references[name] = Ref_seq( seq.upper() )
  File "/usr/lib/python2.6/site-packages/nesoni/sam.py", line 955, in __init__
    self.base_counts = [ consensus.EMPTY_EVIDENCE ] * len(seq)
AttributeError: 'module' object has no attribute 'EMPTY_EVIDENCE'
Waiting for data... (interrupt to abort)
Old 08-15-2010, 06:01 PM   #8
pfh
Junior Member
 
Location: Melbourne

Join Date: May 2008
Posts: 7
Default

This might be a minor difference between Cython versions. Which version are you using? I have 0.12.1.

Anyway, this might fix it:

http://bioinformatics.net.au/nesoni-0.36.tar.gz
Old 11-19-2010, 02:55 AM   #9
colindaven
Senior Member
 
Location: Germany

Join Date: Oct 2008
Posts: 403
Default

Dear nesoni developers,

I ran nesoni with
nesoni samshrimp
and
nesoni samconsensus
on a reference set of 454 contigs with 36 bp Illumina reads mapped to them.

Results are something like this -

contig00006 shrimp_consensus variation 7779 7780 . + . product=Insertion: .C. ("C"x32 "-"x10)
contig00006 shrimp_consensus variation 27322 27322 . + . product=Base deleted: C ("-"x22 "C"x5)
contig00006 shrimp_consensus variation 64220 64220 . + . product=Base deleted: T ("-"x29 "T"x7)
contig00011 shrimp_consensus variation 85 85 . + . product=Substitution: C became A ("A"x10)
contig00011 shrimp_consensus variation 100 100 . + . product=Substitution: T became C ("C"x9)
contig00011 shrimp_consensus variation 19231 19231 . + . product=Base deleted: C ("-"x23 "C"x8)
contig00012 shrimp_consensus variation 34618 34618 . + . product=Base deleted: A ("-"x26 "A"x10 "C"x1)
contig00015 shrimp_consensus variation 78530 78530 . + . product=Base deleted: A ("-"x30 "A"x11)
contig00019 shrimp_consensus variation 122482 122482 . + . product=Base deleted: A ("-"x25 "A"x9)
contig00020 shrimp_consensus variation 62624 62624 . + . product=Substitution: T became V ("G"x3 "C"x1 "A"x1)
contig00024 shrimp_consensus variation 72892 72892 . + . product=Base deleted: T ("-"x22 "C"x4 "T"x1)
contig00024 shrimp_consensus variation 143531 143531 . + . product=Base deleted: A ("-"x27 "A"x8)
contig00030 shrimp_consensus variation 13275 13275 . + . product=Substitution: T became A ("A"x21 "T"x2)
contig00030 shrimp_consensus variation 13279 13279 . + . product=Substitution: C became T ("T"x21 "C"x2)
contig00030 shrimp_consensus variation 13283 13283 . + . product=Substitution: A became C ("C"x21 "A"x2)
contig00042 shrimp_consensus variation 57483 57483 . + . product=Substitution: A became Y ("C"x4 "T"x1)
contig00045 shrimp_consensus variation 20062 20062 . + . product=Base deleted: T ("-"x31 "T"x14)
contig00046 shrimp_consensus variation 43525 43525 . + . product=Base deleted: T ("-"x35 "T"x16)
contig00048 shrimp_consensus variation 69141 69141 . + . product=Base deleted: T ("-"x41 "T"x14)
contig00051 shrimp_consensus variation 85164 85165 . + . product=Insertion: .T. ("T"x21 "-"x5)
contig00057 shrimp_consensus variation 11841 11841 . + . product=Substitution: T became V ("C"x3 "G"x1 "A"x1)
contig00057 shrimp_consensus variation 86311 86311 . + . product=Substitution: C became G ("G"x26)
contig00071 shrimp_consensus variation 14588 14588 . + . product=Substitution: G became T ("T"x29 "G"x10)
contig00071 shrimp_consensus variation 14591 14591 . + . product=Substitution: G became A ("A"x33 "G"x7)
contig00073 shrimp_consensus variation 13829 13829 . + . product=Base deleted: A ("-"x40 "A"x18)
contig00088 shrimp_consensus variation 214369 214370 . + . product=Insertion: .G. ("G"x26 "-"x4)
contig00091 shrimp_consensus variation 42090 42090 . + . product=Base deleted: T ("-"x12)
contig00099 shrimp_consensus variation 99157 99157 . + . product=Base deleted: T ("-"x29 "T"x5)
contig00100 shrimp_consensus variation 39818 39819 . + . product=Insertion: .T. ("T"x39 "-"x12)

I was actually expecting more corrections for a 6 Mbp genome. I haven't systematically surveyed homopolymer errors in the 454 contigs but would expect more than the ca. 46 corrections here.
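
For anyone who wants to tally a report like the one above, here is a minimal sketch. It assumes the layout seen in the listing (eight whitespace-separated columns followed by a product=... attribute, with read evidence written as "base"xcount); the function names are made up for illustration and this is not how nesoni itself processes the file.

```python
import re
from collections import Counter

# Matches evidence items such as "C"x32 or "-"x10.
EVIDENCE = re.compile(r'"([^"]+)"x(\d+)')


def parse_variation(line):
    """Split one report line into contig, coordinates, kind and read evidence."""
    fields = line.split(None, 8)
    attributes = fields[8]
    # "product=Base deleted: C (...)" -> "Base deleted"
    kind = attributes.split("=", 1)[1].split(":", 1)[0]
    evidence = dict((base, int(count)) for base, count in EVIDENCE.findall(attributes))
    return {
        "contig": fields[0],
        "start": int(fields[3]),
        "end": int(fields[4]),
        "kind": kind,
        "evidence": evidence,
    }


def corrections_per_mbp(n_corrections, genome_size_bp):
    """Corrections per megabase of assembled sequence."""
    return n_corrections / (genome_size_bp / 1e6)


if __name__ == "__main__":
    lines = [
        'contig00006 shrimp_consensus variation 7779 7780 . + . product=Insertion: .C. ("C"x32 "-"x10)',
        'contig00006 shrimp_consensus variation 27322 27322 . + . product=Base deleted: C ("-"x22 "C"x5)',
    ]
    print(Counter(parse_variation(line)["kind"] for line in lines))
    # The figures quoted in this post: ca. 46 corrections on a 6 Mbp genome.
    print("%.2f corrections per Mbp" % corrections_per_mbp(46, 6000000))
```

Counting by kind makes it easy to see how many of the calls are single-base insertions or deletions, which is where homopolymer errors would show up.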

What experience do you have with
-coverage of contigs
-coverage of reads
-number of corrections per megabase

cheers
colin
Old 11-24-2010, 05:18 AM   #10
yvan.wenger
Member
 
Location: Switzerland

Join Date: Aug 2009
Posts: 30
Default

Hello everybody,

I installed nesoni 0.4 but get the following error when running it on a test dataset. Can anybody spot the mistake?

Best,

Yvan

>nesoni samshrimp nesoni_output crebs.fa reads ./GEX-1.fq
Running gmapper-ls -E -T -w 200% -n 2 -N 8 -X
Traceback (most recent call last):
  File "/usr/local/bin/nesoni", line 15, in <module>
    sys.exit(nesoni.main(sys.argv[1:]))
  File "/usr/local/lib/python2.6/dist-packages/nesoni/__init__.py", line 192, in main
    recombination
  File "/usr/local/lib/python2.6/dist-packages/nesoni/grace.py", line 160, in execute
    commands[args[start]](args[start+1:end])
  File "/usr/local/lib/python2.6/dist-packages/nesoni/__init__.py", line 126, in samshrimp
    grace.load('sam').shrimp2_main(args)
  File "/usr/local/lib/python2.6/dist-packages/nesoni/sam.py", line 602, in shrimp2_main
    stderr=log_file)
  File "/usr/local/lib/python2.6/dist-packages/nesoni/sam.py", line 40, in run
    close_fds=True,
  File "/usr/lib/python2.6/subprocess.py", line 633, in __init__
    errread, errwrite)
  File "/usr/lib/python2.6/subprocess.py", line 1139, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory
[samopen] no @SQ lines in the header.
[sam_read1] missing header? Abort !
Old 11-24-2010, 05:41 AM   #11
colindaven
Senior Member
 
Location: Germany

Join Date: Oct 2008
Posts: 403
Default

I think I had a similar error.
Check that SHRiMP 2 is installed and can be run from the command line, i.e. that it is on your system PATH.
Old 11-24-2010, 02:09 PM   #12
Torst
Senior Member
 
Location: The University of Melbourne, AUSTRALIA

Join Date: Apr 2008
Posts: 275
Default

Quote:
Originally Posted by colindaven View Post
I was actually expecting more corrections for a 6 Mbp genome. I haven't systematically surveyed homopolymer errors in the 454 contigs but would expect more than the ca. 46 corrections here.
I have found that with old 454 GS20 and GS FLX data, we were correcting about 20+ errors per megabase. However, with newer GS FLX and Titanium data (which includes higher yield/coverage), combined with newer versions of Newbler, this has been decreasing to ~10 errors per megabase.

So your 46/6 ≈ 7.7 errors per Mbp is what I would expect if your results come from recent data.
Old 11-24-2010, 02:44 PM   #13
pfh
Junior Member
 
Location: Melbourne

Join Date: May 2008
Posts: 7
Default

The relevant program from SHRiMP 2 will be "gmapper-ls".

It looks like this is failing to run, and then samtools is discombobulated by receiving empty input.
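
A quick way to confirm this before running nesoni is to check that the external programs can be found on the PATH. The sketch below uses shutil.which, which needs Python 3.3 or newer, so it is meant to be run on its own rather than inside the Python 2 installation that nesoni uses here. The helper name is an assumption for illustration; the tool names are the two mentioned in this thread.

```python
import shutil


def find_missing_tools(tools, path=None):
    """Return the names in `tools` that are not found as executables on PATH."""
    return [tool for tool in tools if shutil.which(tool, path=path) is None]


if __name__ == "__main__":
    # gmapper-ls comes from SHRiMP 2; samtools shows up in the error output above.
    missing = find_missing_tools(["gmapper-ls", "samtools"])
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All tools found.")
```

If gmapper-ls is reported as missing, adding the SHRiMP 2 bin directory to PATH should be the first thing to try.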
Old 11-25-2010, 12:36 AM   #14
yvan.wenger
Member
 
Location: Switzerland

Join Date: Aug 2009
Posts: 30
Default

Hello Colin and pfh,

Yes, thanks, that was the problem. I had not correctly added shrimp2 to my path...

Best,

Yvan
Old 11-23-2011, 12:32 AM   #15
sirius
Junior Member
 
Location: Taiwan

Join Date: Nov 2011
Posts: 2
Default

Hello everybody~
I installed nesoni-0.58 and SHRiMP_2_2_1, and ran "nesoni samshrimp":

>nesoni samshrimp test_out J_mapper.fasta J.fq
Error: No read files given

Can anyone tell me what the problem is?

thanks a lot
Old 11-24-2011, 01:50 PM   #16
Torst
Senior Member
 
Location: The University of Melbourne, AUSTRALIA

Join Date: Apr 2008
Posts: 275
Default

Quote:
Originally Posted by sirius View Post
I installed nesoni-0.58 and SHRiMP_2_2_1.
And ran "nesoni samshrimp"
>nesoni samshrimp test_out J_mapper.fasta J.fq
Error: No read files given
Can anyone tell me what the problem is?
If you just type "nesoni samshrimp" you will get the help:

Code:
    nesoni samshrimp: output_directory [options] \
        reference.fa/.gbk [...] \
        [reads: single.fq | single.fa [...]] \
        [interleaved: interleaved.fq | interleaved.fa [...]] \
        [pairs: left.fq right.fq | left.fa right.fa] \
        [shrimp-options: ...options to pass directly to gmapper-ls... ]
You have not supplied the read file correctly. Is it interleaved pairs, or single shotgun reads?
If single end, this is the command you want:

nesoni samshrimp: test_out J_mapper.fasta reads: J.fq
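
If you drive nesoni from a script, the section keywords in the help text can be assembled programmatically so that the "reads:" marker is not forgotten. This is only a sketch that follows the help output quoted above for nesoni 0.58; the function name and its parameters are hypothetical, and other nesoni versions may use a different syntax.

```python
def samshrimp_args(output_dir, references, reads=(), interleaved=(),
                   pairs=None, shrimp_options=()):
    """Build an argument list laid out like the 'nesoni samshrimp:' help text."""
    if not (reads or interleaved or pairs):
        # Refuse to build a command that has no read section at all.
        raise ValueError("no read files given")
    args = ["nesoni", "samshrimp:", output_dir] + list(references)
    if reads:
        args += ["reads:"] + list(reads)
    if interleaved:
        args += ["interleaved:"] + list(interleaved)
    if pairs:
        left, right = pairs
        args += ["pairs:", left, right]
    if shrimp_options:
        args += ["shrimp-options:"] + list(shrimp_options)
    return args


if __name__ == "__main__":
    # Reproduces the single-end command given above.
    print(" ".join(samshrimp_args("test_out", ["J_mapper.fasta"], reads=["J.fq"])))
```

The resulting list could be passed to subprocess.call, which avoids shell quoting problems with file names.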
Old 11-24-2011, 10:57 PM   #17
sirius
Junior Member
 
Location: Taiwan

Join Date: Nov 2011
Posts: 2
Default

Got it~~
Thanks for your help
Old 05-03-2013, 02:14 AM   #18
Mona
Member
 
Location: Uppsala

Join Date: Feb 2010
Posts: 27
Default

Hello,

I am trying to install nesoni.
I installed bowtie2, since the manual lists SHRiMP or bowtie2 in the requirements. I then installed nesoni, but when I type the command nesoni, it gives the following error message:

Traceback (most recent call last):
  File "/usr/local/bin/nesoni", line 8, in <module>
    load_entry_point('nesoni==0.101', 'console_scripts', 'nesoni')()
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/python/pkg_resources.py", line 271, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/python/pkg_resources.py", line 2174, in load_entry_point
    return ep.load()
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/python/pkg_resources.py", line 1907, in load
    entry = __import__(self.module_name, globals(),globals(), ['__name__'])
  File "/Library/Python/2.6/site-packages/nesoni-0.101-py2.6.egg/nesoni/__init__.py", line 8, in <module>
    from reference_directory import Make_reference
  File "/Library/Python/2.6/site-packages/nesoni-0.101-py2.6.egg/nesoni/reference_directory.py", line 5, in <module>
    from nesoni import io, grace, config, annotation
  File "/Library/Python/2.6/site-packages/nesoni-0.101-py2.6.egg/nesoni/io.py", line 837, in <module>
    class Grouped_table(collections.OrderedDict):
AttributeError: 'module' object has no attribute 'OrderedDict'


As I am not a Unix expert, I need help...
Old 05-03-2013, 09:48 PM   #19
pfh
Junior Member
 
Location: Melbourne

Join Date: May 2008
Posts: 7
Default

Hi Mona,

Looks like OrderedDict was added in Python 2.7. I need to update the requirements in the README!

Ok, two options:

1. Install Python 2.7. I don't know enough about package management on OS X to know how hard this is.

2. (preferable) Install virtualenv, eg by following the instructions on virtualenv.org. Download and untar the latest PyPy from pypy.org. Using the instructions in the nesoni README, create a virtualenv using the pypy you just untarred, and install BioPython and nesoni in it.
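
Whichever option you pick, you can check that the interpreter you end up with actually provides OrderedDict by running a few lines with it. This is a minimal sketch; the function names are made up for illustration.

```python
import collections
import sys


def has_ordered_dict():
    """True if this interpreter's collections module provides OrderedDict."""
    return hasattr(collections, "OrderedDict")


def interpreter_report():
    """One line naming the interpreter, its version and the OrderedDict status."""
    version = "%d.%d.%d" % tuple(sys.version_info[:3])
    status = "available" if has_ordered_dict() else "missing"
    return "%s (Python %s): OrderedDict %s" % (sys.executable, version, status)


if __name__ == "__main__":
    print(interpreter_report())
```

Run it with the same python (or pypy) binary that you intend to use for nesoni, since several interpreters can be installed side by side.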
Old 05-06-2013, 02:12 AM   #20
Mona
Member
 
Location: Uppsala

Join Date: Feb 2010
Posts: 27
Default

Thanks pfh,

Installing Python 2.7 seemed the easier option to me, and I have done that now, but I think nesoni is still using Python 2.6, as I am getting the same error. Maybe I should try to uninstall Python 2.6 so that it will then use 2.7.
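
One likely explanation, offered here as an assumption rather than a diagnosis, is that the installed nesoni launcher script still names the 2.6 interpreter on its first line, since such launchers usually record the Python that installed them. The sketch below simply reads that line; the function names are illustrative, and /usr/local/bin/nesoni is the path shown in the traceback earlier.

```python
def interpreter_from_shebang(first_line):
    """Return the interpreter named on a '#!' line, or None if there is none."""
    first_line = first_line.strip()
    if first_line.startswith("#!"):
        return first_line[2:].strip()
    return None


def script_interpreter(path):
    """Read the first line of a script and report the interpreter it asks for."""
    with open(path, "rb") as handle:
        first_line = handle.readline().decode("utf-8", "replace")
    return interpreter_from_shebang(first_line)


if __name__ == "__main__":
    print(script_interpreter("/usr/local/bin/nesoni"))
```

If the line does point at 2.6, reinstalling nesoni with the 2.7 interpreter would probably be safer than removing the 2.6 that ships with the system.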