#22 |
Member
Location: LONDON, UNITED KINGDOM Join Date: Jan 2009
Posts: 44
|
Dear westerman,
Thanks for the reply.

I have a collection of reads that are 35 nt long. In all of them there is a '.' in the same position, so when I translate from colorspace to base space all of my reads become only 23 nucleotides long plus a tail of 12 N's:

TCGAATGACTGTGACGTGCAGTCNNNNNNNNNNNN

This is happening to all reads in the file. Maybe something went wrong with the sequencing? For mapping purposes, do you think it's better to use the 23 nt reads than the ones with the N's? I guess if I use the reads with so many N's they could actually map to wrong positions. Is this right?

Thank you,
Ines
|
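A note on why one '.' costs the whole tail of the read: each color encodes the transition between two adjacent bases, so every base is decoded from the one before it, and a missing color leaves everything downstream undetermined. Below is a minimal Python sketch, assuming the standard SOLiD two-base code. The function name and the short example reads are made up for illustration and are not taken from the poster's data.

```python
BASES = "ACGT"  # A=0, C=1, G=2, T=3, so that next_base = prev_base XOR color


def decode_colorspace(read: str) -> str:
    """Translate a colorspace read (primer base followed by colors) to bases.

    The primer base itself is not part of the output. Once a color is
    missing ('.') or otherwise unreadable, every later base becomes 'N'.
    """
    prev = BASES.index(read[0])  # primer base, e.g. 'T'
    known = True
    out = []
    for color in read[1:]:
        if known and color in "0123":
            prev ^= int(color)  # apply the two-base transition
            out.append(BASES[prev])
        else:
            known = False  # the decoding chain is broken from here on
            out.append("N")
    return "".join(out)


print(decode_colorspace("T0123"))   # TGAT
print(decode_colorspace("T01.23"))  # TGNNN
```

The second call shows the effect described above: the colors after the '.' are perfectly good, but they can no longer be anchored to a known base.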
#23 |
Rick Westerman
Location: Purdue University, Indiana, USA Join Date: Jun 2008
Posts: 1,104
|
Using the 23 nt reads would be good. Even better is to do your mapping in colorspace without translating first. That way sequencing errors should be taken care of.
|
#24 |
Member
Location: LONDON, UNITED KINGDOM Join Date: Jan 2009
Posts: 44
|
That's a good idea, I hadn't thought about mapping in colorspace...
Too bad bowtie doesn't map in colorspace yet.

Regards,
Ines
|
#25 |
Senior Member
Location: Sweden Join Date: Mar 2008
Posts: 324
|
Try bwa (in colorspace) with a seed length of <=22, or better yet a program that allows masking of the position with dots (I think mapreads can do it, maybe others can as well).
|
#26 |
Rick Westerman
Location: Purdue University, Indiana, USA Join Date: Jun 2008
Posts: 1,104
|
You can mask with mapreads via the '-p' parameter. Usually this is done through the matching_large_genomes_cmap_save_script.pl program, although other SOLiD routines also call mapreads.
E.g., try '-p 1111111111111111111100000000000000000000' or whatever fits your tag length and desired pattern. mapreads will still try to map the full-length tag and thus will have problems when the masked part seemingly overhangs the ends. That is, mapreads does not chop off the masked part to make a shorter read but rather keeps the read at full length.
|
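Typing out a long run of 1s and 0s by hand is error-prone, so here is a small Python sketch that builds such a pattern for a given tag length. It assumes, based only on the example in the post above, that '1' marks a position to use and '0' a position to mask; please check that against the mapreads documentation for your version. `build_mask` is a hypothetical helper, not part of any SOLiD tool.

```python
def build_mask(tag_length: int, first_masked: int) -> str:
    """Return a 0/1 pattern of length tag_length.

    Assumption: '1' = position is used, '0' = position is masked.
    first_masked is the 1-based position of the first masked color;
    everything from there to the end of the tag is masked.
    """
    if not 1 <= first_masked <= tag_length + 1:
        raise ValueError("first_masked must lie within the tag or just past its end")
    kept = first_masked - 1
    return "1" * kept + "0" * (tag_length - kept)


# Example: a 35-color tag where position 24 onwards should be ignored.
print(build_mask(35, 24))
```

The resulting string would then be passed as the value of '-p'.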
#27 | |
Junior Member
Location: Ohio Join Date: Mar 2012
Posts: 3
|
Does anyone have a link to, or a zipped file of, ABI's 'corona lite'? Many thanks.
|
#28 |
Member
Location: Manchester, UK Join Date: Oct 2009
Posts: 37
|
Try this for Corona-Lite; I couldn't seem to find it on Life Tech's site:
http://skip.ucsc.edu/phage_contigs/hartzog_phage/tools/
|
#29 | |
Junior Member
Location: Ohio Join Date: Mar 2012
Posts: 3
|
|
#30 |
Rick Westerman
Location: Purdue University, Indiana, USA Join Date: Jun 2008
Posts: 1,104
|
#31 |
Member
Location: Manchester, UK Join Date: Oct 2009
Posts: 37
|
I think I used to use 4.2.2, but I don't have the archive to install it (I didn't install it on our cluster).
|
#32 |
Member
Location: london, uk Join Date: Feb 2011
Posts: 10
|
Hi,
I just want to reiterate how crazy double encoding is! Thought we were having problems with our aligner as the 'reads' weren't mapping to the reference. Why on earth did ABI pick those 4 letters? Why even double encode in the first place?!

Thanks Rick!
|
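For anyone puzzled by the same thing: double encoding simply rewrites the colors 0, 1, 2, 3 as the letters A, C, G, T (and '.' as N) so that tools expecting nucleotide letters will accept colorspace reads. The result looks like sequence but is not, which is why such 'reads' do not map to a base-space reference. The Python sketch below assumes that 0123-to-ACGT mapping (check what your own converter uses) and compares it with an actual two-base decoding. The function names and the example read are invented for illustration.

```python
BASES = "ACGT"  # A=0, C=1, G=2, T=3
DOUBLE_ENCODING = str.maketrans("0123.", "ACGTN")


def double_encode(colors: str) -> str:
    """Relabel colors as letters; this does NOT produce real bases."""
    return colors.translate(DOUBLE_ENCODING)


def decode(primer: str, colors: str) -> str:
    """Real base-space translation, starting from the primer base."""
    prev = BASES.index(primer)
    out = []
    for color in colors:
        if color not in "0123":
            # Nothing after a missing color can be decoded.
            out.append("N" * (len(colors) - len(out)))
            break
        prev ^= int(color)  # two-base code: next = prev XOR color
        out.append(BASES[prev])
    return "".join(out)


colors = "0123"
print(double_encode(colors))  # ACGT  (colors wearing letter costumes)
print(decode("T", colors))    # TGAT  (the bases those colors stand for)
```

The two outputs differ, so a double-encoded read only makes sense to an aligner that also double-encodes (or otherwise color-converts) its reference.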
#33 |
Guest
Posts: n/a
|
I hope people realize that converting to fastq is probably one of the worst ways to analyze colorspace (cs) data.
|
#34 |
Member
Location: london, uk Join Date: Feb 2011
Posts: 10
|
SeqAA, could you describe your workflow?
|
#35 |
Senior Member
Location: Germany Join Date: Oct 2008
Posts: 415
|
SeqAA - agree.
People, if you want to analyse SOLiD data properly, use colour space. If you're forced into the dark arts of base-space conversion, e.g. for de novo assembly, I would strongly recommend reading the supplements of this paper: Iverson et al. 2012, Science: Untangling genomes from metagenomes ....
|
#36 |
Junior Member
Location: Virginia Join Date: Jun 2010
Posts: 2
|
I generally use Galaxy to do most of my conversions.
|