SEQanswers

09-15-2010, 11:52 AM   #1
tanghz
Junior Member
Location: UK
Join Date: Sep 2010
Posts: 3
BFAST mapping paired end reads.

Hi, I have a set of data from Illumina:
s_7_1_sequence.txt
s_7_2_sequence.txt

How can I convert them into a .fq file for BFAST?

Here is the ill2fastq.pl usage message:

ill2fastq.pl [[ -b <bar code length> | -B ] -n <number of reads> -o <output prefix> -q -s] <input prefix>

But I don't know what -q and -s stand for.

For single-end mapping, do I still need to convert the format with this script?


Thanks,

09-15-2010, 12:43 PM   #2
nilshomer
Nils Homer
Location: Boston, MA, USA
Join Date: Nov 2008
Posts: 1,285

I have added a description in the latest Git commit. It is as follows:
Code:
The -q option specifies that qseq.txt files are expected, while
the -s option specifies that sequence.txt files are expected.
Thank you for finding these undocumented options.

09-15-2010, 12:54 PM   #3
tanghz
Junior Member
Location: UK
Join Date: Sep 2010
Posts: 3

Thank you for the clarification,
I have done it.

Could you also clarify whether I need to convert the sequence.txt files to FASTQ with your script?
Can I use the sequence.txt files directly?

thanks

Last edited by tanghz; 09-15-2010 at 01:07 PM.

09-15-2010, 05:47 PM   #4
nilshomer
Nils Homer
Location: Boston, MA, USA
Join Date: Nov 2008
Posts: 1,285

You will have to convert your input files to the FASTQ format if they are not in that format already.
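
A minimal Python sketch of what such a conversion involves. It assumes the sequence.txt files are Illumina 1.3+ FASTQ-like text with Phred+64 qualities (as a later post in this thread describes) and that both ends of a pair should come out under one shared name, back to back, as in the simulated reads shown further down. This is not how ill2fastq.pl itself is implemented, and the function names are made up for illustration.

```python
from itertools import zip_longest

# Illumina 1.3+ stores qualities as Phred+64, Sanger FASTQ as Phred+33.
OFFSET_SHIFT = 64 - 33


def read_fastq(handle):
    """Yield (name, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        if not header.startswith("@") or len(seq) != len(qual):
            raise ValueError(f"malformed record near {header!r}")
        yield header[1:].rstrip("\n"), seq, qual


def to_sanger(qual):
    """Shift Phred+64 quality characters down to Phred+33."""
    shifted = [ord(c) - OFFSET_SHIFT for c in qual]
    if min(shifted, default=33) < 33:
        raise ValueError("quality string is not Phred+64 encoded")
    return "".join(chr(v) for v in shifted)


def strip_end_suffix(name):
    """Drop a trailing /1 or /2 so both ends share one name."""
    return name[:-2] if name.endswith(("/1", "/2")) else name


def interleave(end1, end2, out):
    """Write end 1 and end 2 of each pair consecutively to `out`."""
    for rec1, rec2 in zip_longest(read_fastq(end1), read_fastq(end2)):
        if rec1 is None or rec2 is None:
            raise ValueError("the two files have different numbers of reads")
        name = strip_end_suffix(rec1[0])
        if name != strip_end_suffix(rec2[0]):
            raise ValueError(f"read names differ: {rec1[0]!r} vs {rec2[0]!r}")
        for _, seq, qual in (rec1, rec2):
            out.write(f"@{name}\n{seq}\n+\n{to_sanger(qual)}\n")
```

With the files from the first post, this could be called as interleave(open("s_7_1_sequence.txt"), open("s_7_2_sequence.txt"), sys.stdout). The script shipped with BFAST, run with -s, is the route actually discussed in this thread.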

09-20-2010, 07:52 AM   #5
tanghz
Junior Member
Location: UK
Join Date: Sep 2010
Posts: 3

Hi, I am using your readgenerate scripts; they are very handy. However, I notice that the ID of the paired read is the same as that of the first one, e.g.
@readNum=1_strand=+_contig=17_pos=30714265_numends=2_pel=0_rl=36_wrv=1_si=-1_il=0_r1=000000000000000000000000000000000000_r2=000000000000000000000000000020020000
GCTCTGAGTATCAGACACACCGTGGCCTCCCCAAGG
+
::::::::::::::::::::::::::::::::::::
@readNum=1_strand=+_contig=17_pos=30714265_numends=2_pel=0_rl=36_wrv=1_si=-1_il=0_r1=000000000000000000000000000000000000_r2=000000000000000000000000000020020000
GGCCAAAGGGACACCGGTTTGACAACCAACAGCGTG
+
::::::::::::::::::::::::::::::::::::

There is no reads space info. Did I do something wrong? How do I parse the second read's coordinates for later verification?
Thanks.

Last edited by tanghz; 09-20-2010 at 08:36 AM.

09-20-2010, 08:52 AM   #6
nilshomer
Nils Homer
Location: Boston, MA, USA
Join Date: Nov 2008
Posts: 1,285

Feel free to dig into the code on this one, as I am not supporting that read simulator very heavily; I would be happy to incorporate a patch, though. Otherwise, I would recommend the "dwgsim" tool within http://dnaa.sf.net. The latter is something I am supporting and actively maintaining.
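
For anyone who wants to pull the fields out of those simulated read names, here is a rough Python sketch. It only splits the name into its key=value parts and numbers the ends by their order in the file. What the individual fields mean (pel, wrv, si, il and so on) is not stated in this thread, so deriving the second read's coordinates from them would need a look at the simulator source; the function names are hypothetical.

```python
def parse_read_name(name):
    """Split a simulator read name into a dict of its key=value fields."""
    fields = {}
    for token in name.lstrip("@").split("_"):
        key, sep, value = token.partition("=")
        if not sep:
            raise ValueError(f"token without '=': {token!r}")
        fields[key] = value  # values are kept as strings
    return fields


def number_ends(names):
    """Pair each name with its 1-based position in a run of identical names."""
    tagged = []
    previous, end = None, 0
    for name in names:
        end = end + 1 if name == previous else 1
        previous = name
        tagged.append((name, end))
    return tagged


# Read name copied from the post above (both ends carry the same name).
example = (
    "@readNum=1_strand=+_contig=17_pos=30714265_numends=2_pel=0_rl=36"
    "_wrv=1_si=-1_il=0_r1=000000000000000000000000000000000000"
    "_r2=000000000000000000000000000020020000"
)
fields = parse_read_name(example)
print(fields["contig"], fields["pos"], fields["strand"])
print(number_ends([example, example]))
```

Since the two ends cannot be told apart by name, the sketch relies on the second record of a same-named run being end 2, which is an assumption about the file order.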

04-03-2011, 11:13 PM   #7
Jenzo
Member
Location: Bad Nauheim, Germany
Join Date: Feb 2011
Posts: 31

Dear nilshomer,
Thanks for your easy-to-use ill2fastq.pl script. Since I'm working on a huge dataset and need to convert from Illumina 1.3+ to FASTQ, I used this script, and it worked well for the first 20 GB. Then I got the following error:

Quote:
C:\path-to-file>perl ill2fastq.pl -s my_sequences > C:\path-to-file\file.fastq
ON 0
ON 1
Unicode character 0xffffffffffffffff is illegal at ill2fastq.pl line 383, <FH_one> line 4.
Unicode character 0xfffffffffffffffe is illegal at ill2fastq.pl line 383, <FH_one> line 4.
Unicode character 0xffffffffffffffff is illegal at ill2fastq.pl line 383, <FH_one> line 4.
Unicode character 0xffffffffffffffff is illegal at ill2fastq.pl line 383, <FH_one> line 4.
Unicode character 0xfffffffffffffffe is illegal at ill2fastq.pl line 383, <FH_two> line 4.
Unicode character 0xffffffffffffffff is illegal at ill2fastq.pl line 383, <FH_two> line 4.
Unicode character 0xffffffffffffffff is illegal at ill2fastq.pl line 383, <FH_two> line 4.
Unicode character 0xfffffffffffffffe is illegal at ill2fastq.pl line 383, <FH_two> line 4.
Unicode character 0xfffffffffffffffe is illegal at ill2fastq.pl line 383, <FH_two> line 4.
Unicode character 0xffffffffffffffff is illegal at ill2fastq.pl line 383, <FH_two> line 4.
Malformed UTF-8 character (byte 0xff) in reverse at ill2fastq.pl line 397, <FH_two> line 4.
[several lines of binary output, containing the string "mysequences_2_sequence.txt"]
Died at ill2fastq.pl line 229.
I tried to figure out what happened here, but could only guess that the problem lies in Perl's encoding of strings (http://jeremy.zawodny.com/blog/archives/010546.html and http://perldoc.perl.org/perldiag.htm...ter-%28%25s%29).
Perhaps someone has an idea, or can provide a script that does the conversion quickly and correctly? Thanks a lot! Yours, Jenzo

04-28-2011, 06:48 PM   #8
kellywilliams
Junior Member
Location: Sydney, Australia
Join Date: Oct 2010
Posts: 8
ill2fastq.pl failed

Hi,

I am having difficulty using ill2fastq.pl. I have successfully used BFAST for alignment of all of my SOLiD data, but cannot get step 1 to work for my Illumina data. I am using bfast-0.6.4e.

This is what happens when I try to run the Perl script (my two files are named 100247_1_sequence.txt and 100247_2_sequence.txt):

Code:
$ perl ill2fastq.pl -s 100247
ON 0
Malformed UTF-8 character (byte 0xff) in reverse at ill2fastq.pl line 395, <FH_two> line 4.
@HWUSI-E@HWUSI-EAS570R_0028:6:1:1311:1079#0/2
Died at ill2fastq.pl line 227.

If you can help me out that would be great! Thanks in advance,

Kelly

04-28-2011, 08:24 PM   #9
nilshomer
Nils Homer
Location: Boston, MA, USA
Join Date: Nov 2008
Posts: 1,285

From Googling "Malformed UTF-8 character", there seems to be something wrong with your encoding. What is your platform/OS?
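
The thread does not establish the cause of these errors. One thing that could be ruled out quickly (a guess, not a diagnosis) is that the input is not plain ASCII text, for example because it is still compressed or archived, or was saved as UTF-16. A small Python sketch that looks at the first bytes of a file; the labels and function names are made up for illustration:

```python
# Well-known leading byte signatures ("magic numbers") and byte order marks.
MAGIC_PREFIXES = [
    (b"\x1f\x8b", "gzip-compressed"),
    (b"BZh", "bzip2-compressed"),
    (b"PK\x03\x04", "zip archive"),
    (b"\xff\xfe", "UTF-16 little-endian text (byte order mark)"),
    (b"\xfe\xff", "UTF-16 big-endian text (byte order mark)"),
    (b"\xef\xbb\xbf", "UTF-8 text with a byte order mark"),
]


def sniff(first_bytes):
    """Guess what kind of file a leading chunk of bytes comes from."""
    for prefix, label in MAGIC_PREFIXES:
        if first_bytes.startswith(prefix):
            return label
    # POSIX tar archives carry the string "ustar" at byte offset 257.
    if first_bytes[257:262] == b"ustar":
        return "tar archive"
    # A FASTQ-like text file starts with '@' and contains only ASCII bytes.
    if first_bytes.startswith(b"@") and all(b < 0x80 for b in first_bytes):
        return "plain ASCII, FASTQ-like"
    return "unrecognised"


def sniff_file(path, n=512):
    """Read the first n bytes of `path` in binary mode and classify them."""
    with open(path, "rb") as handle:
        return sniff(handle.read(n))
```

For the files named in this thread that would be, for example, sniff_file("100247_2_sequence.txt"). Anything other than "plain ASCII, FASTQ-like" would suggest decompressing or re-encoding the file before running the conversion script.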

04-28-2011, 10:06 PM   #10
kellywilliams
Junior Member
Location: Sydney, Australia
Join Date: Oct 2010
Posts: 8

I am on a 64-bit Linux machine running Red Hat. I just tried it again using bfast-0.6.5a and the same thing happened.

04-29-2011, 06:29 AM   #11
nilshomer
Nils Homer
Location: Boston, MA, USA
Join Date: Nov 2008
Posts: 1,285

Can you try on a different machine?