**nilshomer** · 09-15-2010, 12:43 PM

Originally posted by tanghz

Hi, I have a set of data from Illumina:
s_7_1_sequence.txt
s_7_2_sequence.txt

How can I convert them into a BFAST .fq file?

Here is the ill2fastq.pl usage message:

ill2fastq.pl [[ -b <bar code length> | -B ] -n <number of reads> -o <output prefix> -q -s] <input prefix>

But I don't know what -q and -s stand for.

For single-end mapping, do I still need to change the format with this script?

Thanks,

I have added a description to the latest GIT commit. It is as follows:

Code:

The -q option specifies that qseq.txt files are expected, while 
the -s option specifies that sequence.txt files are expected.

Thank you for finding these undocumented options.

**tanghz** · 09-15-2010, 12:54 PM

Thank you for the clarification,
I have done it.

Could you also clarify whether I need to convert the sequence.txt file into FASTQ with your script?
Can I use the sequence file directly?

thanks

**nilshomer** · 09-15-2010, 05:47 PM

Originally posted by tanghz

Thank you for the clarification,
I have done it.

Could you also clarify whether I need to convert the sequence.txt file into FASTQ with your script?
Can I use the sequence file directly?

thanks

You will have to convert your input files to the FASTQ format if they are not in that format already.

**tanghz** · 09-20-2010, 07:52 AM

Hi, I am using your readgenerate scripts, which are very handy. However, I notice that the ID of the paired read is the same as that of the first one, e.g.
@readNum=1_strand=+_contig=17_pos=30714265_numends=2_pel=0_rl=36_wrv=1_si=-1_il=0_r1=000000000000000000000000000000000000_r2=000000000000000000000000000020020000
GCTCTGAGTATCAGACACACCGTGGCCTCCCCAAGG
+
::::::::::::::::::::::::::::::::::::
@readNum=1_strand=+_contig=17_pos=30714265_numends=2_pel=0_rl=36_wrv=1_si=-1_il=0_r1=000000000000000000000000000000000000_r2=000000000000000000000000000020020000
GGCCAAAGGGACACCGGTTTGACAACCAACAGCGTG
+
::::::::::::::::::::::::::::::::::::

There is no reads space info. Did I do something wrong? How do I parse the second read's coordinates for later verification?
Thanks.

**nilshomer** · 09-20-2010, 08:52 AM

Originally posted by tanghz

Hi, I am using your readgenerate scripts, which are very handy. However, I notice that the ID of the paired read is the same as that of the first one, e.g.
@readNum=1_strand=+_contig=17_pos=30714265_numends=2_pel=0_rl=36_wrv=1_si=-1_il=0_r1=000000000000000000000000000000000000_r2=000000000000000000000000000020020000
GCTCTGAGTATCAGACACACCGTGGCCTCCCCAAGG
+
::::::::::::::::::::::::::::::::::::
@readNum=1_strand=+_contig=17_pos=30714265_numends=2_pel=0_rl=36_wrv=1_si=-1_il=0_r1=000000000000000000000000000000000000_r2=000000000000000000000000000020020000
GGCCAAAGGGACACCGGTTTGACAACCAACAGCGTG
+
::::::::::::::::::::::::::::::::::::

There is no reads space info. Did I do something wrong? How do I parse the second read's coordinates for later verification?
Thanks.

Feel free to dig into the code on this one, as I am not supporting that read simulator very heavily; I would be happy to incorporate a patch, though. Otherwise, I would recommend the "dwgsim" tool within http://dnaa.sf.net, which I am supporting and actively maintaining.
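
For anyone who wants to pull the fields out of these read names in the meantime: the name is a flat list of key=value pairs joined by underscores, so it can be split mechanically. The following is only a minimal Python sketch. The helper name `parse_read_name` is made up for illustration, and the thread does not document what each field means or whether `pos` describes only the first end, so treat any interpretation of the values as an assumption.

```python
def parse_read_name(header):
    """Split a readgenerate-style read name into a dict of its key=value fields.

    Assumes fields are joined by '_' and that neither keys nor values
    contain '_' themselves (true for the example in this thread).
    """
    fields = {}
    for token in header.strip().lstrip("@").split("_"):
        key, sep, value = token.partition("=")
        if not sep:
            raise ValueError(f"token without '=': {token!r}")
        fields[key] = value
    return fields


# Read name copied from the post above (both ends carry the same name).
header = (
    "@readNum=1_strand=+_contig=17_pos=30714265_numends=2_pel=0_rl=36"
    "_wrv=1_si=-1_il=0_r1=000000000000000000000000000000000000"
    "_r2=000000000000000000000000000020020000"
)
fields = parse_read_name(header)
print(fields["contig"], fields["pos"], fields["strand"], fields["rl"])
```

All values come back as strings; convert `pos` and `rl` with `int()` before doing arithmetic on them.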

**Jenzo** · 04-03-2011, 11:13 PM

Dear nilshomer,
Thanks for your easy-to-use ill2fastq.pl script. I'm working on a huge dataset and need to convert from Illumina 1.3+ to FASTQ, so I used this script. It worked well for the first 20 GB, then I got the following error:

C:\path-to-file>perl ill2fastq.pl -s my_sequences > C:\path-to-file\file.fastq
ON 0
ON 1
Unicode character 0xffffffffffffffff is illegal at ill2fastq.pl line 383, <FH_one> line 4.
Unicode character 0xfffffffffffffffe is illegal at ill2fastq.pl line 383, <FH_one> line 4.
Unicode character 0xffffffffffffffff is illegal at ill2fastq.pl line 383, <FH_one> line 4.
Unicode character 0xffffffffffffffff is illegal at ill2fastq.pl line 383, <FH_one> line 4.
Unicode character 0xfffffffffffffffe is illegal at ill2fastq.pl line 383, <FH_two> line 4.
Unicode character 0xffffffffffffffff is illegal at ill2fastq.pl line 383, <FH_two> line 4.
Unicode character 0xffffffffffffffff is illegal at ill2fastq.pl line 383, <FH_two> line 4.
Unicode character 0xfffffffffffffffe is illegal at ill2fastq.pl line 383, <FH_two> line 4.
Unicode character 0xfffffffffffffffe is illegal at ill2fastq.pl line 383, <FH_two> line 4.
Unicode character 0xffffffffffffffff is illegal at ill2fastq.pl line 383, <FH_two> line 4.
Malformed UTF-8 character (byte 0xff) in reverse at ill2fastq.pl line 397, <FH_two> line 4.
[several lines of binary output, which include the string "mysequences_2_sequence.txt"]
Died at ill2fastq.pl line 229.

I tried to figure out what happened here, but could only guess that the problem lies in Perl's encoding of strings (http://jeremy.zawodny.com/blog/archives/010546.html and http://perldoc.perl.org/perldiag.htm...ter-%28%25s%29).
Perhaps someone has an idea, or can provide a script that does the conversion quickly and correctly. Thanks a lot! Yours, Jenzo
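
As a possible fallback, the conversion itself is small enough to sketch in Python. This is not a port of ill2fastq.pl. It assumes the `_1_sequence.txt` and `_2_sequence.txt` files are four-line FASTQ records with Illumina 1.3+ (Phred+64) qualities, and it assumes, based on the readgenerate output earlier in this thread, that BFAST wants both ends written consecutively under one shared read name. All function names are made up for illustration.

```python
import sys

OFFSET_SHIFT = 64 - 33  # Illumina 1.3+ (Phred+64) down to Sanger (Phred+33)


def read_fastq(handle):
    """Yield (name, sequence, quality) from a four-line-per-record stream."""
    while True:
        name = handle.readline().rstrip("\r\n")
        if not name:
            return
        seq = handle.readline().rstrip("\r\n")
        handle.readline()  # the '+' separator line is discarded
        qual = handle.readline().rstrip("\r\n")
        yield name, seq, qual


def to_sanger(qual):
    """Shift each quality character from Phred+64 to Phred+33."""
    out = []
    for ch in qual:
        code = ord(ch) - OFFSET_SHIFT
        if code < 33:
            raise ValueError(f"quality {ch!r} is not valid Phred+64")
        out.append(chr(code))
    return "".join(out)


def strip_end_suffix(name):
    """Drop a trailing /1 or /2 so both ends share one read name."""
    return name[:-2] if name.endswith(("/1", "/2")) else name


def interleave(path_one, path_two, out=sys.stdout):
    """Write end 1 then end 2 of every pair, with Sanger qualities."""
    # encoding="ascii" makes non-text input fail loudly instead of
    # producing garbage output.
    with open(path_one, encoding="ascii") as f1, \
            open(path_two, encoding="ascii") as f2:
        # zip stops at the shorter file; compare record counts separately
        # if the two files might differ in length.
        for pair in zip(read_fastq(f1), read_fastq(f2)):
            for name, seq, qual in pair:
                out.write(f"{strip_end_suffix(name)}\n{seq}\n+\n{to_sanger(qual)}\n")
```

Calling `interleave("s_7_1_sequence.txt", "s_7_2_sequence.txt")` would stream the merged records to standard output. The `ValueError` in `to_sanger` is deliberate: a quality character below `@` means the input is not Phred+64, and shifting it anyway would produce invalid characters.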

**kellywilliams** · 04-28-2011, 06:48 PM

ill2fastq.pl failed

Hi,

I am having difficulty using ill2fastq.pl. I have successfully used BFAST for alignment of all of my SOLiD data, but cannot get step 1 to work for my Illumina data. I am using bfast-0.6.4e.

This is what happens when I try to run the Perl script (my two files are named 100247_1_sequence.txt and 100247_2_sequence.txt):

Code:

$ perl ill2fastq.pl -s 100247
ON 0
Malformed UTF-8 character (byte 0xff) in reverse at ill2fastq.pl line 395, <FH_two> line 4.
@HWUSI-E@HWUSI-EAS570R_0028:6:1:1311:1079#0/2
Died at ill2fastq.pl line 227

If you can help me out that would be great! Thanks in advance,

Kelly

**nilshomer** · 04-28-2011, 08:24 PM

From Googling "Malformed UTF-8 character", there seems to be something wrong with your encoding. What is your platform/OS?

**kellywilliams** · 04-28-2011, 10:06 PM

Originally posted by nilshomer

From Googling "Malformed UTF-8 character", there seems to be something wrong with your encoding. What is your platform/OS?

I have a 64-bit Linux machine running Red Hat. I just tried it again using bfast-0.6.5a and the same thing happened.

**nilshomer** · 04-29-2011, 06:29 AM

Can you try on a different machine?
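
Nothing in this thread confirms what caused these errors. One inexpensive thing to rule out before switching machines is that the input is not plain ASCII text at all, for example a file that is still compressed or that was saved as UTF-16. The Python sketch below only inspects the first bytes of a file; the function name `sniff` and the labels it returns are made up for illustration.

```python
MAGIC = {
    b"\x1f\x8b": "gzip",
    b"PK\x03\x04": "zip",
    b"BZh": "bzip2",
}


def sniff(path, sample_size=4096):
    """Return a short label describing what the start of `path` looks like."""
    with open(path, "rb") as handle:
        head = handle.read(sample_size)
    # Well-known compression signatures come first.
    for magic, label in MAGIC.items():
        if head.startswith(magic):
            return label
    # A UTF-16 byte order mark means every character takes two bytes.
    if head.startswith((b"\xff\xfe", b"\xfe\xff")):
        return "utf-16 text (byte order mark)"
    if any(byte > 0x7F for byte in head):
        return "non-ASCII bytes"
    if head.startswith(b"@"):
        return "plain FASTQ-like text"
    return "plain text, but not FASTQ-like"
```

For example, `sniff("100247_2_sequence.txt")` returning anything other than "plain FASTQ-like text" would point at the input file rather than at Perl or the operating system.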


BFAST mapping paired end reads.
