  • BFAST mapping paired end reads.

    Hi, I have a set of paired-end data from Illumina:
    s_7_1_sequence.txt
    s_7_2_sequence.txt

    How can I convert them into a BFAST .fq file?

    Here is the ill2fastq.pl usage message:

    ill2fastq.pl [[ -b <bar code length> | -B ] -n <number of reads> -o <output prefix> -q -s] <input prefix>

    But I don't know what -q and -s stand for.

    For single-end mapping, do I still need to convert the format with this script?

    Thanks,

  • #2
    I have added a description to the latest Git commit. It is as follows:
    Code:
    The -q option specifies that qseq.txt files are expected, while
    the -s option specifies that sequence.txt files are expected.
    Thank you for finding these undocumented options.


    • #3
      Thank you for the clarification,
      I have done it.

      Could you also clarify whether I need to convert the sequence.txt files into FASTQ with your script?
      Can I use the sequence.txt files directly?

      Thanks
      Last edited by tanghz; 09-15-2010, 01:07 PM.


      • #4
        You will have to convert your input files to the FASTQ format if they are not in that format already.
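
As a rough illustration of the advice above, the following Python sketch checks whether a file already follows the four-line FASTQ record layout (an `@` header, the sequence, a `+` separator, and a quality string of the same length). The function name `looks_like_fastq` and the `max_records` limit are made up for this example and are not part of BFAST or ill2fastq.pl. The check only looks at layout; it says nothing about which quality encoding the file uses.

```python
from itertools import islice


def looks_like_fastq(lines, max_records=1000):
    """Return True if the first records follow the 4-line FASTQ layout.

    `lines` is any iterable of text lines (an open file works).
    `max_records` is an arbitrary cap so huge files are not read in full.
    """
    it = iter(lines)
    seen = 0
    while seen < max_records:
        record = [line.rstrip("\r\n") for line in islice(it, 4)]
        if not record:
            break  # reached end of input
        if len(record) < 4:
            return False  # truncated record
        header, seq, plus, qual = record
        if not header.startswith("@") or not plus.startswith("+"):
            return False
        if len(seq) != len(qual):
            return False
        seen += 1
    return seen > 0
```

Passing an open file handle for something like s_7_1_sequence.txt would show whether the layout is already FASTQ-like; a tab-delimited qseq.txt file would fail the check on its first record.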


        • #5
          Hi, I am using your readgenerate scripts; they are very handy. However, I notice that the ID of the second read in a pair is the same as that of the first one, e.g.
          @readNum=1_strand=+_contig=17_pos=30714265_numends=2_pel=0_rl=36_wrv=1_si=-1_il=0_r1=000000000000000000000000000000000000_r2=000000000000000000000000000020020000
          GCTCTGAGTATCAGACACACCGTGGCCTCCCCAAGG
          +
          ::::::::::::::::::::::::::::::::::::
          @readNum=1_strand=+_contig=17_pos=30714265_numends=2_pel=0_rl=36_wrv=1_si=-1_il=0_r1=000000000000000000000000000000000000_r2=000000000000000000000000000020020000
          GGCCAAAGGGACACCGGTTTGACAACCAACAGCGTG
          +
          ::::::::::::::::::::::::::::::::::::

          There is no information specific to the second read. Did I do something wrong? How do I parse the second read's coordinates for later verification?
          Thanks.
          Last edited by tanghz; 09-20-2010, 08:36 AM.


          • #6
            Feel free to dig into the code on this one, as I am not supporting that read simulator very heavily; I would be happy to incorporate a patch, though. Otherwise, I would recommend the "dwgsim" tool within http://dnaa.sf.net, which I am supporting and actively maintaining.
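
For anyone who wants to inspect the simulated read names before digging into the simulator code, here is a small Python sketch that splits a name like the one quoted in this thread into its key=value fields. It assumes fields are separated by underscores and that values never contain an underscore, which holds for the example shown but is not confirmed for all output. The function name `parse_read_name` is hypothetical, and the meaning of the individual fields (pos, si, il and so on) is not documented in this thread, so the sketch only exposes them; it does not compute the second read's coordinates.

```python
def parse_read_name(header):
    """Split a simulated read name into a dict of its key=value fields.

    Assumes underscore-separated tokens of the form key=value, as in the
    example posted above. Raises ValueError on any token without '='.
    """
    name = header.strip().lstrip("@")
    fields = {}
    for token in name.split("_"):
        key, sep, value = token.partition("=")
        if not sep:
            raise ValueError("token without '=': %r" % token)
        fields[key] = value
    return fields
```

With the header from the post above, `parse_read_name(header)["pos"]` gives the string "30714265" and `["r2"]` gives the 36-character string for the second end.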


            • #7
              Dear nilshomer,
              thanks for your easy-to-use ill2fastq.pl script. I am working on a huge dataset and need to convert it from Illumina 1.3+ format to FASTQ, so I used this script. It worked well for the first 20 GB, then I got the following error:

              C:\path-to-file>perl ill2fastq.pl -s my_sequences > C:\path-to-file\file.fastq
              ON 0
              ON 1
              Unicode character 0xffffffffffffffff is illegal at ill2fastq.pl line 383, <FH_one> line 4.
              Unicode character 0xfffffffffffffffe is illegal at ill2fastq.pl line 383, <FH_one> line 4.
              Unicode character 0xffffffffffffffff is illegal at ill2fastq.pl line 383, <FH_one> line 4.
              Unicode character 0xffffffffffffffff is illegal at ill2fastq.pl line 383, <FH_one> line 4.
              Unicode character 0xfffffffffffffffe is illegal at ill2fastq.pl line 383, <FH_two> line 4.
              Unicode character 0xffffffffffffffff is illegal at ill2fastq.pl line 383, <FH_two> line 4.
              Unicode character 0xffffffffffffffff is illegal at ill2fastq.pl line 383, <FH_two> line 4.
              Unicode character 0xfffffffffffffffe is illegal at ill2fastq.pl line 383, <FH_two> line 4.
              Unicode character 0xfffffffffffffffe is illegal at ill2fastq.pl line 383, <FH_two> line 4.
              Unicode character 0xffffffffffffffff is illegal at ill2fastq.pl line 383, <FH_two> line 4.
              Malformed UTF-8 character (byte 0xff) in reverse at ill2fastq.pl line 397, <FH_two> line 4.
              vw'ε\ê↔█P@▌╚*⌂┴╤§Φ╒E▬ª↔_påZ(*ijJ┼⌂■{x⌂√■∩┐▓╖¢█╒7ⁿw²mu╡┌ⁿ╧╒*¡■U¶y^╥OVΣY^µYYû┘«,v♣
              ╛╦±Qαë▌ƒ╟┐■≈+╔╬╣εαú≈▒∩¢╛∩█╢∩╦≥☺▲╒cU5╧mk£i█√≡τ±»*$$▀▌▐É`╘½(Q╣,+±☺OE╢╦╦▌p→.ªX«▓
              ╢"uL ♥mysequences_2_sequence.txt ┤}█VδJ¼σ√∙ì~∞╤ú !↨╓╙1▲º}6ù►áê‼▀]σ*∩**é╓¡
              £└r♣♫8¼═Z►╪→èóƪñ⌐╩⌂■w·■÷╧*∙»Φm╛ܲ┴?╫⌂fW│┼*║·┐╫*◄G¢▼⌂ⁿ╟*>'╣║√∙╟⌂ⁿτêΣ⌡jNé‼5▒╩^≡/¶
              ▲╫xy+→'‼k∞♣7Sk<╗║a╖ê'╓╪♂╓zjìg7δ♂⌐∞%Oε╔½╡∟╛⌐=┘♂.⌂ß↑πV^,û$9ÜZσA▓■àÖGu^▄,.s·╝α║₧Xπ¢
              ò↑yjì╜αë5₧├▒^]$Å£H■╣╞A¥%zF╙δ|{æL☻Æτ╫╖↨╥┘Kn&≈ìkë∙‼▼└‼╔ô█y╣\\╞╠^≡─↓{■τz=╗♦*:
              Died at ill2fastq.pl line 229.
              I tried to figure out what happened here, but could only find hints that the problem lies in Perl's encoding of strings (http://jeremy.zawodny.com/blog/archives/010546.html and http://perldoc.perl.org/perldiag.htm...ter-%28%25s%29).
              Perhaps someone has an idea, or can provide a script that does the conversion quickly and correctly? Thanks a lot! Yours, Jenzo
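
For the quality conversion itself, a minimal Python sketch follows. It assumes the input is already in four-line FASTQ layout with Illumina 1.3+ qualities (ASCII offset 64) and rewrites them with the Sanger offset of 33. It works on raw bytes, so no text decoding is involved. The function names are hypothetical, and the sketch is not a replacement for ill2fastq.pl: it handles one file at a time and does nothing about barcodes or about combining the _1 and _2 files.

```python
# Translation table: bytes 64..126 (Phred+64) map to 33..95 (Phred+33).
# Every other byte maps to 0 so that invalid input can be detected.
_TABLE = bytes(b - 31 if 64 <= b <= 126 else 0 for b in range(256))


def illumina64_to_sanger33(qual):
    """Convert one Phred+64 quality string (bytes) to Phred+33."""
    converted = qual.translate(_TABLE)
    if 0 in converted:
        raise ValueError("quality string has bytes outside the Phred+64 range")
    return converted


def convert_stream(src, dst):
    """Copy FASTQ records from binary stream src to dst, recoding qualities.

    Assumes exactly four lines per record; every fourth line is treated
    as the quality string.
    """
    for line_no, line in enumerate(src):
        body = line.rstrip(b"\r\n")
        if line_no % 4 == 3:
            body = illumina64_to_sanger33(body)
        dst.write(body + b"\n")
```

Opening both files in binary mode (`open(path, "rb")` and `open(out_path, "wb")`) keeps the data as bytes from start to finish; a stray byte such as 0xff then surfaces as a `ValueError` on the affected quality line.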


              • #8
                ill2fastq.pl failed

                Hi,

                I am having difficulty using ill2fastq.pl. I have successfully used BFAST to align all of my SOLiD data, but cannot get step 1 to work for my Illumina data. I am using bfast-0.6.4e.

                This is what happens when I try to run the Perl script (my two files are named 100247_1_sequence.txt and 100247_2_sequence.txt):

                Code:
                $ perl ill2fastq.pl -s 100247
                ON 0
                Malformed UTF-8 character (byte 0xff) in reverse at ill2fastq.pl line 395, <FH_two> line 4.
                @HWUSI-E@HWUSI-EAS570R_0028:6:1:1311:1079#0/2
                Died at ill2fastq.pl line 227.

                If you can help me out that would be great! Thanks in advance,

                Kelly


                • #9
                  Googling "Malformed UTF-8 character" suggests that there is something wrong with your encoding. What is your platform/OS?


                  • #10
                    I have a 64-bit Linux machine running Red Hat. I just tried it again using bfast-0.6.5a and the same thing happened.


                    • #11
                      Can you try on a different machine?
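
One way to narrow down errors like the "Malformed UTF-8 character (byte 0xff)" messages reported in this thread is to look at the raw bytes of the input before running the converter. The Python sketch below does two things: it checks the first bytes of a file against the well-known gzip, zip, and bzip2 signatures, and it reports the first byte that is not plain ASCII text. Whether either applies to the files discussed here is only a guess; the function names are hypothetical and nothing here is part of BFAST.

```python
# Leading "magic" bytes of common compressed formats.
MAGIC = {
    b"\x1f\x8b": "gzip",
    b"PK\x03\x04": "zip",
    b"BZh": "bzip2",
}


def sniff_compression(head):
    """Return the name of a compressed format matching the leading bytes, or None."""
    for magic, name in MAGIC.items():
        if head.startswith(magic):
            return name
    return None


def first_non_ascii(stream):
    """Find the first byte that is not printable ASCII, tab, CR, or LF.

    `stream` is a binary file-like object. Returns a tuple
    (line_number, byte_offset_in_line, byte_value) with 1-based line
    numbers, or None if every byte is plain ASCII text.
    """
    for line_no, line in enumerate(stream, start=1):
        for offset, b in enumerate(line):
            if b > 0x7E or (b < 0x20 and b not in (0x09, 0x0A, 0x0D)):
                return line_no, offset, b
    return None
```

Reading the first few bytes with `open(path, "rb").read(4)` and passing them to `sniff_compression` shows whether the file is compressed rather than plain text; `first_non_ascii` on the opened file points to the record where the data stops being text.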
