
  • #31
    Hello,

    I am using your script SOLiD2Std.pl to convert 1100 genomes data to base space (FASTQ).

    The script doesn't convert characters after a dot.

    I want to convert the data to FASTQ format and align it using Stampy.

    Does anyone have a script that can do the conversion properly now?

    Comment


    • #32
      Originally posted by srividya22 View Post
      I am using your script SOLiD2Std.pl to convert 1100 genomes data to base space (FASTQ).

      The script doesn't convert characters after a dot.

      I want to convert the data to FASTQ format and align it using Stampy.

      Does anyone have a script that can do the conversion properly now?
      After a dot, you can't assume anything about the sequence: each colour encodes the transition from the previous base, so once a colour call is missing, every subsequent base is unknown and should be N in base space. Alignment should always be done in colour space to get the most information (and the fewest errors) from colour-space reads.
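
      To illustrate why everything after a dot becomes N, here is a minimal Perl sketch of colour-space decoding. It assumes the usual SOLiD read layout (one primer base followed by colours 0-3, with "." for a missing call); the helper name colour_to_base is hypothetical and is not taken from SOLiD2Std.pl.

      ```perl
      use strict;
      use warnings;

      # Two-base encoding: a colour is the transition between adjacent bases.
      # $next_base{PREVIOUS}[COLOUR] gives the following base.
      my %next_base = (
          A => [qw(A C G T)],
          C => [qw(C A T G)],
          G => [qw(G T A C)],
          T => [qw(T G C A)],
      );

      # Hypothetical helper: decode a colour-space read such as "T0120.31".
      # The leading primer base is used for decoding but not emitted.
      sub colour_to_base {
          my ($read) = @_;
          my ($prev, @colours) = split //, uc $read;
          my $bases = '';
          for my $c (@colours) {
              if (exists $next_base{$prev} && $c =~ /^[0-3]$/) {
                  $prev = $next_base{$prev}[$c];
              } else {
                  # Missing colour, or previous base already unknown:
                  # this base and everything after it is N.
                  $prev = 'N';
              }
              $bases .= $prev;
          }
          return $bases;
      }

      print colour_to_base('T0120.31'), "\n";   # prints TGAANNN
      ```

      The two colours after the dot are valid calls, but they can no longer be anchored to a known base, which is why they decode to N as well.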

      Comment


      • #33
        Originally posted by srividya22 View Post
        Hello,

        I am using your script SOLiD2Std.pl to convert 1100 genomes data to base space (FASTQ).

        The script doesn't convert characters after a dot.

        I want to convert the data to FASTQ format and align it using Stampy.

        Does anyone have a script that can do the conversion properly now?
        I have already fixed some bugs in it, so you can try it again.

        Comment


        • #34
          Downloading the SOLid2Std.pl file

          Hello BENM,

          Where can I get the most recent SOLid2Std.pl? I googled it but could not locate it. Can you please specify the path?

          Comment


          • #35
            Originally posted by BENM View Post
            Hi, pliang

            samt's question was "Convert SOLiD fastq to Illumina fastq", and Illumina FASTQ differs from standard (Sanger) FASTQ in its quality encoding.

            The syntax of the Solexa/Illumina read format is almost identical to the FASTQ format, but the qualities are scaled differently. Given a character $sq, the following Perl code gives the Phred quality $Q:

            $Q = 10 * log(1 + 10 ** ((ord($sq) - 64) / 10.0)) / log(10);
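
            As a quick check of the formula, the sketch below wraps it in a subroutine and prints the Phred quality for a few Solexa characters. The subroutine name solexa_char_to_phred is only an illustration and does not come from any of the scripts discussed in this thread.

            ```perl
            use strict;
            use warnings;

            # Hypothetical helper: Solexa quality character -> Phred quality (float).
            sub solexa_char_to_phred {
                my ($sq) = @_;
                my $solexa_q = ord($sq) - 64;    # Solexa quality uses an ASCII offset of 64
                return 10 * log(1 + 10 ** ($solexa_q / 10.0)) / log(10);
            }

            # Print character, Solexa quality and the corresponding Phred quality.
            for my $sq (';', '@', 'J', 'h') {
                printf "%s\tsolexa=%d\tphred=%.2f\n",
                    $sq, ord($sq) - 64, solexa_char_to_phred($sq);
            }
            ```

            The two scales differ most at low qualities (Solexa 0 is roughly Phred 3) and are nearly identical from about 10 upwards.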

            The ASCII characters in Solexa FASTQ mean the following (quality = decimal code - 64):
            Code:
            CHAR	DEC	QUALITY
            A	65	1
            B	66	2
            C	67	3
            D	68	4
            E	69	5
            F	70	6
            G	71	7
            H	72	8
            I	73	9
            J	74	10
            K	75	11
            L	76	12
            M	77	13
            N	78	14
            O	79	15
            P	80	16
            Q	81	17
            R	82	18
            S	83	19
            T	84	20
            U	85	21
            V	86	22
            W	87	23
            X	88	24
            Y	89	25
            Z	90	26
            [	91	27
            \	92	28
            ]	93	29
            ^	94	30
            _	95	31
            `	96	32
            a	97	33
            b	98	34
            c	99	35
            d	100	36
            e	101	37
            f	102	38
            g	103	39
            h	104	40
            ;	59	-5
            <	60	-4
            =	61	-3
            >	62	-2
            ?	63	-1
            @	64	0
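            In contrast to Solexa FASTQ, standard (Sanger) FASTQ stores the Phred quality Q as the ASCII character chr(Q + 33). The table below lists, for each Solexa quality from -64 to 64, its index in the conversion table (Solexa quality + 64) and the Sanger character it converts to:
            Code:
            CHAR    INDEX   SOLEXA_QUALITY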
            !       0       -64
            !       1       -63
            !       2       -62
            !       3       -61
            !       4       -60
            !       5       -59
            !       6       -58
            !       7       -57
            !       8       -56
            !       9       -55
            !       10      -54
            !       11      -53
            !       12      -52
            !       13      -51
            !       14      -50
            !       15      -49
            !       16      -48
            !       17      -47
            !       18      -46
            !       19      -45
            !       20      -44
            !       21      -43
            !       22      -42
            !       23      -41
            !       24      -40
            !       25      -39
            !       26      -38
            !       27      -37
            !       28      -36
            !       29      -35
            !       30      -34
            !       31      -33
            !       32      -32
            !       33      -31
            !       34      -30
            !       35      -29
            !       36      -28
            !       37      -27
            !       38      -26
            !       39      -25
            !       40      -24
            !       41      -23
            !       42      -22
            !       43      -21
            !       44      -20
            !       45      -19
            !       46      -18
            !       47      -17
            !       48      -16
            !       49      -15
            !       50      -14
            !       51      -13
            !       52      -12
            !       53      -11
            !       54      -10
            "       55      -9
            "       56      -8
            "       57      -7
            "       58      -6
            "       59      -5
            "       60      -4
            #       61      -3
            #       62      -2
            $       63      -1
            $       64      0
            %       65      1
            %       66      2
            &       67      3
            &       68      4
            '       69      5
            (       70      6
            )       71      7
            *       72      8
            +       73      9
            +       74      10
            ,       75      11
            -       76      12
            .       77      13
            /       78      14
            0       79      15
            1       80      16
            2       81      17
            3       82      18
            4       83      19
            5       84      20
            6       85      21
            7       86      22
            8       87      23
            9       88      24
            :       89      25
            ;       90      26
            <       91      27
            =       92      28
            >       93      29
            ?       94      30
            @       95      31
            A       96      32
            B       97      33
            C       98      34
            D       99      35
            E       100     36
            F       101     37
            G       102     38
            H       103     39
            I       104     40
            J       105     41
            K       106     42
            L       107     43
            M       108     44
            N       109     45
            O       110     46
            P       111     47
            Q       112     48
            R       113     49
            S       114     50
            T       115     51
            U       116     52
            V       117     53
            W       118     54
            X       119     55
            Y       120     56
            Z       121     57
            [       122     58
            \       123     59
            ]       124     60
            ^       125     61
            _       126     62
            `       127     63
            a       128     64
            So it is easy to convert Solexa quality to Sanger quality: you just need to build a conversion table in a Perl script, like this:
            # Solexa->Sanger quality conversion table, indexed by Solexa quality + 64
            my @conv_table;
            for (-64..64) {
                $conv_table[$_+64] = chr(int(33 + 10*log(1+10**($_/10.0))/log(10)+.499));
            }
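
            One way the table could then be applied to a whole quality line is sketched below. The subroutine name solexa_to_sanger is hypothetical, and the sketch assumes the input line really is Solexa-encoded (offset 64). Because the table index is Solexa quality + 64 and the Solexa quality is ord(char) - 64, the index is simply ord(char).

            ```perl
            use strict;
            use warnings;

            # Solexa->Sanger conversion table, indexed by Solexa quality + 64.
            my @conv_table;
            for my $q (-64 .. 64) {
                $conv_table[$q + 64] =
                    chr(int(33 + 10 * log(1 + 10 ** ($q / 10.0)) / log(10) + 0.499));
            }

            # Hypothetical helper: convert one Solexa-encoded quality string.
            sub solexa_to_sanger {
                my ($qual) = @_;
                my $out = '';
                for my $ch (split //, $qual) {
                    my $sanger = $conv_table[ord($ch)];   # index = (ord - 64) + 64
                    die "character '$ch' is outside the conversion table\n"
                        unless defined $sanger;
                    $out .= $sanger;
                }
                return $out;
            }

            print solexa_to_sanger(';@Jh'), "\n";   # prints "$+I
            ```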

            I am trying to write a universal script that converts between the Solexa/Illumina, SOLiD/ABI, 454/Roche and 3730/Sanger formats for different purposes, but I need to know your requirements first. After that, I will share it with you all.

            I hope this answers your question.
            BTW, I have attached SOLiD2std.pl for your question; it is SOLiD2Solexa.pl with a small change.
            Which FASTQ format does bowtie2 use: standard FASTQ or Solexa FASTQ? Thank you!

            Comment
