
  • #31
    Hello,

    I am using your script SOLiD2Std.pl to convert 1100 genomes data to base space (fastq).

    The script doesn't convert characters after a dot.

    I want to convert the data to fastq format and align it using Stampy.

    Does anyone have a script that can do the conversion properly now?



    • #32
      Originally posted by srividya22
      I am using your script SOLiD2Std.pl to convert 1100 genomes data to base space (fastq).

      The script doesn't convert characters after a dot.

      I want to convert the data to fastq format and align it using Stampy.

      Does anyone have a script that can do the conversion properly now?

      After a dot, you can't assume anything about the sequence. All subsequent bases should be N in base space. Alignment should always be done in colour-space to get the most information (and least error) from the colour-space sequence.
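
To illustrate the point about dots, here is a minimal Python sketch of colour-space decoding (not the SOLiD2Std.pl code; the function name `decode_colourspace` is hypothetical). It assumes the usual SOLiD read layout of one primer base followed by colours 0-3, drops the primer base from the output, and emits N for a dot and for every position after it, because each base depends on the one before:

```python
# For each previous base, the string gives the next base for colours 0, 1, 2, 3.
TRANSITIONS = {
    "A": "ACGT",
    "C": "CATG",
    "G": "GTAC",
    "T": "TGCA",
}

def decode_colourspace(read):
    """Decode a colour-space read (primer base + colours) to base space.

    Hypothetical helper for illustration: the primer base is not emitted,
    and everything from the first unknown colour onward becomes N.
    """
    if not read:
        return ""
    prev = read[0].upper()
    bases = []
    for colour in read[1:]:
        if prev not in TRANSITIONS or colour not in "0123":
            # Unknown colour (e.g. '.') or unknown previous base:
            # this base and all later ones cannot be determined.
            prev = "N"
        else:
            prev = TRANSITIONS[prev][int(colour)]
        bases.append(prev)
    return "".join(bases)

print(decode_colourspace("T0123"))
print(decode_colourspace("T01.23"))
```

This is also why aligning in colour-space is preferable: a single bad colour costs one position there, but the whole tail of the read after translation.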



      • #33
        Originally posted by srividya22
        Hello,

        I am using your script SOLiD2Std.pl to convert 1100 genomes data to base space (fastq).

        The script doesn't convert characters after a dot.

        I want to convert the data to fastq format and align it using Stampy.

        Does anyone have a script that can do the conversion properly now?

        I have already fixed some bugs in it; you can try it again.



        • #34
          Downloading the SOLiD2Std.pl file

          Hello BENM,

          Where can I get the latest SOLiD2Std.pl? I googled it but could not locate it. Can you please specify the path?



          • #35
            Originally posted by BENM
            Hi, pliang

            samt's question was "Convert SOLiD fastq to Illumina fastq", and Illumina FASTQ differs from standard (Sanger) FASTQ in its quality encoding.

            The syntax of Solexa/Illumina read format is almost identical to the FASTQ format, but the qualities are scaled differently. Given a character $sq, the following Perl code gives the Phred quality $Q:

            $Q = 10 * log(1 + 10 ** ((ord($sq) - 64) / 10.0)) / log(10);
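
As a cross-check of the formula above, here is a minimal Python sketch of the same calculation (Python is used only for illustration; the function name `solexa_char_to_phred` is hypothetical). It assumes the Solexa score is stored as ASCII 64 + Q, as in the Perl line:

```python
import math

def solexa_char_to_phred(sq):
    """Convert one Solexa FASTQ quality character to a Phred quality (float)."""
    solexa_q = ord(sq) - 64  # Solexa scores are stored as ASCII 64 + Q
    return 10 * math.log10(1 + 10 ** (solexa_q / 10.0))

# Print character, Solexa quality and the corresponding Phred quality.
for ch in (";", "@", "J", "h"):
    print(ch, ord(ch) - 64, round(solexa_char_to_phred(ch), 2))
```

The two scales differ mostly at the low end (Solexa 0 is about Phred 3) and are nearly identical for high qualities.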

            The ASCII characters in Solexa FASTQ mean:
            Code:
            CHAR	DEC	QUALITY
            A	65	1
            B	66	2
            C	67	3
            D	68	4
            E	69	5
            F	70	6
            G	71	7
            H	72	8
            I	73	9
            J	74	10
            K	75	11
            L	76	12
            M	77	13
            N	78	14
            O	79	15
            P	80	16
            Q	81	17
            R	82	18
            S	83	19
            T	84	20
            U	85	21
            V	86	22
            W	87	23
            X	88	24
            Y	89	25
            Z	90	26
            [	91	27
            \	92	28
            ]	93	29
            ^	94	30
            _	95	31
            `	96	32
            a	97	33
            b	98	34
            c	99	35
            d	100	36
            e	101	37
            f	102	38
            g	103	39
            h	104	40
            ;	59	-5
            <	60	-4
            =	61	-3
            >	62	-2
            ?	63	-1
            @	64	0
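            In contrast, standard (Sanger) FASTQ stores the Phred quality as ASCII 33 + Q. The table below shows the conversion: QUALITY is the Solexa quality (-64 to 64), DEC is the index into the conversion table (QUALITY + 64), and CHAR is the Sanger character it converts to: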
            Code:
            CHAR	DEC	QUALITY
            !       0       -64
            !       1       -63
            !       2       -62
            !       3       -61
            !       4       -60
            !       5       -59
            !       6       -58
            !       7       -57
            !       8       -56
            !       9       -55
            !       10      -54
            !       11      -53
            !       12      -52
            !       13      -51
            !       14      -50
            !       15      -49
            !       16      -48
            !       17      -47
            !       18      -46
            !       19      -45
            !       20      -44
            !       21      -43
            !       22      -42
            !       23      -41
            !       24      -40
            !       25      -39
            !       26      -38
            !       27      -37
            !       28      -36
            !       29      -35
            !       30      -34
            !       31      -33
            !       32      -32
            !       33      -31
            !       34      -30
            !       35      -29
            !       36      -28
            !       37      -27
            !       38      -26
            !       39      -25
            !       40      -24
            !       41      -23
            !       42      -22
            !       43      -21
            !       44      -20
            !       45      -19
            !       46      -18
            !       47      -17
            !       48      -16
            !       49      -15
            !       50      -14
            !       51      -13
            !       52      -12
            !       53      -11
            !       54      -10
            "       55      -9
            "       56      -8
            "       57      -7
            "       58      -6
            "       59      -5
            "       60      -4
            #       61      -3
            #       62      -2
            $       63      -1
            $       64      0
            %       65      1
            %       66      2
            &       67      3
            &       68      4
            '       69      5
            (       70      6
            )       71      7
            *       72      8
            +       73      9
            +       74      10
            ,       75      11
            -       76      12
            .       77      13
            /       78      14
            0       79      15
            1       80      16
            2       81      17
            3       82      18
            4       83      19
            5       84      20
            6       85      21
            7       86      22
            8       87      23
            9       88      24
            :       89      25
            ;       90      26
            <       91      27
            =       92      28
            >       93      29
            ?       94      30
            @       95      31
            A       96      32
            B       97      33
            C       98      34
            D       99      35
            E       100     36
            F       101     37
            G       102     38
            H       103     39
            I       104     40
            J       105     41
            K       106     42
            L       107     43
            M       108     44
            N       109     45
            O       110     46
            P       111     47
            Q       112     48
            R       113     49
            S       114     50
            T       115     51
            U       116     52
            V       117     53
            W       118     54
            X       119     55
            Y       120     56
            Z       121     57
            [       122     58
            \       123     59
            ]       124     60
            ^       125     61
            _       126     62
            `       127     63
            a       128     64
            So it is easy to convert Solexa->Sanger quality: you just need to build a conversion table in a Perl script, like this:
            # Solexa->Sanger quality conversion table
            # Index = Solexa quality + 64; value = Sanger (ASCII 33 + Phred) character
            my @conv_table;
            for (-64..64) {
                $conv_table[$_+64] = chr(int(33 + 10*log(1+10**($_/10.0))/log(10)+.499));
            }
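
For readers who want to apply the table, here is a minimal Python sketch of the same conversion (an illustrative equivalent, not the attached script; `CONV_TABLE` and `solexa_to_sanger` are hypothetical names). It keeps the same `+ 0.499` rounding as the Perl loop and assumes every input character has an ASCII code between 0 and 128:

```python
import math

# Index i corresponds to Solexa quality i - 64, as in the Perl table above.
CONV_TABLE = [
    chr(int(33 + 10 * math.log10(1 + 10 ** (q / 10.0)) + 0.499))
    for q in range(-64, 65)
]

def solexa_to_sanger(qual):
    """Translate a Solexa (ASCII 64 + Q) quality string to Sanger (ASCII 33 + Q)."""
    # Solexa quality is ord(c) - 64 and the table index is quality + 64,
    # so the index is simply ord(c).
    return "".join(CONV_TABLE[ord(c)] for c in qual)

print(solexa_to_sanger(";@JTh"))
```

Because the table is built once, converting a whole file costs one list lookup per quality character.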

            I am trying to write a universal script for converting between Solexa/Illumina, SOLiD/ABi, 454/Roche, 3730/Sanger, ... formats for different purposes, but I need to know your requirements first. After that, I will share it with you all.

            Hope this answers your question.
            BTW, I have attached SOLiD2std.pl for your question; it is just SOLiD2Solexa.pl with a small change.

            Which FASTQ format does bowtie2 use: standard FASTQ or Solexa FASTQ? Thank you!
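
For reference, Bowtie 2 expects standard (Phred+33) qualities by default and has `--phred64` and `--solexa-quals` options for the older encodings. If it is unclear which encoding a file uses, a rough check on the quality characters can help. The sketch below is only a heuristic under stated assumptions (the function name `guess_quality_encoding` is hypothetical): it looks at the lowest character seen, so a Sanger file that happens to contain only high qualities would be misclassified.

```python
def guess_quality_encoding(quality_lines):
    """Guess a FASTQ quality encoding from the lowest character observed.

    Heuristic only: characters below ';' (ASCII 59) cannot occur in the
    64-offset encodings, and ';' to '?' (59-63) are Solexa scores -5 to -1.
    """
    lowest = min(
        (ord(c) for line in quality_lines for c in line),
        default=None,
    )
    if lowest is None:
        raise ValueError("no quality characters supplied")
    if lowest < 59:
        return "phred33"   # standard / Sanger
    if lowest < 64:
        return "solexa64"  # Solexa scores, ASCII 64 + Q, may be negative
    return "phred64"       # Phred scores, ASCII 64 + Q

print(guess_quality_encoding(["IIII5+$\""]))
```

Scanning a few thousand reads is usually enough to see the low end of the range.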

