Thread | Thread Starter | Forum | Replies | Last Post |
454 quality score, z-score,.. | nii | 454 Pyrosequencing | 4 | 10-15-2020 07:29 AM |
Questions about solexa quality score! | baohua100 | Bioinformatics | 24 | 10-11-2020 07:43 AM |
Two Version of Solexa Quality Score Formula | foolishbrat | Bioinformatics | 1 | 02-24-2009 02:59 AM |
Fastq quliaty score and MAQ output quality score | baohua100 | Bioinformatics | 1 | 02-19-2009 10:21 AM |
Questions about solexa quality score | baohua100 | Bioinformatics | 1 | 06-17-2008 09:09 AM |
#1 |
Member
Location: South East Asia Join Date: Nov 2008
Posts: 44
Dear all,
We usually see quality data like this for a Solexa tag. Code:
-33 31 -40 -34 -40 -40 -40 40 27 -27 -40 -40
These quality values refer to a tag of length 3 (e.g. "tca"). My questions are as follows:
#2 |
Senior Member
Location: San Diego Join Date: May 2008
Posts: 912
Averaging would be bad. Each number in a set of 4 is the score for A, C, G, or T respectively. So the sequence for your little bit there is CTA: in the first set of four the second number (C) is the highest, in the second set the fourth number (T) is the highest, and in the third set the first number (A) is the highest.
The scores are Solexa quality scores, which are not exactly the same as Sanger quality scores, though when the score is > 15 the two are virtually identical. There is an equation for converting Solexa scores to Sanger scores, and another that tells you what error rate a given Sanger quality score is supposed to correspond to. A lot of alignment programs don't use the quality scores at all during alignment, though they will output the quality scores of mismatches, which helps you judge how likely it is that the mismatch is a real polymorphism and not an error. But read depth probably tells you more than quality scores when it comes to SNPs.
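To make the reply above concrete, here is a minimal Python sketch of the two steps it describes: picking the highest of the four A/C/G/T scores per cycle, and converting a Solexa score to a Sanger (Phred) score and then to an error rate. The function names are made up for illustration, and the whitespace-separated layout of the prb line is assumed from the snippet in post #1. The conversion used is the standard relation Q_phred = 10 * log10(10^(Q_solexa/10) + 1), with error rate p = 10^(-Q_phred/10).

```python
import math

BASES = "ACGT"  # order of the four scores in each group, as described in the reply


def call_bases(prb_line):
    """Split a prb line into groups of four scores and keep the best base per group."""
    scores = [int(tok) for tok in prb_line.split()]
    if len(scores) % 4 != 0:
        raise ValueError("expected four scores (A, C, G, T) per cycle")
    calls = []
    for i in range(0, len(scores), 4):
        quartet = scores[i:i + 4]
        best = max(range(4), key=lambda k: quartet[k])  # index of the highest score
        calls.append((BASES[best], quartet[best]))
    return calls


def solexa_to_phred(q_solexa):
    """Convert a Solexa quality score to a Sanger/Phred quality score."""
    return 10 * math.log10(10 ** (q_solexa / 10) + 1)


def phred_to_error(q_phred):
    """Expected error probability for a Sanger/Phred quality score."""
    return 10 ** (-q_phred / 10)


calls = call_bases("-33 31 -40 -34 -40 -40 -40 40 27 -27 -40 -40")
print("".join(base for base, _ in calls))  # CTA
for base, q_solexa in calls:
    q_phred = solexa_to_phred(q_solexa)
    print(base, q_solexa, round(q_phred, 2), phred_to_error(q_phred))
```

For the scores in this example (27, 31 and 40) the converted Phred values differ from the Solexa values by less than 0.01, which is the "virtually identical above 15" behaviour mentioned in the reply; the two scales only diverge noticeably for low scores.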
#3 | |
Member
Location: northern hemisphere Join Date: Mar 2008
Posts: 50
In general people use the fastq files generated by the Gerald step of the GAPipeline. These files contain the base calls and an associated quality score (which is an estimate of how good the software thinks its guess is). Most short read aligners use fastq files as their input, and many (for example Maq) use this information to help find the correct alignment position. Fastq files look like this:
@complete:333:89
CGCCTTCGTATGTTTATCCTGCTTATCACATACTA
+complete:333:89
132057787<:9133*9,.65177;54;8)3)37/
The line following the @ line contains the sequence, and the line following the + line contains one ASCII character per base, each encoding a quality score. There's a table here: http://www.genographia.org/portal/to...sheet.pdf/view to convert this to a "probability of error".
Quality scores are also useful in SNP calling: you need more bases of low quality than of high quality to call a SNP with confidence. You can also filter reads based on quality score in order to discard junk reads. All in all they are quite handy, but you should make sure they are correctly calibrated (and therefore accurately assigned).
The prb file you've shown contains 4 quality scores for each base. So rather than just getting the probability that the called base is right, you also get probabilities for each of the other bases. For example, you would be able to say "it was probably an A or a C, but it's very unlikely it was a G or a T". That might be useful information, and some aligners are starting to take advantage of it, but it has not been fully exploited. However, don't get too attached to these prb files, as I believe they are set to disappear from the latest version of the GAPipeline.
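As a rough illustration of the fastq layout and the quality filtering described above, the following Python sketch decodes the example record. The function names and the filter threshold are made up for illustration. The ASCII offset of 33 is an assumption: it is the Sanger convention and gives plausible scores for this particular record, but Solexa/Illumina pipeline files of that period often used an offset of 64, so check which convention your own files follow.

```python
def parse_fastq_record(lines):
    """Parse one four-line FASTQ record into (name, sequence, quality string)."""
    header, seq, plus, qual = (ln.strip() for ln in lines)
    if not header.startswith("@") or not plus.startswith("+"):
        raise ValueError("not a four-line FASTQ record")
    if len(seq) != len(qual):
        raise ValueError("sequence and quality lengths differ")
    return header[1:], seq, qual


def decode_qualities(qual, offset=33):
    """Turn each quality character into an integer score (offset is an assumption)."""
    return [ord(ch) - offset for ch in qual]


def error_probabilities(scores):
    """Per-base error probability, treating the scores as Phred-scaled."""
    return [10 ** (-q / 10) for q in scores]


def passes_filter(scores, min_mean):
    """Keep a read only if its mean quality reaches a chosen (hypothetical) cutoff."""
    return sum(scores) / len(scores) >= min_mean


record = [
    "@complete:333:89",
    "CGCCTTCGTATGTTTATCCTGCTTATCACATACTA",
    "+complete:333:89",
    "132057787<:9133*9,.65177;54;8)3)37/",
]
name, seq, qual = parse_fastq_record(record)
scores = decode_qualities(qual)
probs = error_probabilities(scores)
print(name, len(seq), min(scores), max(scores))
print("keep read:", passes_filter(scores, min_mean=15))
```

Filtering on the mean is only one possible choice; trimming low-quality ends or requiring a minimum score at every position are common alternatives.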