SEQanswers

Old 02-07-2012, 09:29 PM   #1
huangjun
Member
 
Location: Wuhan China

Join Date: Dec 2011
Posts: 13
Illumina sequencing Quality error

Hi, everyone!
I recently received some RNA-seq data. When I use the FASTX-Toolkit to preprocess it, the tool reports that my data are invalid (quality < 0). Has anyone encountered this problem before? Thank you!
Old 02-07-2012, 10:09 PM   #2
ulz_peter
Senior Member
 
Location: Graz, Austria

Join Date: Feb 2010
Posts: 215
Default

Sounds to me as if FastX is expecting a different quality encoding than the one in your samples. Which platform does your data come from? What did you want to do with the FastX Toolkit? (I must admit I am not too experienced with that toolkit.)
Old 02-07-2012, 11:02 PM   #3
gringer
David Eccles (gringer)
 
Location: Wellington, New Zealand

Join Date: May 2011
Posts: 550
Default

Append the '-Q 33' option to the fastX tools to make them work with quality scores that have a base (ASCII offset) of 33. This is undocumented in the command-line help for the tools, but should work with all tools that demand particular quality values.
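To illustrate why the base matters, here is a minimal Python sketch (not part of the toolkit; the function name and the example quality string are made up for illustration). It decodes the same quality string with a base of 33 and with a base of 64. Assuming the tool defaults to a base of 64, characters from the lower ASCII range decode to negative values, which would explain a "quality < 0" complaint.

```python
def decode_qualities(quality_string, offset):
    """Convert a FASTQ quality string to integer scores for a given ASCII offset."""
    return [ord(ch) - offset for ch in quality_string]


# Hypothetical quality string using characters from the base-33 range.
example = "IIIIHGF#"

as_base33 = decode_qualities(example, 33)
as_base64 = decode_qualities(example, 64)

print(as_base33)           # scores when the string is read with base 33
print(as_base64)           # the same characters read with base 64
print(min(as_base64) < 0)  # True when the wrong base yields negative values
```

The '#' character has ASCII code 35, so it decodes to 2 with a base of 33 but to -29 with a base of 64.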
Old 02-08-2012, 03:51 AM   #4
huangjun
Member
 
Location: Wuhan China

Join Date: Dec 2011
Posts: 13
Default

My data come from an Illumina HiSeq 2000. The FASTX-Toolkit can trim low-quality bases, cut them off, and mask the reads.
Old 02-08-2012, 03:55 AM   #5
huangjun
Member
 
Location: Wuhan China

Join Date: Dec 2011
Posts: 13
Default

Quote:
Originally Posted by gringer View Post
append the '-Q 33' command to fastX tools to allow them to work with quality scores that have a base of 33. This is undocumented in the command-line help for the tools, but should work with all tools that demand particular quality values.
Hi gringer, if my reads contain more than one base with a quality lower than 0, or if this affects a large number of reads, how can we know which parameter to append?
Old 02-08-2012, 04:52 AM   #6
gringer
David Eccles (gringer)
 
Location: Wellington, New Zealand

Join Date: May 2011
Posts: 550
Default

Quote:
Hi gringer, if my reads contain more than one base with a quality lower than 0, or if this affects a large number of reads, how can we know which parameter to append?
My experience with the fastx toolkit is that it doesn't support quality values lower than 0. If you have any in there, then you have to use -Q 33 and make the adjustment manually.
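One rough way to check which case applies is to look at the lowest quality character that occurs in the file. The sketch below is a heuristic only, assuming the usual ASCII starting points (33 for Phred+33, 59 for Solexa+64, 64 for Phred+64); the function name and return labels are hypothetical, and data containing only high-quality characters cannot be told apart this way.

```python
def guess_encoding(quality_strings):
    """Guess the quality encoding from the lowest ASCII code seen (heuristic only)."""
    codes = [ord(ch) for q in quality_strings for ch in q]
    if not codes:
        return "unknown"
    lowest = min(codes)
    if lowest < 59:
        # Characters below ';' only occur in base-33 encodings.
        return "Phred+33"
    if lowest < 64:
        # Characters from ';' to '?' correspond to negative Solexa scores.
        return "Solexa+64"
    # Nothing below '@' was seen: probably base 64, but not certain.
    return "Phred+64 (ambiguous)"


print(guess_encoding(["IIIIHGF#"]))
print(guess_encoding(["hhhh;gfe"]))
print(guess_encoding(["hhhhBgfe"]))
```

In practice this would be run over the fourth line of each FASTQ record for a reasonably large sample of reads, since a handful of reads may not contain any low-quality characters.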
Old 02-11-2012, 08:32 PM   #7
huangjun
Member
 
Location: Wuhan China

Join Date: Dec 2011
Posts: 13
Default

Hi gringer, what does "make the adjustment manually" mean? Thank you!
Old 02-11-2012, 09:52 PM   #8
gringer
David Eccles (gringer)
 
Location: Wellington, New Zealand

Join Date: May 2011
Posts: 550
Default

Here's the table from Wikipedia:
Code:
  SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS.....................................................
  ..........................XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX......................
  ...............................IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII......................
  .................................JJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ......................
  LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL....................................................
  !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
  |                         |    |        |                              |                     |
 33                        59   64       73                            104                   126

 S - Sanger        Phred+33,  raw reads typically (0, 40)
 X - Solexa        Solexa+64, raw reads typically (-5, 40)
 I - Illumina 1.3+ Phred+64,  raw reads typically (0, 40)
 J - Illumina 1.5+ Phred+64,  raw reads typically (3, 40)
    with 0=unused, 1=unused, 2=Read Segment Quality Control Indicator
 L - Illumina 1.8+ Phred+33,  raw reads typically (0, 41)
The only negative quality values here are in the old Solexa reads. These start at ASCII 59, so you could either use 59 as the base for the fastx-toolkit and add 5 to whatever you use for quality thresholds (e.g. mask at q25 rather than q20), or use 33 as the base for everything and add the difference of 31 (e.g. mask at q51 rather than q20). This mental arithmetic is what I mean by doing it manually.
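The same arithmetic can be written out as a small Python sketch. The function and constant names are made up for illustration; the only assumption is that the data are Solexa+64 encoded, as in the table.

```python
SOLEXA_OFFSET = 64  # ASCII offset of the Solexa+64 encoding, per the table


def adjusted_threshold(desired_quality, tool_base, data_offset=SOLEXA_OFFSET):
    """Threshold to give the tool so that it acts at the desired true quality.

    tool_base is the base passed to the tool (e.g. 59 or 33); the shift is
    the difference between the data's real offset and that base.
    """
    return desired_quality + (data_offset - tool_base)


print(adjusted_threshold(20, 59))  # base 59: add 5
print(adjusted_threshold(20, 33))  # base 33: add 31
```

A true quality of 20 in Solexa+64 data is the character with ASCII code 84, which reads as 25 with a base of 59 and as 51 with a base of 33.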