SEQanswers

11-09-2011, 12:15 AM   #1
Aicen
Junior Member
 
Location: Germany

Join Date: Oct 2011
Posts: 7
I converted Illumina FASTQ into Sanger FASTQ, need advice

Hello dear NGS community,
I am new to this forum but have already read many threads, which helped me a lot.
I found many ways in this forum to convert Illumina FASTQ quality scores into Sanger FASTQ Phred scores. My data comes from a sequencer which uses Illumina 1.5 encoding (thanks to FastQC for telling me). For my Diploma thesis (I am the last of my kind with a Diploma) I am writing a pipeline script in Ruby that uses bwa, samtools, GATK and Picard. My Prof. wants me to convert all FASTQ files to Sanger FASTQ. I read about BioRuby, MAQ and other tools, but came to the conclusion that I want to write the conversion myself, so that users of the script won't need to install even more tools or patch bwa to use it. That's why I experimented with ASCII codes in Ruby and got some results, which I would like to double-check with your comments.

My results:
Here is an example read:
"NACGTTATACTTGTTAGCACAATCCAAGCTAGGCTAAGAAGTTCAAACATGGTGGACGTACCCACTGATCTTTTG "

illumina 1.5 score
"BIKKGQNMLL[[[[[Y[[[[_______________YYYYYYYYYY[[[[[[Y[[YY[[[[_____________QQ"
(in numbers)
66 73 75 75 71 81 78 77 76 76 91 91 91 91 91 89 91 91 91 91 95 95 95 95 95 95 95 95 95 95 95 95 95 95 95 89 89 89 89 89 89 89 89 89 89 91 91 91 91 91 91 89 91 91 89 89 91 91 91 91 95 95 95 95 95 95 95 95 95 95 95 95 95 81 81
sanger score
"#*,,(2/.--<<<<<:<<<<@@@@@@@@@@@@@@@::::::::::<<<<<<:<<::<<<<@@@@@@@@@@@@@22"
(in numbers)
35 42 44 44 40 50 47 46 45 45 60 60 60 60 60 58 60 60 60 60 64 64 64 64 64 64 64 64 64 64 64 64 64 64 64 58 58 58 58 58 58 58 58 58 58 60 60 60 60 60 60 58 60 60 58 58 60 60 60 60 64 64 64 64 64 64 64 64 64 64 64 64 64 50 50
I got the Sanger scores from a thread in this forum that uses a command line to convert them in BAM files (I couldn't find the thread again):
samtools view -h chrYvs48_2_1_KESC1_mymod_48_2_2_KESC1_mymod.bam | perl -lane '$"="\t"; if (/^@/) {print;} else {$F[10]=~ tr/\x40-\xff\x00-\x3f/\x21-\xe0\x21/;print "@F"}' | samtools view -Sbh - > Phred_score.bam
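
For readers who do not use Perl, here is a rough Python sketch of what the `tr` in that command appears to do to the quality column: bytes 0x40 to 0xff are shifted down by 31, and anything below 0x40 collapses to '!' (0x21). The names `TABLE` and `illumina_to_sanger` are made up for illustration and are not part of any tool mentioned in this thread.

```python
# Lookup table mirroring tr/\x40-\xff\x00-\x3f/\x21-\xe0\x21/ :
# bytes >= 0x40 are shifted down by 31, bytes below 0x40 all become '!' (0x21).
TABLE = bytes(b - 31 if b >= 0x40 else 0x21 for b in range(256))

def illumina_to_sanger(qual: str) -> str:
    """Shift a Phred+64 quality string to Phred+33 (hypothetical helper)."""
    return qual.encode("latin-1").translate(TABLE).decode("latin-1")

# First ten quality characters of the example read above.
print(illumina_to_sanger("BIKKGQNMLL"))  # prints #*,,(2/.--
```

Note that the clamping to '!' silently hides input that was never Phred+64 in the first place, so it may be worth checking the encoding before converting.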

So my question is: can I simply subtract 31 from the numbers to get a Sanger quality score? And there was something about offsets, if I remember correctly... I would then convert these numbers back into ASCII and replace the scores in the FASTQ file with them.
Is this the correct way, or where did I make mistakes?
Thank you in advance, Alex
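
A quick way to sanity-check the arithmetic on the example read is sketched below in Python (the same idea carries over to Ruby). It assumes the usual offsets, 64 for Illumina 1.5 and 33 for Sanger, so the difference between the two encodings is 64 - 33 = 31. The variable names are only illustrative.

```python
# Quality strings copied from the example read in this post.
illumina = "BIKKGQNMLL[[[[[Y[[[[_______________YYYYYYYYYY[[[[[[Y[[YY[[[[_____________QQ"
sanger = "#*,,(2/.--<<<<<:<<<<@@@@@@@@@@@@@@@::::::::::<<<<<<:<<::<<<<@@@@@@@@@@@@@22"

# Subtract 31 from each ASCII code and turn the result back into a character.
shifted = "".join(chr(ord(c) - 31) for c in illumina)

# Phred scores are the ASCII codes minus the encoding offset.
phred_from_illumina = [ord(c) - 64 for c in illumina]
phred_from_shifted = [ord(c) - 33 for c in shifted]

print(shifted == sanger)                          # do the two strings agree?
print(phred_from_illumina == phred_from_shifted)  # same Phred scores either way?
print(min(phred_from_illumina), max(phred_from_illumina))
```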

Last edited by Aicen; 11-09-2011 at 01:32 AM.
11-09-2011, 12:37 AM   #2
RDW
Member
 
Location: London

Join Date: Oct 2008
Posts: 63

The current version of bwa has a '-I' option, which will read files with 1.5 encoding directly. You might want to discuss with your Prof. whether it would be more appropriate to use this rather than converting (will the fastq files be used for anything else that assumes Sanger encoding after you've run bwa?).
11-09-2011, 01:18 AM   #3
Aicen
Junior Member
 
Location: Germany

Join Date: Oct 2011
Posts: 7
bwa

My current bwa version is 0.5.9, so you're right. But the idea is also that users can choose to use my script just to convert FASTQ files into Sanger-formatted files, if they want to.

To be honest, I don't know whether the tools I use assume that FASTQ files are in Sanger format, but it seems to me that Sanger will become the standard score format in the future, so I thought it would be a good idea to add a function that reformats FASTQ files accordingly.

The tools I use, as described above, are samtools, GATK and Picard tools (MarkDuplicates).
11-09-2011, 01:26 AM   #4
ulz_peter
Senior Member
 
Location: Graz, Austria

Join Date: Feb 2010
Posts: 219

Quote:
Originally Posted by Aicen View Post


So my question is: can I simply add +31 to the numbers to get a Sanger quality score? And there was something about offsets, if I remember correctly... I would then convert these numbers back into ASCII and replace the scores in the FASTQ file with them.
Is this the correct way, or where did I make mistakes?
Thank you in advance, Alex
I guess you mean subtract 31?
Anyway, you can do that, but Illumina 1.5 quality uses the "B" mark (I think it is ASCII 66) as a flag which tells you not to use that base for analysis. And AFAIK ASCII 67 is never used. So mere subtraction yields something like Sanger, but you have to bear that in mind. (Of course, the B characters will have a very low Phred score after conversion, so most downstream programs will take that into account.)

Hope that helps,
Peter
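
As a rough illustration of the point above, the Python sketch below converts four-line FASTQ records from offset 64 to offset 33 and counts how many "B"-flagged bases it saw along the way. The function names are hypothetical, and the sketch assumes that the input really is Illumina 1.3+/1.5 encoded and that every record occupies exactly four lines with no blank lines in between.

```python
import io

OFFSET_SHIFT = 64 - 33  # Illumina 1.3+/1.5 uses offset 64, Sanger uses offset 33

def convert_record(lines):
    """Convert one 4-line FASTQ record (hypothetical helper)."""
    header, seq, plus, qual = lines
    # A character below ASCII 64 cannot occur in Phred+64 data.
    if any(ord(c) < 64 for c in qual):
        raise ValueError("quality below ASCII 64: input is probably not Phred+64")
    new_qual = "".join(chr(ord(c) - OFFSET_SHIFT) for c in qual)
    return [header, seq, plus, new_qual], qual.count("B")

def convert_fastq(src, dst):
    """Stream records from src to dst; return the number of 'B'-flagged bases."""
    flagged = 0
    while True:
        record = [src.readline().rstrip("\n") for _ in range(4)]
        if not record[0]:  # empty header line: end of input
            break
        converted, n_b = convert_record(record)
        flagged += n_b
        dst.write("\n".join(converted) + "\n")
    return flagged

# Tiny in-memory example; real use would pass open file handles instead.
src = io.StringIO("@read1\nNACGT\n+\nBIKKG\n")
dst = io.StringIO()
print(convert_fastq(src, dst))  # number of 'B' flags seen
print(dst.getvalue(), end="")
```

After conversion the "B" flags become "#" (Phred 2), so the information that they were flags rather than measured qualities is only preserved if it is reported separately, as the counter does here.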
11-09-2011, 01:31 AM   #5
Aicen
Junior Member
 
Location: Germany

Join Date: Oct 2011
Posts: 7
You're right

Thanks, I will fix this in my post.
08-27-2012, 06:24 AM   #6
soban
Junior Member
 
Location: sweden

Join Date: Nov 2011
Posts: 5

I am learning NGS. We have set up our own Galaxy instance, and I need to run BWA, for which my data set files must be converted from fastq to fastqsanger. I cannot find a way to convert them; any help would be appreciated.

Tags
bwa, fastq, fastq quality, illumina
