SEQanswers

Old 01-26-2016, 04:50 AM   #1
cklopp
Member
 
Location: Toulouse France

Join Date: Sep 2009
Posts: 12
How do I get the RQ= information in the header of the PB fastq or fasta files

Hi,

I've extracted fastq and fasta files from bas.h5 files using bash5tools.py, but the output files do not contain the RQ= information in the header.

Which tool should I use to collect this information?

Christophe
Old 01-27-2016, 02:34 AM   #2
rafalwoycicki
Junior Member
 
Location: Europe

Join Date: Apr 2009
Posts: 1
SMRT Analysis

Hi,
You should use SMRT Analysis for this. bash5tools.py does not output this information. If you do not have SMRT Analysis, ask your sequence provider to extract the subreads for you with it.

Cheers,
Rafal
Old 01-27-2016, 02:37 AM   #3
cklopp
Member
 
Location: Toulouse France

Join Date: Sep 2009
Posts: 12

Thank you for your reply. How can I do this from the command line?

Cheers,
Christophe
Old 01-27-2016, 04:41 AM   #4
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,585

@rhall from PacBio participates in the forum and should be able to tell you whether this can be done on the command line. @cklopp: You may want to send Dr. Hall a PM with a link to this thread.
Old 01-27-2016, 10:01 AM   #5
rhall
Senior Member
 
Location: San Francisco

Join Date: Aug 2012
Posts: 312

You could estimate the RQ from the base quality values in the fastq file. It won't be exactly the same, but it would be close enough for any use case I can think of. Alternatively, you can use the pbcore library to write a simple conversion tool yourself that includes the information in the output header: https://github.com/PacificBiosciences/pbcore.
In reality the RQ values are not well calibrated, and above ~0.75 - 0.8 they do not correlate well with actual accuracy. I generally filter at 0.75 or 0.8 RQ using bash5tools.py, then forget about the RQ values. For raw reads, the base quality values in the fastq are also generally of little use without some recalibration. Note that the fastq values in CCS data are a lot more meaningful.
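
For the first suggestion (estimating RQ from the fastq base qualities), here is a minimal sketch in plain Python. It is not PacBio's RQ calculation and does not use pbcore. It assumes Phred+33 qualities, four-line fastq records, and that "RQ" is approximated as 1 minus the mean per-base error probability. The function names (estimate_rq, iter_fastq, annotate_fastq) and the " RQ=" header suffix are made up for illustration, and the caveat above about uncalibrated raw-read qualities still applies.

```python
import io


def estimate_rq(quality, offset=33):
    """Approximate read accuracy as 1 - mean per-base error probability.

    Assumption: qualities are Phred-encoded with the given ASCII offset.
    This is an estimate, not the RQ value stored in the bas.h5 file.
    """
    if not quality:
        return 0.0
    error_probs = [10 ** (-(ord(ch) - offset) / 10) for ch in quality]
    return 1.0 - sum(error_probs) / len(error_probs)


def iter_fastq(handle):
    """Yield (name, sequence, quality) from a four-line-per-record fastq."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break
        seq = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header[1:], seq, qual


def annotate_fastq(in_handle, out_handle, min_rq=0.0):
    """Write records with an estimated RQ= tag, dropping reads below min_rq."""
    for name, seq, qual in iter_fastq(in_handle):
        rq = estimate_rq(qual)
        if rq >= min_rq:
            out_handle.write(f"@{name} RQ={rq:.3f}\n{seq}\n+\n{qual}\n")


if __name__ == "__main__":
    # Tiny in-memory example; real use would open the fastq files instead.
    demo = io.StringIO("@read1\nACGT\n+\n5555\n@read2\nAC\n+\n++\n")
    out = io.StringIO()
    annotate_fastq(demo, out, min_rq=0.8)
    print(out.getvalue(), end="")
```

With min_rq set to 0.75 or 0.8 this mimics the kind of filtering described above, but on the estimated value rather than the true RQ.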
Old 01-29-2016, 02:18 AM   #6
cklopp
Member
 
Location: Toulouse France

Join Date: Sep 2009
Posts: 12

Thank you for your reply. When I compare the alignment quality with the quality values found in the fasta or fastq file, they seem to correlate quite nicely.


Last edited by cklopp; 01-29-2016 at 05:17 AM.
Tags
fasta, fastq, pacbio, read quality

All times are GMT -8.

