SEQanswers

Old 11-16-2011, 08:25 AM   #1
desmo
Member
 
Location: Pavia

Join Date: Nov 2011
Posts: 25
Short-read quality control software

Hi everyone,
I would like to know which software tools are most useful for quality control of short reads generated on different platforms.
I've just used FastQC, but it's only a visual tool.
Does anyone know of similar software that performs a similar analysis, provides some information about bad reads, and filters them?
Thanks
Old 11-16-2011, 08:56 AM   #2
kga1978
Senior Member
 
Location: Boston, MA

Join Date: Nov 2010
Posts: 100

I have recently started using a combination of cutadapt and prinseq that seems to be working pretty well. Here's an example command that will trim all TruSeq adapters and low-quality bases:

gunzip -c *.gz \
  | cutadapt -O 6 2> reads.cutadap_log.txt \
      -a AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT \
      -a GATCGGAAGAGCACACGTCTGAACTCCAGTCACATCACGATCTCGTATGCCGTCTTCTGCTTG \
      -a GATCGGAAGAGCACACGTCTGAACTCCAGTCACCGATGTATCTCGTATGCCGTCTTCTGCTTG \
      -a GATCGGAAGAGCACACGTCTGAACTCCAGTCACTTAGGCATCTCGTATGCCGTCTTCTGCTTG \
      -a GATCGGAAGAGCACACGTCTGAACTCCAGTCACTGACCAATCTCGTATGCCGTCTTCTGCTTG \
      -a GATCGGAAGAGCACACGTCTGAACTCCAGTCACACAGTGATCTCGTATGCCGTCTTCTGCTTG \
      -a GATCGGAAGAGCACACGTCTGAACTCCAGTCACGCCAATATCTCGTATGCCGTCTTCTGCTTG \
      -a GATCGGAAGAGCACACGTCTGAACTCCAGTCACCAGATCATCTCGTATGCCGTCTTCTGCTTG \
      -a GATCGGAAGAGCACACGTCTGAACTCCAGTCACACTTGAATCTCGTATGCCGTCTTCTGCTTG \
      -a GATCGGAAGAGCACACGTCTGAACTCCAGTCACGATCAGATCTCGTATGCCGTCTTCTGCTTG \
      -a GATCGGAAGAGCACACGTCTGAACTCCAGTCACTAGCTTATCTCGTATGCCGTCTTCTGCTTG \
      -a GATCGGAAGAGCACACGTCTGAACTCCAGTCACGGCTACATCTCGTATGCCGTCTTCTGCTTG \
      -a GATCGGAAGAGCACACGTCTGAACTCCAGTCACCTTGTAATCTCGTATGCCGTCTTCTGCTTG \
      -a CTTCACCGTGCCAGACTAGAGTCAAGCTCAACAGGGTCTTCTTTCCCCGCTG \
      -a GGATGAACGAGATTCCCACTGTCCCTACCTACTATCCAGCGAAACCACAGCC \
      -a CTCCCTTTCGATCGGCCGAGGGCAACGGAGGCCATCGCCCGTCCCTTCGGAA \
      -a CGAGATTCCCACTGTCCCTACCTACTATCCAGCGAAACCACAGCCAAGGGAA \
      -a CCACTCTCGACTGCCGGCGACGGCCGGGTATGGGCCCGACGCTCCAGCGCCA \
      -a TGGAAGTCGGAATCCGCTAAGGAGTGTGTAACAACTCACCTGCCGAATCAAC \
      -a CCTATACCCAGGTCGGACGACCGATTTGCACGTCAGGACCGCTACGGACCTC \
      -a CACGAGCGCACGTGTTAGGACCCGAAAGATGGTGAACTATGCCTGGGCAGGG \
      -a GTCGGAATCCGCTAAGGAGTGTGTAACAACTCACCTGCCGAATCAACTAGCC \
      -a CTCCCGTCCACTCTCGACTGCCGGCGACGGCCGGGTATGGGCCCGACGCTCC \
      -a CGCAGGTTCAGACATTTGGTGTATGTGCTTGGCTGAGGAGCCAATGGGGCGA \
      -a GAACGAGATTCCCACTGTCCCTACCTACTATCCAGCGAAACCACAGCCAAGG \
      -a CAGAAGGGCAAAAGCTCGCTTGATCTTGATTTTCAGTACGAATACAGACCGT \
      -a TTTCGATCGGCCGAGGGCAACGGAGGCCATCGCCCGTCCCTTCGGAACGGCG \
      - \
  | prinseq -fastq stdin -out_good stdout -log reads.prinseq_log.txt \
      -min_len 20 -ns_max_n 4 -min_gc 10 -max_gc 90 -min_qual_mean 18 \
      -trim_qual_left 10 -trim_qual_right 10 \
  | gzip -9 > reads.trimmed.fastq.gz
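
With this many adapters, it may be easier to keep the sequences in a text file and generate the `-a` options from it. The following is only a sketch of that idea: the helper name `adapter_args` and the file name `adapters.txt` are made up for illustration and are not part of cutadapt. It assumes one adapter sequence per line, with blank lines and `#` comment lines ignored.

```shell
# adapter_args FILE
# Hypothetical helper: print "-a SEQ" pairs for every adapter listed in FILE
# (one sequence per line; blank lines and lines starting with '#' are skipped).
adapter_args() {
    # The "|| [ -n ... ]" part also handles a last line without a newline.
    while IFS= read -r seq || [ -n "$seq" ]; do
        case "$seq" in
            ''|'#'*) continue ;;
        esac
        printf '%s %s ' -a "$seq"
    done < "$1"
}

# Possible usage (commented out; adapters.txt is an assumed file name).
# The command substitution is left unquoted on purpose so that the shell
# splits it into separate arguments; this is safe only because adapter
# sequences contain no spaces or glob characters.
#
#   gunzip -c *.gz \
#     | cutadapt -O 6 $(adapter_args adapters.txt) - 2> reads.cutadapt_log.txt \
#     | gzip -9 > reads.adapter_trimmed.fastq.gz
```

Keeping the list in a file also makes it easier to record which adapter set was used for a given run.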
Old 11-17-2011, 12:42 AM   #3
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871

Quote:
Originally Posted by desmo
Hi everyone,
I would like to know which software tools are most useful for quality control of short reads generated on different platforms.
I've just used FastQC, but it's only a visual tool.
Does anyone know of similar software that performs a similar analysis, provides some information about bad reads, and filters them?
Thanks
I would be wary about automatically filtering anything that looks odd in a dataset. Some filtering (quality and adapter trimming) is uncontentious, and programs like cutadapt seem to do a good enough job for that. Filtering on other measures within your library, without actually looking at the data and putting it in the context of the experiment, runs the risk of removing interesting biological effects.

We've always said that the tests in FastQC are not intended to tell you whether your data is good or bad; they're there to tell you which aspects of your data you need to consider more closely, or bear in mind when interpreting the results of downstream analysis. We can provide examples of perfectly good sequence data which can fail any of the tests FastQC does.
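
One way to look at the data before filtering on such measures is a dry run that only counts how many reads each threshold would flag, without removing anything. The sketch below uses plain awk and is not how prinseq or FastQC work internally; the function name `fastq_filter_report` is made up, the default thresholds are simply borrowed from the prinseq command earlier in the thread, and it assumes uncompressed 4-line FASTQ records (no wrapped sequence lines) on standard input.

```shell
# fastq_filter_report: read 4-line FASTQ on stdin and report how many reads
# WOULD fail each threshold. Nothing is filtered or written out.
# Hypothetical helper; thresholds mirror -min_len 20 -ns_max_n 4
# -min_gc 10 -max_gc 90 from the example command above.
fastq_filter_report() {
    awk -v min_len=20 -v max_n=4 -v min_gc=10 -v max_gc=90 '
        NR % 4 == 2 {                      # sequence line of each record
            total++
            seq = toupper($0)
            len = length(seq)
            n  = gsub(/N/, "N", seq)       # gsub returns the number of matches
            gc = gsub(/[GC]/, "&", seq)
            if (len < min_len) too_short++
            if (n > max_n)     many_n++
            if (len > 0) {
                pct = 100 * gc / len
                if (pct < min_gc) low_gc++
                if (pct > max_gc) high_gc++
            }
        }
        END {
            printf "reads\t%d\n",         total
            printf "too_short\t%d\n",     too_short
            printf "too_many_N\t%d\n",    many_n
            printf "gc_below_min\t%d\n",  low_gc
            printf "gc_above_max\t%d\n",  high_gc
        }'
}

# Possible usage (commented out; the file name is an assumption):
#   gunzip -c reads.fastq.gz | fastq_filter_report
```

If one category turns out to flag a large share of the reads, that is a prompt to check whether it reflects the biology or the library type before deciding to filter on it.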
Old 11-25-2011, 01:06 AM   #4
desmo
Member
 
Location: Pavia

Join Date: Nov 2011
Posts: 25

Thanks. I haven't yet decided which strategy is better for my work, but your suggestions were very useful.
All times are GMT -8.