SEQanswers

Old 05-06-2010, 07:24 AM   #21
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 869
Default

Quote:
Originally Posted by NGSfan View Post
1) read in gzipped FASTQ files
2) be able to reload reports to view again
I think 1 should be fairly easy to manage - I'll look into that for the next version.

I can't see why you'd want to do 2 though. If you save a report you get an HTML version which you can open immediately and which shows exactly the same information as you had in the interactive report, or am I missing something?

Simon.
Old 05-06-2010, 07:37 AM   #22
NGSfan
Senior Member
 
Location: Austria

Join Date: Apr 2009
Posts: 181
Default

Quote:
Originally Posted by simonandrews View Post
I think 1 should be fairly easy to manage - I'll look into that for the next version.

I can't see why you'd want to do 2 though. If you save a report you get an HTML version which you can open immediately and which shows exactly the same information as you had in the interactive report, or am I missing something?

Simon.
You know what, you're right! scratch that then - I actually never got around to opening the gzipped reports, I just assumed they were image dumps, but HTML is even better.

Thanks Simon!
Old 05-06-2010, 09:45 AM   #23
RockChalkJayhawk
Senior Member
 
Location: Rochester, MN

Join Date: Mar 2009
Posts: 191
Default

Quote:
Originally Posted by simonandrews View Post
I can't see why you'd want to do 2 though. If you save a report you get an HTML version which you can open immediately and which shows exactly the same information as you had in the interactive report, or am I missing something?

Simon.
There is only 1 small difference between the HTML and the actual analysis. The HTML form doesn't say what type of quality values were used but they show up when you run the program.
Old 05-06-2010, 11:20 AM   #24
martian_bob
Member
 
Location: New York

Join Date: Feb 2010
Posts: 11
Default

This is an excellent tool, wish I'd had it months ago!
Any chance you could add functionality for colorspace-based fastq files, produced by the solid2fastq script in BWA? Right now the per base sequence content, per base GC content, per sequence GC content, and per base N content tests aren't working for my reads.
Old 05-06-2010, 11:43 PM   #25
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 869
Default

Quote:
Originally Posted by RockChalkJayhawk View Post
There is only 1 small difference between the HTML and the actual analysis. The HTML form doesn't say what type of quality values were used but they show up when you run the program.
Sorry - that's a bug which I've just fixed in the development version. It will work the same in the HTML and interactive versions in the next release.
Old 05-06-2010, 11:46 PM   #26
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 869
Default

Quote:
Originally Posted by martian_bob View Post
This is an excellent tool, wish I'd had it months ago!
Any chance you could add functionality for colorspace-based fastq files, produced by the solid2fastq script in BWA? Right now the per base sequence content, per base GC content, per sequence GC content, and per base N content tests aren't working for my reads.
Does the summary say that the file was recognised as colorspace or is it misreading it as conventional base calls?

Any chance you could let me have an example file which isn't working? We don't use SOLIDs here so I only have a small set of examples. Contact me off list ([email protected]) if you can help - I should only need a small fragment of the file to be able to figure out why it's not working.
Old 05-13-2010, 04:13 AM   #27
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 869
Default FastQC v0.3 released

I've just put FastQC v0.3 up on our website. This should fix the problems with unrecognised colorspace files and adds back the type of quality score identified into the HTML report files.

There are a couple of new features. The main one is a system for identifying the source of any overrepresented sequences. The program now has a list of all of the primers and adapters commonly in use on sequencing platforms and will scan any overrepresented sequences to see if they match against these. If anyone has any other common sources of contamination they know of they can either add them to their local installation, or preferably pass them back to me so I can add them in to future versions of FastQC.
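
As a rough illustration of the matching idea described above (not FastQC's actual implementation, which the post does not detail), an overrepresented sequence could be compared against a local list of known primers and adapters. The file layout here (name, a tab, then the sequence) and the helper name `find_source` are assumptions made for this sketch only:

```shell
#!/bin/sh
# Sketch only: NOT FastQC's matching logic.
# Assumes a hypothetical two-column, tab-separated contaminant file:
#   <name><TAB><sequence>

# find_source CONTAMINANT_FILE SEQUENCE
# Prints the name of every entry whose sequence contains, or is contained
# in, SEQUENCE. Comment lines (starting with '#') and incomplete lines
# are skipped.
find_source() {
    awk -F '\t' -v query="$2" '
        BEGIN { if (query == "") exit }
        /^#/ || NF < 2 || $2 == "" { next }
        index(query, $2) > 0 || index($2, query) > 0 { print $1 }
    ' "$1"
}

# Example call (file name is hypothetical):
# find_source my_contaminants.txt "$overrepresented_sequence"
```

Plain substring matching like this misses sequences that differ by a single base, so it is only a first-pass check.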

Another change is a new parameter which can be passed to the non-interactive version of the program to specify a non-default output directory for reports. This should help people who only have read-only access to the original sequence files, but still want to generate reports automatically.

Finally the program now supports the processing of gzip compressed fastq files, since some sites apparently compress their data for long term storage.

You can get the new version from:

http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/

[If you don't see the new version of any page hit control+refresh to force our cache to update]
Old 05-26-2010, 04:21 PM   #28
mard
Member
 
Location: Melbourne

Join Date: Jan 2010
Posts: 21
Default

I've been using FastQC in interactive mode and it's a really great tool.

I just tried to install and run FastQC v0.3 in non-interactive mode but got the error below (I'm running it on Mac OS X 10.6).

$ java -Xmx250m -classpath /Tools/FastQC/ uk.ac.bbsrc.babraham.FastQC.FastQCApplication sequence.fastq
Processing sequence.fastq
Approx 5% complete for sequence.fastq
Approx 10% complete for sequence.fastq
Approx 15% complete for sequence.fastq
Approx 20% complete for sequence.fastq
Approx 25% complete for sequence.fastq
Approx 30% complete for sequence.fastq
Approx 35% complete for sequence.fastq
Approx 40% complete for sequence.fastq
Approx 45% complete for sequence.fastq
Approx 50% complete for sequence.fastq
Approx 55% complete for sequence.fastq
Approx 60% complete for sequence.fastq
Approx 65% complete for sequence.fastq
Approx 70% complete for sequence.fastq
Approx 75% complete for sequence.fastq
Approx 80% complete for sequence.fastq
Approx 85% complete for sequence.fastq
Approx 90% complete for sequence.fastq
Approx 95% complete for sequence.fastq
Approx 100% complete for sequence.fastq
Failed to process sequence.fastq :/Tools/FastQC/Templates/.svn
Old 05-27-2010, 12:07 AM   #29
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 869
Default

Quote:
Originally Posted by mard View Post
I just tried to install and run FastQC v0.3 in non-interactive mode but got the error below (I'm running it on Mac OS X 10.6).

Failed to process sequence.fastq :/Tools/FastQC/Templates/.svn
It's a bug in the templating system. There's a check which should stop anything other than image files getting added to the template, but looking again I see that it's not being applied correctly.

I'll put out an update which fixes this, but the temporary work round is to go into the Templates directory and do:

rm -r .svn

..which should fix the problem in the current version.
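
For anyone who would rather not type the `rm` by hand, here is a slightly more defensive sketch of the same workaround. The helper name `clean_templates` is made up for this example, and the path in the usage comment is simply the one from the error message above; substitute your own install location:

```shell
#!/bin/sh
# Sketch: delete stray Subversion metadata from a FastQC Templates directory.

# clean_templates TEMPLATES_DIR
clean_templates() {
    dir=$1
    [ -d "$dir" ] || { echo "No such directory: $dir" >&2; return 1; }
    # -prune stops find from descending into a directory that is about
    # to be removed; only directories named exactly ".svn" are deleted.
    find "$dir" -type d -name .svn -prune -exec rm -r {} +
}

# Example (path taken from the error message above):
# clean_templates /Tools/FastQC/Templates
```

Unlike a bare `rm -r .svn`, this does nothing (and reports why) if the directory argument is wrong, and it leaves every other file in the templates directory alone.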

Sorry about that.

Simon.
Old 05-27-2010, 01:37 AM   #30
Bruins
Member
 
Location: Groningen

Join Date: Feb 2010
Posts: 78
Default

Dear Simon,

Thank you for the great software! I find the report to be very clear and informative. The docs on how to interpret the report are clear and very helpful. I will probably incorporate FastQC in my pipeline.

I had FastQC test one of my raw reads files. The file is approx. 4.5 GB (FastQC says 21107088 sequences) and all reads are 76 bases. It took FastQC about 5 minutes to process the file (I like the progress report!). Fast enough for me!
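
A quick way to cross-check the sequence count FastQC reports is to count lines and divide by four. This is only a sketch: `count_reads` is a hypothetical helper, and it assumes every record is exactly four lines (no wrapped sequence or quality lines):

```shell
#!/bin/sh
# Sketch: count reads in a FASTQ file, plain or gzip-compressed.
# Assumes exactly four lines per record.

# count_reads FILE
count_reads() {
    case $1 in
        *.gz) lines=$(gzip -dc < "$1" | wc -l) ;;
        *)    lines=$(wc -l < "$1") ;;
    esac
    # $lines is expanded textually so any padding from wc is harmless.
    echo $(( $lines / 4 ))
}

# Example: count_reads s_1_1_sequence.txt
```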

One problem did arise:
Code:
Failed to process -Dfastqc.output_dir=./fastqc/ :fastqc doesn't exist
I am pretty sure the dir does exist... it has permission drwxrwxr-x (and I checked for typos :P).

Thanks for the hard work,

Wil

Last edited by Bruins; 05-27-2010 at 01:53 AM.
Old 05-27-2010, 01:51 AM   #31
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 869
Default

Quote:
Originally Posted by Bruins View Post
One problem did arise:
Code:
Failed to process -Dfastqc.output_dir=./fastqc/ :fastqc doesn't exist
I think you're constructing your command incorrectly. The error says that it's treating the -Dfastqc.output_dir... as a file to process rather than a command line argument.

The structure of the command needs to be:

java [options] [class] [files]

..and I suspect you're doing:

java [class] [options] [files]

So you need something like:

Code:
java -Xmx1024m -cp /usr/local/FastQC:$CLASSPATH -Dfastqc.output_dir=/output/dir uk.ac.bbsrc.babraham.FastQC.FastQCApplication file1.fastq file2.fastq
Hope this helps
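
To make the ordering hard to get wrong, the command can be wrapped in a small shell function that always puts the JVM options in front of the class name. This is a sketch under assumptions: `fastqc_to_dir`, `FASTQC_HOME` and `JAVA_BIN` are names invented for the example, while the `-Dfastqc.output_dir` property and the class name are the ones used in the command above. Creating the output directory first is a precaution; the thread does not say whether FastQC creates it.

```shell
#!/bin/sh
# Sketch: keep JVM options in front of the class name.
# FASTQC_HOME and JAVA_BIN are hypothetical variables for this example.
FASTQC_HOME=${FASTQC_HOME:-/usr/local/FastQC}
JAVA_BIN=${JAVA_BIN:-java}

# fastqc_to_dir OUTPUT_DIR FILE...
fastqc_to_dir() {
    out_dir=$1
    shift
    [ "$#" -gt 0 ] || { echo "usage: fastqc_to_dir OUTPUT_DIR FILE..." >&2; return 2; }
    # Precaution (assumption): make sure the output directory exists.
    mkdir -p "$out_dir" || return 1
    # Order matters: java [options] [class] [files].
    # An existing CLASSPATH is appended only if it is non-empty.
    "$JAVA_BIN" -Xmx1024m -cp "$FASTQC_HOME${CLASSPATH:+:$CLASSPATH}" \
        "-Dfastqc.output_dir=$out_dir" \
        uk.ac.bbsrc.babraham.FastQC.FastQCApplication "$@"
}

# Example: fastqc_to_dir ./fastqc s_1_1_sequence.txt
```

Anything placed after the class name is handed to the program as a file to process, which is exactly what the "Failed to process -Dfastqc.output_dir=..." message shows.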
Old 05-27-2010, 01:55 AM   #32
Bruins
Member
 
Location: Groningen

Join Date: Feb 2010
Posts: 78
Default

Hi,

Yes, that helps. Silly of me, I did:
Code:
java -Xmx250m -cp ~/bin/FastQC:$CLASSPATH uk.ac.bbsrc.babraham.FastQC.FastQCApplication s_1_1_sequence.txt -Dfastqc.output_dir=fastqc/
that obviously doesn't work...

Wil
Old 05-27-2010, 05:50 PM   #33
mard
Member
 
Location: Melbourne

Join Date: Jan 2010
Posts: 21
Default

Quote:
Originally Posted by simonandrews View Post
It's a bug in the templating system. There's a check which should stop anything other than image files getting added to the template, but looking again I see that it's not being applied correctly.

I'll put out an update which fixes this, but the temporary work round is to go into the Templates directory and do:

rm -r .svn

..which should fix the problem in the current version.

Sorry about that.

Simon.
That fixed it. Thanks for that!
Old 05-28-2010, 03:11 AM   #34
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 869
Default FastQC v0.3.1

I've just put FastQC v0.3.1 up on our website. This fixes the problem with invalid template files crashing the program. It also fixes a bug with the reporting of progress in the non-interactive version of the program and adds in some missing documentation.

You can get the new version from:

http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/

[If you don't see the new version of any page hit control+refresh to force our cache to update]
Old 05-28-2010, 06:31 AM   #35
NGSfan
Senior Member
 
Location: Austria

Join Date: Apr 2009
Posts: 181
Default

This may sound whiny, but does anyone have any tips on how to make an alias or some sort of command-line shortcut to call FastQC? Some of us have carpal tunnel.

for example convert this:

java -Xmx250m -classpath /Tools/FastQC/ uk.ac.bbsrc.babraham.FastQC.FastQCApplication

to something simple like:

fastqc
Old 05-28-2010, 07:14 AM   #36
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 869
Default

Try saving the following into the FastQC install directory under the name 'fastqc':
Code:
#!/usr/bin/perl
use warnings;
use strict;
use FindBin qw($Bin);


if ($ENV{CLASSPATH}) {
        $ENV{CLASSPATH} .= $Bin;
}
else {
        $ENV{CLASSPATH} = $Bin;
}

exec "java", "-Xmx250m", "uk.ac.bbsrc.babraham.FastQC.FastQCApplication", @ARGV;
It's a bit limited in that you can't set any extra java options, but once you make this executable then you should be able to execute this file directly to launch the program.
Old 05-28-2010, 07:23 AM   #37
Bruins
Member
 
Location: Groningen

Join Date: Feb 2010
Posts: 78
Default

For a moment there I tricked myself into recommending adding an alias to your bashrc. It's a handy solution for commands you execute regularly. Open ~/.bashrc (mind the dot!) or create the file. I'm not sure whether this is required but my .bashrc starts with:
Code:
if [ -f /etc/bashrc ]; then
        . /etc/bashrc
fi
This is to source the global definitions.
Now add your own aliases at the bottom, like so:
Code:
alias la='ls -la'
alias ni='ssh -Y [email protected]'
etcetera.

But you have to supply additional arguments which are different each time you call FastQC. So you could create a small shell script to do the work. Let me try something, I'll get back here in a moment.

<snip>
Old 05-28-2010, 07:30 AM   #38
NGSfan
Senior Member
 
Location: Austria

Join Date: Apr 2009
Posts: 181
Default

Quote:
Originally Posted by simonandrews View Post
Try saving the following into the FastQC install directory under the name 'fastqc':
Code:
#!/usr/bin/perl
use warnings;
use strict;
use FindBin qw($Bin);


if ($ENV{CLASSPATH}) {
        $ENV{CLASSPATH} .= $Bin;
}
else {
        $ENV{CLASSPATH} = $Bin;
}

exec "java", "-Xmx250m", "uk.ac.bbsrc.babraham.FastQC.FastQCApplication", @ARGV;
It's a bit limited in that you can't set any extra java options, but once you make this executable then you should be able to execute this file directly to launch the program.
This perl script worked for me! very nice indeed! thank you very much!
Old 05-28-2010, 07:32 AM   #39
NGSfan
Senior Member
 
Location: Austria

Join Date: Apr 2009
Posts: 181
Default

Quote:
Originally Posted by Bruins View Post
For a moment there I tricked myself into recommending adding an alias to your bashrc. It's a handy solution for commands you execute regularly. Open ~/.bashrc (mind the dot!) or create the file. I'm not sure whether this is required but my .bashrc starts with:
Code:
if [ -f /etc/bashrc ]; then
        . /etc/bashrc
fi
This is to source the global definitions.
Now add your own aliases at the bottom, like so:
Code:
alias la='ls -la'
alias ni='ssh -Y [email protected]'
etcetera.

But you have to supply additional arguments which are different each time you call FastQC. So you could create a small shell script to do the work. Let me try something, I'll get back here in a moment.

<snip>
Yes, for passing on additional arguments, sticking to the Perl script is perhaps the solution. I wouldn't bother too much with the alias approach - the Perl script does the job.

Very cool way of doing it I must say.
Old 05-28-2010, 07:41 AM   #40
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 869
Default

Quote:
Originally Posted by NGSfan View Post
This perl script worked for me! very nice indeed! thank you very much!
But it also had a bug in it :-)

This version should work on all systems (if they have perl installed), and will let you set both java arguments and pass in files as arguments. I may add it to the next release.

Code:
#!/usr/bin/perl
use warnings;
use strict;
use FindBin qw($Bin);


if ($ENV{CLASSPATH}) {
	$ENV{CLASSPATH} .= ":$Bin";
}
else {
	$ENV{CLASSPATH} = $Bin;
}

my @java_args = '-Xmx250m';
my @files;

foreach (@ARGV) {
  if (/^\-/) {
    push @java_args,$_;
  }
  else {
    push @files,$_;
  }
}


exec "java",@java_args, "uk.ac.bbsrc.babraham.FastQC.FastQCApplication", @files;
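
For systems without Perl, the same idea can be sketched as a POSIX shell function: dash-prefixed arguments go before the class name as Java options, everything else goes after it as files. This is a port written for illustration, not part of FastQC; `FASTQC_HOME` and `JAVA_BIN` are hypothetical variables standing in for the install directory (which the Perl version finds via `FindBin`) and the Java launcher:

```shell
#!/bin/sh
# Sketch: a POSIX sh take on the Perl wrapper above.
# FASTQC_HOME and JAVA_BIN are hypothetical variables for this example.
FASTQC_HOME=${FASTQC_HOME:-/usr/local/FastQC}
JAVA_BIN=${JAVA_BIN:-java}

fastqc() {
    count=$#
    # Pass 1: append every dash-prefixed argument (Java option) to the list.
    # The word list of a for loop is expanded once, so modifying "$@"
    # inside the loop body is safe.
    for arg in "$@"; do
        case $arg in -*) set -- "$@" "$arg" ;; esac
    done
    # The class name goes after the options and before the files.
    set -- "$@" uk.ac.bbsrc.babraham.FastQC.FastQCApplication
    # Pass 2: append the non-option arguments (files), looking only at
    # the first $count entries, i.e. the arguments as originally given.
    i=0
    for arg in "$@"; do
        i=$((i + 1))
        [ "$i" -le "$count" ] || break
        case $arg in -*) ;; *) set -- "$@" "$arg" ;; esac
    done
    # Drop the original arguments, leaving: options, class, files.
    shift "$count"
    CLASSPATH="$FASTQC_HOME${CLASSPATH:+:$CLASSPATH}" \
        "$JAVA_BIN" -Xmx250m "$@"
}

# Example: fastqc -Dfastqc.output_dir=/output/dir file1.fastq file2.fastq
```

Like the Perl version, this treats anything starting with a dash as a Java option, so an option that takes a separate value (for example `-cp some/path`) would be split incorrectly, and a file whose name starts with a dash cannot be passed.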
Tags
fastq, quality, report
