SEQanswers

Old 04-26-2010, 03:33 AM   #1
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 869
FastQC: A quality control application for FastQ data

I have just put up on our website the first release of an application we have developed to perform QC checks on high throughput sequence data.

FastQC runs a series of tests and will flag up any potential problems with your data.

The program can either be run as an interactive GUI application or it can run in an unattended offline mode where it generates HTML versions of its reports.

We've been using this on some of our data for a few weeks and have found it really useful for looking at aspects of your data which the standard instrument QC checks may miss.

FastQC is free software under the GPLv3. You can download it from:

http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/

..where there is also a sample report which you can look at.

[Please note that the rather aggressive BBSRC cache may show you old versions of some pages - if you can't see FastQC on some of our pages please press shift+refresh in your browser to force an update which bypasses the cache].

We are keen to get feedback from other sites - in particular we'd like to know:
  • Are there other tests you think would be useful
  • Are the criteria we're using to warn about potentially bad data any good (and can you suggest improvements)

I hope this proves useful to some people here.

Simon.
Old 04-26-2010, 07:31 PM   #2
shurjo
Senior Member
 
Location: Rockville, MD

Join Date: Jan 2009
Posts: 125

Hi Simon,

I would really like to use FastQC for my project but am getting the following error message when I try to run it non-interactively on our Linux cluster:

$ java -Xmx250m -cp ~/bin/fastqc/FastQC uk.ac.bbsrc.babraham.FastQC.FastQCApplication testFastQC.fastq
Exception in thread "main" java.awt.HeadlessException:
No X11 DISPLAY variable was set, but this program performed an operation which requires it.
at java.awt.GraphicsEnvironment.checkHeadless(GraphicsEnvironment.java:159)
at java.awt.Window.<init>(Window.java:431)
at java.awt.Frame.<init>(Frame.java:403)
at java.awt.Frame.<init>(Frame.java:368)
at javax.swing.JFrame.<init>(JFrame.java:158)
at uk.ac.bbsrc.babraham.FastQC.FastQCApplication.<init>(FastQCApplication.java:197)
at uk.ac.bbsrc.babraham.FastQC.FastQCApplication.main(FastQCApplication.java:63)

These are the details for our java installation:

$ java -version
java version "1.6.0_17"
Java(TM) SE Runtime Environment (build 1.6.0_17-b04)
Java HotSpot(TM) 64-Bit Server VM (build 14.3-b01, mixed mode)

Any pointers?

Thanks,

Shurjo
Old 04-26-2010, 07:45 PM   #3
ohofmann
Member
 
Location: Melbourne, Australia

Join Date: Jan 2009
Posts: 37

Your cluster setup is blocking X11 (the graphical interface), or, if it does allow X11, you are not exporting the display to your local machine.
Old 04-26-2010, 11:28 PM   #4
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 869

You might want to try setting a DISPLAY environment variable (export DISPLAY=:0.0) even if you're running on a headless system. If you specify a filename when launching FastQC then no windows should open, but because the program uses some Swing classes behind the scenes, Java might be getting itself confused.

I'll try to replicate this on one of our servers and see if I can trigger the same problem.
Old 04-27-2010, 06:39 AM   #5
lletourn
Member
 
Location: Montreal

Join Date: Oct 2009
Posts: 63

I also tried putting -Djava.awt.headless=true with the java command but it didn't work. The export DISPLAY works though.

Last edited by lletourn; 04-27-2010 at 06:47 AM.
Old 04-27-2010, 07:02 AM   #6
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 869

OK, it turns out there are two problems here.

One, as lletourn pointed out, is that if you're running on a headless server you need to add -Djava.awt.headless=true

There is also a change I need to make internally to FastQC to stop it trying to set up a graphical window (which it never displays) if it's running non-interactively.

I'll try to get an update out tomorrow with a fix for the internal problem and better instructions.
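
For anyone curious what the internal change might look like, here is a minimal Java sketch of the general idea. It is not FastQC's actual code: the class HeadlessGuard and the method shouldOpenGui are hypothetical names made up for illustration. The assumption is simply that a window should only be built when no files were named on the command line and the JVM reports that a display is available.

```java
import java.awt.GraphicsEnvironment;
import javax.swing.JFrame;

// Hypothetical illustration only; not taken from the FastQC source.
public class HeadlessGuard {

    /**
     * Open a window only for interactive use: no input files were named
     * on the command line and the JVM is not running headless.
     */
    public static boolean shouldOpenGui(String[] args, boolean headless) {
        return args.length == 0 && !headless;
    }

    public static void main(String[] args) {
        // True when -Djava.awt.headless=true is set; some JVMs also
        // infer it when they cannot find a display.
        boolean headless = GraphicsEnvironment.isHeadless();

        if (shouldOpenGui(args, headless)) {
            // Only this branch constructs a Swing window.
            JFrame frame = new JFrame("QC viewer (hypothetical)");
            frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
            frame.setSize(400, 300);
            frame.setVisible(true);
        } else if (args.length == 0) {
            System.err.println("No display available and no input files given.");
        } else {
            for (String file : args) {
                System.out.println("Would process " + file + " without opening a window");
            }
        }
    }
}
```

Run with a file name (for example java HeadlessGuard testFastQC.fastq) and the sketch never touches a window class, which is the behaviour you want on a cluster node.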
Old 04-27-2010, 12:43 PM   #7
cadlag
Junior Member
 
Location: New Haven, CT

Join Date: Apr 2010
Posts: 1

the error message


C:\download\FastQC>java -Xmx250m -classpath . uk.ac.bbsrc.babraham.FastQC.FastQCApplication

Exception in thread "Thread-5" java.lang.IllegalArgumentException: No knonwn encodings with chars < 33
at uk.ac.bbsrc.babraham.FastQC.Sequence.PhredEncoding.getFastQEncodingOffset(PhredEncoding.java:30)
at uk.ac.bbsrc.babraham.FastQC.Modules.PerBaseQualityScores.getPercentages(PerBaseQualityScores.java:65)
at uk.ac.bbsrc.babraham.FastQC.Modules.PerBaseQualityScores.getResultsPanel(PerBaseQualityScores.java:56)
at uk.ac.bbsrc.babraham.FastQC.Results.ResultsPanel.analysisComplete(ResultsPanel.java:117)
at uk.ac.bbsrc.babraham.FastQC.Analysis.AnalysisRunner.run(AnalysisRunner.java:84)
at java.lang.Thread.run(Unknown Source)
Old 04-27-2010, 04:11 PM   #8
shurjo
Senior Member
 
Location: Rockville, MD

Join Date: Jan 2009
Posts: 125

Quote:
Originally Posted by simonandrews View Post
OK, it turns out there are two problems here.

One, as lletourn pointed out, is that if you're running on a headless server you need to add -Djava.awt.headless=true

There is also a change I need to make internally to FastQC to stop it trying to set up a graphical window (which it never displays) if it's running non-interactively.

I'll try to get an update out tomorrow with a fix for the internal problem and better instructions.
Thanks! I will wait eagerly.
Old 04-28-2010, 12:57 AM   #9
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 869

Quote:
Originally Posted by cadlag View Post
the error message
java.lang.IllegalArgumentException: No knonwn encodings with chars < 33
That's interesting. What is the source of the FastQ file which failed? According to Wikipedia (so it must be true), there aren't any quality encoding variants which use characters lower than 33.

If you'd be happy to let me have a copy of the FastQ file which is failing I'll take a look - contact me off list ([email protected]). If not I'll add some more debugging to the next release so it will still fail but might give more of a clue as to the parameters it's seeing.
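
To make the "chars < 33" check concrete, here is a simplified Java sketch of how an encoding offset could be guessed from the lowest quality character in a file. It is not the code in PhredEncoding.java; the class and method names are hypothetical. It relies only on the commonly documented offsets: Sanger-style files use ASCII offset 33, and Solexa/Illumina 1.3+ files use offset 64, where Solexa scores can go down to -5 (character code 59). Anything below 33, such as a space (32) or a carriage return (13), cannot belong to any of these encodings.

```java
// Hypothetical illustration only; not taken from the FastQC source.
public class PhredOffsetGuess {

    /** Return the lowest character seen in a quality string. */
    public static char lowestChar(String quality) {
        if (quality.isEmpty()) {
            throw new IllegalArgumentException("Empty quality string");
        }
        char lowest = Character.MAX_VALUE;
        for (int i = 0; i < quality.length(); i++) {
            if (quality.charAt(i) < lowest) {
                lowest = quality.charAt(i);
            }
        }
        return lowest;
    }

    /**
     * Guess the ASCII offset from the lowest character observed.
     * Codes 33..58 can only occur with offset 33. Codes from 59 upwards
     * are ambiguous, so returning 64 there is a guess, not a certainty.
     */
    public static int guessOffset(char lowest) {
        if (lowest < 33) {
            throw new IllegalArgumentException(
                "No known encodings with chars < 33 (saw code " + (int) lowest + ")");
        }
        if (lowest < 59) {
            return 33;
        }
        return 64;
    }

    public static void main(String[] args) {
        String quality = args.length > 0 ? args[0] : "IIII#II";
        char lowest = lowestChar(quality);
        System.out.println("Lowest code " + (int) lowest + ", guessed offset " + guessOffset(lowest));
    }
}
```

Printing the offending character code along with the error, as the sketch does, is the kind of extra debugging that makes a failing file easier to diagnose.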
Old 04-28-2010, 02:32 AM   #10
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 869

FastQC v0.1.1 is now up on our website. This contains a fix for the problem with headless operation. You should just be able to run the program as described in the original install document (no need to add extra property settings) and it should now work as long as you specify the file(s) to process on the command line.

[If you can't see the update on our site please press shift+refresh in your browser to force it to update the cache]

Please let me know if this fixes things.
Old 04-28-2010, 05:07 AM   #11
Thomas Doktor
Senior Member
 
Location: University of Southern Denmark (SDU), Denmark

Join Date: Apr 2009
Posts: 105

Looks great, Simon. I'm suddenly aware of the skewed base composition at the beginning of the sequences in some of our runs; it levels off and becomes basically uniform at 25% from around base 15 onwards. The runs are otherwise fine. Has anyone seen similar results/artefacts?
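
For readers who want to check this on their own reads, here is a rough Java sketch of a per-position base composition count. It is not FastQC's module, and the class name BaseComposition is made up; it assumes the reads are already in memory as plain strings and ignores N and any other non-ACGT symbols.

```java
// Hypothetical illustration only; not taken from the FastQC source.
public class BaseComposition {

    private static final String BASES = "ACGT";

    /**
     * Returns fractions[pos][b]: the share of A/C/G/T calls at position pos
     * (0-based) that are BASES.charAt(b). Positions with no calls get 0.0.
     */
    public static double[][] perPosition(String[] reads) {
        int maxLen = 0;
        for (String read : reads) {
            maxLen = Math.max(maxLen, read.length());
        }

        int[][] counts = new int[maxLen][4];
        for (String read : reads) {
            for (int pos = 0; pos < read.length(); pos++) {
                int b = BASES.indexOf(Character.toUpperCase(read.charAt(pos)));
                if (b >= 0) {
                    counts[pos][b]++; // N and other symbols are skipped
                }
            }
        }

        double[][] fractions = new double[maxLen][4];
        for (int pos = 0; pos < maxLen; pos++) {
            int total = counts[pos][0] + counts[pos][1] + counts[pos][2] + counts[pos][3];
            for (int b = 0; b < 4; b++) {
                fractions[pos][b] = total == 0 ? 0.0 : (double) counts[pos][b] / total;
            }
        }
        return fractions;
    }

    public static void main(String[] args) {
        double[][] f = perPosition(new String[] {"ACGT", "AGGT", "ATCA"});
        for (int pos = 0; pos < f.length; pos++) {
            System.out.printf("pos %d: A=%.2f C=%.2f G=%.2f T=%.2f%n",
                pos + 1, f[pos][0], f[pos][1], f[pos][2], f[pos][3]);
        }
    }
}
```

A skew of the kind described would show up as fractions far from 0.25 in the first rows that settle towards 0.25 further along the read.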
Old 04-28-2010, 05:11 AM   #12
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 869

We've seen similarly odd biases, both in sequence composition and unusually low qualities at the start of some runs, and I know of other groups who've also seen this. Normally it's only a minor effect, but in samples which are of generally poorer quality it can be really noticeable.

I don't know of an explanation for this. If it affects qualities as well as base calls I'd guess it would be a bias in the sequencing chemistry or the cluster calling?
Old 04-28-2010, 05:17 AM   #13
Thomas Doktor
Senior Member
 
Location: University of Southern Denmark (SDU), Denmark

Join Date: Apr 2009
Posts: 105

The qualities look fine so it's not an issue of bad base calling. I think you could be right that the cluster calling and/or sequencing chemistry could explain some of it. It could perhaps also explain why certain sequences in the genome are less likely to be sequenced: we often see peaks and valleys in exons in our RNA-seq runs which are most likely explained by sequencing artefacts.
Old 04-29-2010, 08:09 AM   #14
NGSfan
Senior Member
 
Location: Austria

Join Date: Apr 2009
Posts: 181

Hi! Thanks for sharing this program - I like the idea of getting a summary look at the FASTQ files.

I tried from 3 different Linux boxes, but get the same error:


java -Xmx250m -classpath . uk.ac.bbsrc.babraham.FastQC.FastQCApplication

Exception in thread "main" java.lang.NoClassDefFoundError: uk/ac/bbsrc/babraham/FastQC/FastQCApplication
Caused by: java.lang.ClassNotFoundException: uk.ac.bbsrc.babraham.FastQC.FastQCApplication
at java.net.URLClassLoader$1.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
Could not find the main class: uk.ac.bbsrc.babraham.FastQC.FastQCApplication. Program will exit.

Is there something obvious I'm missing here? Sorry - it's been ages since I've programmed in Java.
Old 04-29-2010, 09:54 AM   #15
shurjo
Senior Member
 
Location: Rockville, MD

Join Date: Jan 2009
Posts: 125

Quote:
Originally Posted by simonandrews View Post
FastQC v0.1.1 is now up on our website. This contains a fix for the problem with headless operation. You should just be able to run the program as described in the original install document (no need to add extra property settings) and it should now work as long as you specify the file(s) to process on the command line.

[If you can't see the update on our site please press shift+refresh in your browser to force it to update the cache]

Please let me know if this fixes things.
Hi Simon,

The download site still shows FastQC v0.1 even after clearing my cache. Am I missing something here?

Thanks,

Shurjo
Old 04-29-2010, 11:32 PM   #16
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 869

Quote:
Originally Posted by shurjo View Post
The download site still shows FastQC v0.1 even after clearing my cache. Am I missing something here?
Sorry, the download page was copied over to the wrong folder. Try it again now and the links should be updated.
Old 04-29-2010, 11:36 PM   #17
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 869

Quote:
Originally Posted by NGSfan View Post
Hi! Thanks for sharing this program - I like the idea of getting a summary look at the FASTQ files.

I tried from 3 different Linux boxes, but get the same error:


java -Xmx250m -classpath . uk.ac.bbsrc.babraham.FastQC.FastQCApplication

Exception in thread "main" java.lang.NoClassDefFoundError: uk/ac/bbsrc/babraham/FastQC/FastQCApplication
If you're running the program from outside the install directory you need to set the classpath to the directory which contains the FastQC installation, eg:

java -Xmx250m -classpath /usr/local/FastQC uk.ac.bbsrc.babraham.FastQC.FastQCApplication

If you have a non-standard classpath on your machine (which you probably won't by default) you may need to append your existing classpath:

java -Xmx250m -classpath /usr/local/FastQC:$CLASSPATH uk.ac.bbsrc.babraham.FastQC.FastQCApplication

Hopefully that should get you up and running.
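
If you want to check what the JVM is actually searching, a small diagnostic along these lines can help. This is only a sketch with a made-up class name (ClasspathCheck); it prints the effective classpath and reports whether a named class can be located, using nothing beyond standard JDK calls.

```java
// Hypothetical diagnostic helper; not part of FastQC.
public class ClasspathCheck {

    /** Return true if the named class can be located by this class's loader. */
    public static boolean isOnClasspath(String className) {
        try {
            // initialize=false: locate the class without running its static initialisers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String name = args.length > 0
            ? args[0]
            : "uk.ac.bbsrc.babraham.FastQC.FastQCApplication";
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
        System.out.println(name + (isOnClasspath(name) ? " was found" : " was NOT found"));
    }
}
```

Compile it, then run it with the same -classpath value you pass to FastQC plus the directory holding ClasspathCheck.class, e.g. java -classpath /usr/local/FastQC:. ClasspathCheck. If the class is reported as not found, the classpath is the problem rather than FastQC itself.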
Old 04-30-2010, 04:07 AM   #18
NGSfan
Senior Member
 
Location: Austria

Join Date: Apr 2009
Posts: 181

Quote:
Originally Posted by simonandrews View Post
If you're running the program from outside the install directory you need to set the classpath to the directory which contains the FastQC installation, eg:

java -Xmx250m -classpath /usr/local/FastQC uk.ac.bbsrc.babraham.FastQC.FastQCApplication

If you have a non-standard classpath on your machine (which you probably won't by default) you may need to append your existing classpath:

java -Xmx250m -classpath /usr/local/FastQC:$CLASSPATH uk.ac.bbsrc.babraham.FastQC.FastQCApplication

Hopefully that should get you up and running.


Thank you Simon, the second command worked. I guess the classpath we have is non-standard!

FastQC works great! Very helpful tool - a great way to get a quick look at the data.
Old 05-06-2010, 05:01 AM   #19
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 869

I've just put FastQC v0.2 up on our website. The main changes are:
  1. Some basic colorspace support
  2. An option to create unzipped reports directly as well as the original zip files
  3. An option to customise the HTML reports to add your own site branding
  4. Adding an easily parsed summary file to allow pipelines to quickly flag potential problems

There are also numerous smaller fixes to make things work more smoothly.

You can get the new version from:

http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/

[If you don't see the new version of any page hit control+refresh to force our cache to update]
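
As a rough idea of how a pipeline might use the summary file, here is a Java sketch that lists every test not marked as passing. The layout is an assumption, since it isn't described in this post: one test per line, tab-separated, with the status (PASS, WARN or FAIL) in the first column and the test name in the second. Check the file your version produces before relying on this. The class name SummaryFlags is made up.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical pipeline helper; the file layout is an assumption (see above).
public class SummaryFlags {

    /** Return "STATUS: test name" for every line whose status is not PASS. */
    public static List<String> flagged(List<String> lines) {
        List<String> problems = new ArrayList<String>();
        for (String line : lines) {
            String[] fields = line.split("\t");
            if (fields.length < 2) {
                continue; // skip blank or malformed lines
            }
            String status = fields[0].trim();
            if (!status.equalsIgnoreCase("PASS")) {
                problems.add(status + ": " + fields[1].trim());
            }
        }
        return problems;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java SummaryFlags <summary file>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8);
        for (String problem : flagged(lines)) {
            System.out.println(problem);
        }
    }
}
```

An empty output means nothing was flagged, so a wrapper script can decide whether a run needs a closer look without opening the HTML report.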
Old 05-06-2010, 07:17 AM   #20
NGSfan
Senior Member
 
Location: Austria

Join Date: Apr 2009
Posts: 181

Quote:
Originally Posted by simonandrews View Post
I've just put FastQC v0.2 up on our website. The main changes are:
  1. Some basic colorspace support
  2. An option to create unzipped reports directly as well as the original zip files
  3. An option to customise the HTML reports to add your own site branding
  4. Adding an easily parsed summary file to allow pipelines to quickly flag potential problems

There are also numerous smaller fixes to make things work more smoothly.

You can get the new version from:

http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/

[If you don't see the new version of any page hit control+refresh to force our cache to update]
Thanks for the update, Simon! Those extra features are really good.

Some feature requests: could you add the ability to

1) read in gzipped FASTQ files
2) reload reports to view them again

Thanks for this package - nice and convenient

Tags
fastq, quality, report
