SEQanswers

simonandrews 04-26-2010 03:33 AM

FastQC: A quality control application for FastQ data
 
I have just put up on our website the first release of an application we have developed to perform QC checks on high throughput sequence data.

FastQC runs a series of tests and will flag up any potential problems with your data.

The program can either be run as an interactive GUI application or it can run in an unattended offline mode where it generates HTML versions of its reports.

We've been using this on some of our data for a few weeks and have found it really useful for looking at aspects of your data which the standard instrument QC checks may miss.

FastQC is free software under the GPLv3. You can download it from:

http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/

...where there is also a sample report which you can look at.

[Please note that the rather aggressive BBSRC cache may show you old versions of some pages - if you can't see FastQC on some of our pages please press shift+refresh in your browser to force an update which bypasses the cache].

We are keen to get feedback from other sites - in particular we'd like to know:
  • Are there other tests you think would be useful?
  • Are the criteria we're using to warn about potentially bad data any good (and can you suggest improvements)?

I hope this proves useful to some people here.

Simon.

shurjo 04-26-2010 07:31 PM

Hi Simon,

I would really like to use FastQC for my project but am getting the following error message when I try to run it non-interactively on our Linux cluster:

$ java -Xmx250m -cp ~/bin/fastqc/FastQC uk.ac.bbsrc.babraham.FastQC.FastQCApplication testFastQC.fastq
Exception in thread "main" java.awt.HeadlessException:
No X11 DISPLAY variable was set, but this program performed an operation which requires it.
at java.awt.GraphicsEnvironment.checkHeadless(GraphicsEnvironment.java:159)
at java.awt.Window.<init>(Window.java:431)
at java.awt.Frame.<init>(Frame.java:403)
at java.awt.Frame.<init>(Frame.java:368)
at javax.swing.JFrame.<init>(JFrame.java:158)
at uk.ac.bbsrc.babraham.FastQC.FastQCApplication.<init>(FastQCApplication.java:197)
at uk.ac.bbsrc.babraham.FastQC.FastQCApplication.main(FastQCApplication.java:63)

These are the details for our java installation:

$ java -version
java version "1.6.0_17"
Java(TM) SE Runtime Environment (build 1.6.0_17-b04)
Java HotSpot(TM) 64-Bit Server VM (build 14.3-b01, mixed mode)

Any pointers?

Thanks,

Shurjo

ohofmann 04-26-2010 07:45 PM

Your cluster setup is blocking X11 / the graphical interface (or, if it does allow X11, you are not exporting the display to your local machine).

simonandrews 04-26-2010 11:28 PM

You might want to try setting a DISPLAY environment variable (export DISPLAY=:0.0) even if you're running on a headless system. If you specify a filename when launching FastQC then no windows should open, but because the program uses some Swing classes behind the scenes, Java might be getting itself confused.

I'll try to replicate this on one of our servers and see if I can trigger the same problem.

lletourn 04-27-2010 06:39 AM

I also tried putting -Djava.awt.headless=true with the java command but it didn't work. The export DISPLAY works though.

simonandrews 04-27-2010 07:02 AM

OK, it turns out there are two problems here.

One, as lletourn pointed out, is that if you're running on a headless server you need to add -Djava.awt.headless=true to the java command.

There is also a change I need to make internally to FastQC to stop it trying to set up a graphical window (which it never displays) if it's running non-interactively.

I'll try to get an update out tomorrow with a fix for the internal problem and better instructions.
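
For unattended runs on a headless server, the pieces mentioned so far in the thread (the -Xmx250m heap setting, the -Djava.awt.headless=true property and the FastQCApplication main class) can be assembled by a small wrapper script. The following is only a sketch in Python: the function name build_fastqc_command and the install path in the usage note are made up for illustration, and it does not replace the internal fix described above.

```python
def build_fastqc_command(install_dir, fastq_files, headless=True, max_heap="250m"):
    """Assemble the java command line from this thread as a list of arguments."""
    cmd = ["java", f"-Xmx{max_heap}"]
    if headless:
        # JVM options have to come before the main class name.
        cmd.append("-Djava.awt.headless=true")
    cmd += ["-classpath", install_dir,
            "uk.ac.bbsrc.babraham.FastQC.FastQCApplication"]
    # Naming the files on the command line selects non-interactive operation.
    cmd += list(fastq_files)
    return cmd
```

The returned list could be handed to subprocess.run(cmd, check=True), for example with build_fastqc_command("/usr/local/FastQC", ["testFastQC.fastq"]). Passing a list rather than a single string avoids shell quoting problems with unusual file names.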

cadlag 04-27-2010 12:43 PM

the error message


C:\download\FastQC>java -Xmx250m -classpath . uk.ac.bbsrc.babraham.FastQC.FastQCApplication

Exception in thread "Thread-5" java.lang.IllegalArgumentException: No knonwn encodings with chars < 33
at uk.ac.bbsrc.babraham.FastQC.Sequence.PhredEncoding.getFastQEncodingOffset(PhredEncoding.java:30)
at uk.ac.bbsrc.babraham.FastQC.Modules.PerBaseQualityScores.getPercentages(PerBaseQualityScores.java:65)
at uk.ac.bbsrc.babraham.FastQC.Modules.PerBaseQualityScores.getResultsPanel(PerBaseQualityScores.java:56)
at uk.ac.bbsrc.babraham.FastQC.Results.ResultsPanel.analysisComplete(ResultsPanel.java:117)
at uk.ac.bbsrc.babraham.FastQC.Analysis.AnalysisRunner.run(AnalysisRunner.java:84)
at java.lang.Thread.run(Unknown Source)

shurjo 04-27-2010 04:11 PM

Quote:

Originally Posted by simonandrews (Post 17607)
I'll try to get an update out tomorrow with a fix for the internal problem and better instructions.

Thanks! I will wait eagerly.

simonandrews 04-28-2010 12:57 AM

Quote:

Originally Posted by cadlag (Post 17634)
the error message
java.lang.IllegalArgumentException: No knonwn encodings with chars < 33

That's interesting. What is the source of the FastQ file which failed? According to Wikipedia (so it must be true), there aren't any quality encoding variants which use characters lower than 33.

If you'd be happy to let me have a copy of the FastQ file which is failing I'll take a look - contact me off list. If not, I'll add some more debugging to the next release so it will still fail but might give more of a clue as to the parameters it's seeing.
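
Judging by the stack trace, the exception comes from the step that works out which quality encoding a file uses from the lowest quality character seen. A rough Python sketch of that kind of check is below. It is not FastQC's actual code: the function name guess_phred_offset and the cut-off of 59 (the lowest character used by the older offset-64 Solexa encoding) are assumptions for illustration; only the "nothing below 33" rule comes from the thread.

```python
def guess_phred_offset(quality_strings):
    """Guess the ASCII offset (33 or 64) from the lowest quality character."""
    lowest = min((ord(ch) for q in quality_strings for ch in q), default=None)
    if lowest is None:
        raise ValueError("no quality characters supplied")
    if lowest < 33:
        # The condition reported in the thread: no encoding uses characters below 33.
        raise ValueError(
            f"No known encodings with chars < 33 (lowest seen: {lowest})")
    if lowest < 59:
        # Assumption: characters below ';' (59) only occur with an offset of 33.
        return 33
    # Assumption: otherwise treat the data as one of the offset-64 encodings.
    return 64
```

Reporting the lowest value seen, as the error message here does, is the kind of extra debugging that would help pin down which character in the file is responsible.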

simonandrews 04-28-2010 02:32 AM

FastQC v0.1.1 is now up on our website. This contains a fix for the problem with headless operation. You should just be able to run the program as described in the original install document (no need to add extra property settings) and it should now work as long as you specify the file(s) to process on the command line.

[If you can't see the update on our site please press shift+refresh in your browser to force it to update the cache]

Please let me know if this fixes things.

Thomas Doktor 04-28-2010 05:07 AM

Looks great, Simon. I'm suddenly aware of the skewed base composition at the beginning of the sequences in some of our runs, but it levels off and becomes basically uniform at 25% from around base 15 onwards. The runs are otherwise fine. Has anyone seen similar results/artefacts?

simonandrews 04-28-2010 05:11 AM

We've seen similarly odd biases, both in sequence composition and unusually low qualities at the start of some runs, and I know of other groups who've also seen this. Normally it's only a minor effect, but in samples which are of generally poorer quality it can be really noticeable.

I don't know of an explanation for this. If it affects qualities as well as base calls I'd guess it would be a bias in the sequencing chemistry or the cluster calling?

Thomas Doktor 04-28-2010 05:17 AM

The qualities look fine so it's not an issue of bad base calling. I think you could be right that the cluster calling and/or sequencing chemistry could explain some of it. That could perhaps explain why certain sequences in the genome are less likely to be sequenced; we often see peaks and valleys in exons in our RNA-seq runs which are most likely explained by sequencing artefacts.
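
The per-position base composition being discussed here can be checked independently of FastQC with a few lines of code. The sketch below uses a hypothetical helper (per_position_base_percentages is not part of FastQC) that takes read sequences as plain strings and reports the percentage of each base at every position. The levelling-off described above would show up as percentages that stop changing from one position to the next.

```python
from collections import Counter

def per_position_base_percentages(reads, bases="ACGT"):
    """Return one dict per read position mapping each base to its percentage."""
    counts = []  # counts[i] tallies every character seen at position i
    for read in reads:
        for i, base in enumerate(read.upper()):
            if i == len(counts):
                counts.append(Counter())
            counts[i][base] += 1
    result = []
    for position_counts in counts:
        # The total includes any other characters (such as N) at this position.
        total = sum(position_counts.values())
        result.append({b: 100.0 * position_counts[b] / total for b in bases})
    return result
```

Reads of different lengths are handled by only counting the reads that actually reach a given position.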

NGSfan 04-29-2010 08:09 AM

Hi! Thanks for sharing this program - I like the idea of getting a summary look at the FASTQ files.

I tried from 3 different Linux boxes, but get the same error:


java -Xmx250m -classpath . uk.ac.bbsrc.babraham.FastQC.FastQCApplication

Exception in thread "main" java.lang.NoClassDefFoundError: uk/ac/bbsrc/babraham/FastQC/FastQCApplication
Caused by: java.lang.ClassNotFoundException: uk.ac.bbsrc.babraham.FastQC.FastQCApplication
at java.net.URLClassLoader$1.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
Could not find the main class: uk.ac.bbsrc.babraham.FastQC.FastQCApplication. Program will exit.

Is there something obvious I'm missing here? Sorry - it's been ages since I've programmed in Java.

shurjo 04-29-2010 09:54 AM

Quote:

Originally Posted by simonandrews (Post 17666)
FastQC v0.1.1 is now up on our website. This contains a fix for the problem with headless operation. You should just be able to run the program as described in the original install document (no need to add extra property settings) and it should now work as long as you specify the file(s) to process on the command line.

Hi Simon,

The download site still shows FastQC v0.1 even after clearing my cache. Am I missing something here?

Thanks,

Shurjo

simonandrews 04-29-2010 11:32 PM

Quote:

Originally Posted by shurjo (Post 17765)
The download site still shows FastQC v0.1 even after clearing my cache. Am I missing something here?

Sorry, the download page was copied over to the wrong folder. Try it again now and the links should be updated.

simonandrews 04-29-2010 11:36 PM

Quote:

Originally Posted by NGSfan (Post 17750)
Hi! Thanks for sharing this program - I like the idea of getting a summary look at the FASTQ's

I tried from 3 different linux boxes, but get the same error :


java -Xmx250m -classpath . uk.ac.bbsrc.babraham.FastQC.FastQCApplication

Exception in thread "main" java.lang.NoClassDefFoundError: uk/ac/bbsrc/babraham/FastQC/FastQCApplication

If you're running the program from outside the install directory you need to set the classpath to the directory which contains the FastQC installation, e.g.:

java -Xmx250m -classpath /usr/local/FastQC uk.ac.bbsrc.babraham.FastQC.FastQCApplication

If you have a non-standard classpath on your machine (which you probably won't by default) you may need to append your existing classpath:

java -Xmx250m -classpath /usr/local/FastQC:$CLASSPATH uk.ac.bbsrc.babraham.FastQC.FastQCApplication

Hopefully that should get you up and running.
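
The classpath logic in the two commands above can be captured in a small helper for scripts that launch FastQC. This is a sketch only: fastqc_classpath is a made-up name, and /usr/local/FastQC in the usage note is just the example path from this post.

```python
import os

def fastqc_classpath(install_dir, existing=None):
    """Return a classpath with the FastQC install directory first."""
    if existing is None:
        # Fall back to whatever CLASSPATH is set in the environment, if any.
        existing = os.environ.get("CLASSPATH", "")
    # Drop empty entries so an unset CLASSPATH leaves no trailing separator.
    parts = [install_dir] + [p for p in existing.split(os.pathsep) if p]
    return os.pathsep.join(parts)
```

For example, fastqc_classpath("/usr/local/FastQC") gives the value to pass after -classpath. Using os.pathsep means the separator is ":" on Linux, as in the command above, and ";" on Windows.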

NGSfan 04-30-2010 04:07 AM

Quote:

Originally Posted by simonandrews (Post 17808)
If you have a non-standard classpath on your machine (which you probably won't by default) you may need to append your existing classpath:

java -Xmx250m -classpath /usr/local/FastQC:$CLASSPATH uk.ac.bbsrc.babraham.FastQC.FastQCApplication

Thank you Simon, the second command worked. I guess the classpath we have is non-standard!

FastQC works great! Very helpful tool - a great way to get a quick look at the data.

simonandrews 05-06-2010 05:01 AM

I've just put FastQC v0.2 up on our website. The main changes are:
  1. Some basic colorspace support
  2. An option to create unzipped reports directly as well as the original zip files
  3. An option to customise the HTML reports to add your own site branding
  4. Adding an easily parsed summary file to allow pipelines to quickly flag potential problems

There are also numerous smaller fixes to make things work more smoothly.

You can get the new version from:

http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/

[If you don't see the new version of any page hit control+refresh to force our cache to update]
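
The summary file in item 4 is meant for this kind of pipeline check. The thread does not describe its layout, so the sketch below assumes one test per line with tab-separated fields, the status (for example PASS, WARN or FAIL) first and the test name second. Both that layout and the helper name flagged_tests are assumptions to adjust against a real report.

```python
def flagged_tests(summary_text, bad_statuses=("WARN", "FAIL")):
    """Return (status, test name) pairs for tests whose status needs attention."""
    flagged = []
    for line in summary_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # skip blank lines and anything not in the assumed layout
        status = fields[0].strip().upper()
        if status in bad_statuses:
            flagged.append((status, fields[1].strip()))
    return flagged
```

A pipeline could read the summary file into a string, call flagged_tests on it, and stop or raise an alert when the returned list is not empty.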

NGSfan 05-06-2010 07:17 AM

Quote:

Originally Posted by simonandrews (Post 18115)
I've just put FastQC v0.2 up on our website.

Thanks for the update, Simon! Those extra features are really good.

Some feature requests: could you add the ability to

1) read in gzipped FASTQ files
2) reload reports to view them again

Thanks for this package - nice and convenient
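
On the first request, reading gzipped FASTQ mostly comes down to wrapping the file in a decompressing stream before parsing. The following is a minimal Python sketch of the idea, not a description of how FastQC does or will do it. open_fastq and read_fastq are hypothetical helpers, and the four-lines-per-record layout assumed here is the usual unwrapped FASTQ convention.

```python
import gzip

def open_fastq(path):
    """Open a FASTQ file as text, decompressing on the fly if it ends in .gz."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")

def read_fastq(path):
    """Yield (header, sequence, quality) tuples from a four-line-per-record file."""
    with open_fastq(path) as handle:
        while True:
            header = handle.readline()
            if not header:
                break  # end of file
            sequence = handle.readline()
            handle.readline()  # the '+' separator line is not needed
            quality = handle.readline()
            yield (header.rstrip("\r\n"), sequence.rstrip("\r\n"),
                   quality.rstrip("\r\n"))
```

Because the choice is made from the file name alone, the same loop works for plain and compressed files without unpacking anything to disk first.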


All times are GMT -8.