
  • non-interactive mode works

    Hi, great thanks a lot! The non-interactive mode works fine and produces an .html file with the results. That's all I need.
    Thanks,
    Chris

    Comment


    • Hi,

      I haven't been following this thread in the meantime, but I previously had this issue with FastQC: the header format was causing it to think there were more tiles than sensible.

      Does anyone here know if this format is something which is actually generated by an Illumina sequencer, or is it something an individual or maybe the ENA have done to the file?
      Now I've come across the same issue with a completely different dataset, so I thought I'd let you know! It seems to be the same problem, as the devel version that fixed it last time also works on this data. The dataset is here: http://www.ncbi.nlm.nih.gov/geo/quer...acc=GSM1004802
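To illustrate how a header format can inflate the apparent tile count, here is a minimal Python sketch. It is not FastQC's own code. It assumes the common Illumina convention of colon-separated headers, with the tile as the fifth field in the Casava 1.8+ layout and the third field in the older layout. The function names and the `@sim:...` headers are hypothetical.

```python
from typing import Iterable, Optional


def guess_tile(header: str) -> Optional[str]:
    """Return the field a naive colon-based parser would treat as the tile."""
    # Drop the leading '@' and anything after the first space (the comment part).
    fields = header.lstrip("@").split(" ")[0].split(":")
    if len(fields) >= 7:
        # Casava 1.8+ style: instrument:run:flowcell:lane:tile:x:y
        return fields[4]
    if len(fields) >= 5:
        # Older style: instrument:lane:tile:x:y
        return fields[2]
    return None


def count_apparent_tiles(headers: Iterable[str]) -> int:
    """Count the distinct values that would be interpreted as tiles."""
    return len({t for t in map(guess_tile, headers) if t is not None})


# Two genuine-looking Illumina headers yield two tiles.
illumina = [
    "@M01:12:000000000-A1B2C:1:1101:15589:1332 1:N:0:1",
    "@M01:12:000000000-A1B2C:1:1102:15590:1340 1:N:0:1",
]

# Hypothetical headers where a read counter sits in the "tile" position:
# every read now looks like it came from a different tile.
odd = [f"@sim:1:{i}:0:0" for i in range(1000)]
```

With headers like the hypothetical ones in `odd`, a per-tile table would grow with every read, which is consistent with the memory problems described in this thread.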

      Comment


      • Originally posted by liz_is View Post
        Hi,

        I haven't been following this thread in the meantime, but I previously had this issue with FastQC: the header format was causing it to think there were more tiles than sensible.



        Now I've come across the same issue with a completely different dataset, so I thought I'd let you know! It seems to be the same problem, as the devel version that fixed it last time also works on this data. The dataset is here: http://www.ncbi.nlm.nih.gov/geo/quer...acc=GSM1004802
        Thanks for reporting this. I think this is the same issue as before and is fixed in the development version. We're really close to being able to release an update to finally address this. There are two outstanding bugs which we want to close:
        1. Some headers are mis-recognised as tile identifiers and suck up all available memory
        2. A non-thread-safe counter can cause the program to hang when processing multiple files at once (the processing is actually complete, but the program doesn't recognise that)


        Hopefully there will be a new release next week to sort these out.
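The second bug above is a classic concurrency problem. The sketch below shows the general idea of a completion counter guarded by a lock, so that concurrent updates cannot be lost. It is a Python illustration only, not FastQC's Java code, and `CompletionCounter` and `run_workers` are hypothetical names.

```python
import threading


class CompletionCounter:
    """Counts finished files; the lock makes each update atomic."""

    def __init__(self, total: int) -> None:
        self._total = total
        self._done = 0
        self._lock = threading.Lock()

    def mark_done(self) -> None:
        # Without the lock, two threads could read the same value and
        # both write back value + 1, losing one completion.
        with self._lock:
            self._done += 1

    def all_done(self) -> bool:
        with self._lock:
            return self._done == self._total


def run_workers(n_files: int) -> CompletionCounter:
    """Simulate n_files worker threads, each reporting completion once."""
    counter = CompletionCounter(n_files)
    threads = [threading.Thread(target=counter.mark_done) for _ in range(n_files)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter
```

If an update were lost, `all_done()` would never return true even though every file had been processed, which matches the hang described in point 2.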

        Comment


        • Anyone have any publicly available non-html output files?

          I'm looking for any FastQC results output tables. Any help pointing me in the right direction would be really appreciated.

          Comment


          • Originally posted by yoyoming1001 View Post
            I'm looking for any FastQC results output tables. Any help pointing me in the right direction would be really appreciated.
            Like I said in the other thread, you can download some public data from NCBI SRA and create your own.

            Comment


            • Yes, you can download some FASTQ files and then run FastQC on them.

              Comment


              • Originally posted by yoyoming1001 View Post
                I'm looking for any FastQC results output tables. Any help pointing me in the right direction would be really appreciated.
                All of the example reports shown on the fastqc project page also have the accompanying text output available (just not linked from the html page), so you're welcome to download those.

                www.bioinformatics.babraham.ac.uk/projects/fastqc/bad_sequence_fastqc.zip
                www.bioinformatics.babraham.ac.uk/projects/fastqc/good_sequence_short_fastqc.zip
                www.bioinformatics.babraham.ac.uk/projects/fastqc/small_rna_fastqc.zip
                www.bioinformatics.babraham.ac.uk/projects/fastqc/RNA-Seq_fastqc.zip
                www.bioinformatics.babraham.ac.uk/projects/fastqc/RRBS_fastqc.zip
                www.bioinformatics.babraham.ac.uk/projects/fastqc/pacbio_srr075104_fastqc.zip
                www.bioinformatics.babraham.ac.uk/projects/fastqc/454_SRR073599_fastqc.zip
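For anyone wanting to read the text output as tables, here is a rough Python sketch. It assumes the text report uses the layout seen in `fastqc_data.txt`: each module starts with a `>>Name<TAB>status` line, ends with `>>END_MODULE`, and marks column headers with a leading `#`. The `SAMPLE` text is invented for illustration and is not taken from any of the example reports.

```python
from typing import Dict, List, Tuple

Module = Tuple[str, List[List[str]]]  # (status, data rows)


def parse_fastqc_data(text: str) -> Dict[str, Module]:
    """Split a FastQC text report into {module name: (status, rows)}."""
    modules: Dict[str, Module] = {}
    name = None
    for line in text.splitlines():
        if line.startswith(">>END_MODULE"):
            name = None
        elif line.startswith(">>"):
            # Module header, e.g. ">>Basic Statistics\tpass"
            name, _, status = line[2:].partition("\t")
            modules[name] = (status, [])
        elif name is not None and line and not line.startswith("#"):
            # Data row inside a module; '#' lines are column headers.
            modules[name][1].append(line.split("\t"))
    return modules


# Invented sample in the assumed layout.
SAMPLE = "\n".join([
    "##FastQC\t0.11.3",
    ">>Basic Statistics\tpass",
    "#Measure\tValue",
    "Filename\texample.fastq",
    "Total Sequences\t250000",
    ">>END_MODULE",
    ">>Per base sequence quality\tfail",
    "#Base\tMean",
    "1\t32.1",
    ">>END_MODULE",
])
```

Calling `parse_fastqc_data(SAMPLE)["Basic Statistics"]` gives the module status together with its rows, which can then be passed to whatever table library you prefer.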

                Comment


                • FastQC v0.11.3 has just been released. It fixes a few annoying bugs which have been mentioned on here before, and adds some support for processing folders of nanopore sequencing reads in HDF5 format.

                  http://www.bioinformatics.babraham.a...ojects/fastqc/

                  Comment


                  • Dear Simon,
                    I am trying to run FastQC v0.11.2 on CentOS 6.6 and I get the following error message:

                    Exception in thread "main" java.lang.NoClassDefFoundError: org/itadaki/bzip2/BZip2InputStream
                    at uk.ac.babraham.FastQC.Sequence.SequenceFactory.getSequenceFile(SequenceFactory.java:104)
                    at uk.ac.babraham.FastQC.Sequence.SequenceFactory.getSequenceFile(SequenceFactory.java:62)
                    at uk.ac.babraham.FastQC.Analysis.OfflineRunner.processFile(OfflineRunner.java:122)
                    at uk.ac.babraham.FastQC.Analysis.OfflineRunner.<init>(OfflineRunner.java:95)
                    at uk.ac.babraham.FastQC.FastQCApplication.main(FastQCApplication.java:308)
                    Caused by: java.lang.ClassNotFoundException: org.itadaki.bzip2.BZip2InputStream
                    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
                    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
                    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
                    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)

                    I tried different Java versions, but to no avail. Any idea what's wrong?

                    Thanks

                    Comment


                    • Originally posted by makost View Post
                      Dear Simon,
                      I am trying to run FastQC v0.11.2 on CentOS 6.6 and I get the following error message:

                      Exception in thread "main" java.lang.NoClassDefFoundError: org/itadaki/bzip2/BZip2InputStream
                      at uk.ac.babraham.FastQC.Sequence.SequenceFactory.getSequenceFile(SequenceFactory.java:104)
                      at uk.ac.babraham.FastQC.Sequence.SequenceFactory.getSequenceFile(SequenceFactory.java:62)
                      at uk.ac.babraham.FastQC.Analysis.OfflineRunner.processFile(OfflineRunner.java:122)
                      at uk.ac.babraham.FastQC.Analysis.OfflineRunner.<init>(OfflineRunner.java:95)
                      at uk.ac.babraham.FastQC.FastQCApplication.main(FastQCApplication.java:308)
                      Caused by: java.lang.ClassNotFoundException: org.itadaki.bzip2.BZip2InputStream
                      at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
                      at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
                      at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
                      at java.lang.ClassLoader.loadClass(ClassLoader.java:357)

                      I tried different Java versions, but to no avail. Any idea what's wrong?

                      Thanks
                      That error suggests that one of the bundled jar files which ships with FastQC (jbzip2-0.9.jar) was missing from your installation.

                      Quickest and easiest fix would be to download the latest version, extract the contents and try that (which will contain a fresh copy of the missing library). If that doesn't work then email me directly and I can go through some other debugging with you.
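Before re-downloading, it can be worth confirming that the jar really is absent. The Python sketch below searches an installation folder for the file named above. The function name and the example path are hypothetical, and searching recursively is an assumption, since the exact location of the jar within the install is not stated here.

```python
from pathlib import Path
from typing import List


def find_jar(install_dir: str, jar_name: str = "jbzip2-0.9.jar") -> List[Path]:
    """Return every copy of jar_name found anywhere under install_dir."""
    return sorted(Path(install_dir).rglob(jar_name))


def report(install_dir: str) -> str:
    """Summarise whether the bundled jar is present."""
    hits = find_jar(install_dir)
    if hits:
        return f"found: {hits[0]}"
    return "jbzip2-0.9.jar not found; the installation looks incomplete"
```

For example, `print(report("/path/to/FastQC"))` with your own installation path. An empty result would be consistent with the `NoClassDefFoundError` in the trace above.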

                      Simon.

                      Comment


                      • Hi everyone,

                        I'm using the latest version of FastQC to examine simulated RNA-Seq data. The problem is that, because the data is simulated, there is no tile position in the header and I get a warning or an error (depending on the read length).

                        I know there should be an option to set "tile" to ignore in the limits.txt file. But however I try to supply my adjusted limits file (for example by editing the file in the original folder, or by pointing to an adjusted copy elsewhere via the "-l" argument), I still get the same output: a warning about the per-tile qualities for short reads, and the out-of-memory error for longer reads.
                        Too many tiles (>500) so giving up trying to do per-tile qualities since we're probably parsing the file wrongly
                        Exception in thread "Thread-1" java.lang.OutOfMemoryError: GC overhead limit exceeded
                        at java.util.Arrays.copyOfRange(Arrays.java:2694)
                        at java.lang.String.<init>(String.java:203)
                        at java.io.BufferedReader.readLine(BufferedReader.java:349)
                        at java.io.BufferedReader.readLine(BufferedReader.java:382)
                        at uk.ac.babraham.FastQC.Sequence.FastQFile.readNext(FastQFile.java:175)
                        at uk.ac.babraham.FastQC.Sequence.FastQFile.next(FastQFile.java:125)
                        at uk.ac.babraham.FastQC.Analysis.AnalysisRunner.run(AnalysisRunner.java:76)
                        at java.lang.Thread.run(Thread.java:744)
                        Which I interpret as per-tile quality still being executed (at least until fastQC realises that it won't work).


                        I am using v0.11.3 and have no other version on my systems. Nevertheless, I tried setting "adapter" to ignore instead of "tile", to check whether the parameter mix-up was still there:
                        Aaargh - I'd forgotten that one of the other pending fixes for the next release was that the disable didn't work for the per-tile module (it will actually disable it if you turn off the adapter module as it was reading the wrong parameter)
                        That worked, meaning the adapter module was turned off as it should be. So now I have to ask: is it possible that in the latest version FastQC no longer reads the adapter module's parameter for the tile module, but still somehow does not read the tile module's own parameter?

                        As my data is simulated and not uploaded anywhere, I cannot post a link. But since this is a matter of whether the tile module is executed or not, it should be reproducible with any FASTQ data.

                        I would be very glad if you could confirm this observation about turning off the tile module, or explain how to use the limits file correctly if my incorrect use is causing the problem.
                        Thanks in advance!
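For reference, the kind of limits-file adjustment described in this post can be sketched in Python as follows. It assumes each module switch is a whitespace-separated line of the form `<module> ignore <0|1>`, as in the limits.txt shipped with FastQC. `set_ignore` is a hypothetical helper, and whether the setting is honoured depends on the FastQC version, as discussed in this thread.

```python
import re


def set_ignore(limits_text: str, module: str, ignore: bool = True) -> str:
    """Return limits_text with '<module> ignore' set to 1 (or 0)."""
    flag = "1" if ignore else "0"
    # Match e.g. "tile<tabs/spaces>ignore<tabs/spaces>0" at the start of a line.
    pattern = re.compile(
        rf"^({re.escape(module)}[ \t]+ignore[ \t]+)\S+", re.MULTILINE
    )
    new_text, n = pattern.subn(lambda m: m.group(1) + flag, limits_text)
    if n == 0:
        # No existing entry: append one rather than silently doing nothing.
        new_text = limits_text.rstrip("\n") + f"\n{module}\tignore\t{flag}\n"
    return new_text


# Invented excerpt in the assumed format.
EXAMPLE = "# modules to skip\nadapter\tignore\t0\ntile\t\tignore\t0\n"
```

The edited text can be written to a separate file and passed via the "-l" argument mentioned above, which leaves the limits file in the installation folder untouched.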

                        Comment


                        • I've had a look at this and can kind of confirm what you're seeing. Turning on the ignore flag for the per-tile module does now exclude that module from the report. However, it wasn't stopping the statistics from being collected, which is why you were seeing the same problem even after disabling it.

                          I've modified the code so that the module shouldn't collect any stats which should fix your problem. Can you please try out the development snapshot below and see if that does what you need (let me know if you need an OSX version).

                          http://www.bioinformatics.babraham.a...11.4_devel.zip

                          Cheers

                          Simon.

                          Comment


                          • Thanks for having a look!

                            OK, I wouldn't know whether the tile module was included in the report or not, because I would always get a warning and no tile report because of my headers.

                            As far as I can see by now, the snapshot you posted seems to be doing exactly what it's supposed to do.
                            I'm neither getting a warning nor an error when I run it with "tile" set on ignore and all the other modules are still in the report.

                            (Sorry, in my last post I forgot to mention I'm running FastQC on a Linux system, so no, I don't need the OSX version.)

                            Comment


                            • Hi Simon,

                              Is there a way in FastQC to turn on reporting for every position, rather than the default 5 bp window, for some analyses like the "Per base sequence quality"?

                              Thanks,
                              Sridhar

                              Comment


                              • Originally posted by sridharacharya View Post
                                Hi Simon,

                                Is there a way in FastQC to turn on reporting for every position, rather than the default 5 bp window, for some analyses like the "Per base sequence quality"?

                                Thanks,
                                Sridhar
                                Yes, add the option "--nogroup" to your command line.
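If you drive FastQC from a script, the option can be added when building the command line. The Python sketch below assumes the `fastqc` launcher is on your PATH. `fastqc_command` is a hypothetical helper, and the file name is a placeholder.

```python
from typing import List, Optional, Sequence


def fastqc_command(
    files: Sequence[str],
    nogroup: bool = False,
    outdir: Optional[str] = None,
) -> List[str]:
    """Build an argument list for a non-interactive FastQC run."""
    cmd = ["fastqc"]
    if nogroup:
        # Report every base position instead of grouped windows.
        cmd.append("--nogroup")
    if outdir is not None:
        cmd += ["--outdir", outdir]
    # Naming input files on the command line runs FastQC without the GUI.
    cmd += list(files)
    return cmd
```

To run it, pass the list to `subprocess.run(fastqc_command(["reads.fastq.gz"], nogroup=True), check=True)`. Using a list rather than a single string avoids shell quoting problems with unusual file names.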

                                Comment
