  • #46
    I'd be interested in that, and I'd be prepared to submit some data.

    NGSfan: I'm mainly interested in benchmarking our system against others. Because I'm running the machine, I'm always interested in how well it's performing in comparison to other sequencing service providers!



    • #47
      Excellent utility Simon. Thank you.

      I'm running into what looks like an old bug, however. I'm using FASTQC version 0.3.1 on a SunOS 5.10 server and I'm getting a HeadlessException. Any tips on solving this?

      Code:
      Exception in thread "main" java.awt.HeadlessException: 
      No X11 DISPLAY variable was set, but this program performed an operation which requires it.
              at sun.java2d.HeadlessGraphicsEnvironment.getDefaultScreenDevice(HeadlessGraphicsEnvironment.java:65)
              at javax.swing.RepaintManager.getVolatileOffscreenBuffer(RepaintManager.java:583)
              at javax.swing.JComponent.paintDoubleBuffered(JComponent.java:4911)
              at javax.swing.JComponent.paint(JComponent.java:996)
              at uk.ac.bbsrc.babraham.FastQC.Graphs.QualityBoxPlot.paint(QualityBoxPlot.java:81)
              at uk.ac.bbsrc.babraham.FastQC.Graphs.QualityBoxPlot.paint(QualityBoxPlot.java:75)
              at uk.ac.bbsrc.babraham.FastQC.Modules.PerBaseQualityScores.makeReport(PerBaseQualityScores.java:184)
              at uk.ac.bbsrc.babraham.FastQC.Report.HTMLReportArchive.<init>(HTMLReportArchive.java:63)
              at uk.ac.bbsrc.babraham.FastQC.Analysis.OfflineRunner.processFile(OfflineRunner.java:82)
              at uk.ac.bbsrc.babraham.FastQC.Analysis.OfflineRunner.<init>(OfflineRunner.java:28)
              at uk.ac.bbsrc.babraham.FastQC.FastQCApplication.main(FastQCApplication.java:71)
      Last edited by lparsons; 06-03-2010, 08:30 AM.



      • #48
        Originally posted by lparsons
        I'm running into what looks like an old bug, however. I'm using FASTQC version 0.3.1 on a SunOS 5.10 server and I'm getting a HeadlessException. Any tips on solving this?

        Code:
        Exception in thread "main" java.awt.HeadlessException: 
        No X11 DISPLAY variable was set, but this program performed an operation which requires it.
                at sun.java2d.HeadlessGraphicsEnvironment.getDefaultScreenDevice(HeadlessGraphicsEnvironment.java:65)
        That's really strange. It's throwing a HeadlessException from within the HeadlessGraphicsEnvironment! That means the headless environment is being set correctly (failing to set it was the original bug, which was fixed in an earlier revision).

        At first glance this looks like it has to be a bug in the core java class - especially as it seems to be SunOS specific.

        As a test, can you try setting a DISPLAY environment variable and see whether it then works? It may be a redundant check for something which isn't actually required.
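One way to run the suggested test without changing the login environment is to set DISPLAY only for the child process. The following is a minimal sketch, not part of FastQC: the helper names `env_with_display` and `run_fastqc` are hypothetical, the value `:0` is just a placeholder, and the main class name is taken from the stack trace above. Whether the value has to point at a real X server is exactly what the test would reveal.

```python
import os
import subprocess

# Main class name as it appears in the stack trace.
MAIN_CLASS = "uk.ac.bbsrc.babraham.FastQC.FastQCApplication"


def env_with_display(display=":0", base_env=None):
    """Return a copy of the environment with DISPLAY set.

    ":0" is only a placeholder value (an assumption).
    The original mapping is left untouched.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["DISPLAY"] = display
    return env


def run_fastqc(fastq_path, classpath, display=":0"):
    """Launch FastQC through java with DISPLAY set for the child process only."""
    cmd = ["java", "-classpath", classpath, MAIN_CLASS, fastq_path]
    return subprocess.run(cmd, env=env_with_display(display), check=False)
```

Calling `run_fastqc("sequence.fastq", "/path/to/FastQC/")` (both arguments are placeholders) would then show whether the exception still occurs.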



        • #49
          Nice work Simon, this is a simple and easy to use package.



          • #50
            Originally posted by Thomas Doktor
            The qualities look fine so it's not an issue of bad base calling. I think you could be right that the cluster calling and/or sequencing chemistry could explain some of it. Could perhaps explain why certain sequences in the genome are less likely to be sequenced, we often see peaks and valleys in exons in our RNA-seq runs which are most likely explained by sequencing artefacts.
            I have seen the same phenomenon, but only with our mRNA-Seq libraries. Our genomic libraries do not show any biases. Has anyone else experienced this? Could it be an artifact of the Illumina library preparation protocol, maybe at the fragmentation step?



            • #51
              The Illumina RNA protocol uses random hexamers to prime cDNA synthesis from the RNA. The thing is, they are not 100% random, so the base composition at the beginning of the reads looks skewed, but that comes from the priming rather than from the sequencing itself.

              For mapping it's no problem. For assembly it might confuse some assemblers. (When assembling I would trim the 5' end of the RNA reads; for mapping I would not.)
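To see the skew described here, and to trim it away before assembly, something along these lines could be used. It is a rough sketch under assumptions: the function names are hypothetical, reads are plain strings, and the number of bases to trim is left to the caller because the post does not give one.

```python
from collections import Counter


def base_composition(reads, n_positions):
    """Fraction of each base at each of the first n_positions of the reads."""
    counts = [Counter() for _ in range(n_positions)]
    for seq in reads:
        for i, base in enumerate(seq[:n_positions]):
            counts[i][base] += 1
    fractions = []
    for counter in counts:
        total = sum(counter.values())
        # Positions beyond the end of every read get an empty dict.
        fractions.append({b: n / total for b, n in counter.items()} if total else {})
    return fractions


def trim_5prime(seq, qual, n_trim):
    """Drop the first n_trim bases together with their quality characters."""
    return seq[n_trim:], qual[n_trim:]
```

In an unbiased library each position should show roughly the same composition; a position where one base dominates at the start of the reads is the kind of skew being discussed.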



              • #52
                I just came across a reference to the following article in a different thread.
                Generation of cDNA using random hexamer priming induces biases in the nucleotide composition at the beginning of transcriptome sequencing reads from the Illumina Genome Analyzer. The bias is independent of organism and laboratory and impacts the uniformity of the reads along the transcriptome. We pr …

                It also attributes the biases to random priming.


                Eric



                • #53
                  FastQC v0.4 released

                  I've just put FastQC v0.4 up on our website.

                  FastQC v0.4 introduces a new analysis module, an easier way to launch the program from the command line and a new output file, as well as fixing a few minor bugs.

                  The new analysis module is the sequence duplication level module. This is a complement to the existing overrepresented sequences module in that it also looks at sequences which occur more than once in your data. The new module takes a more global view and reports what proportion of all of your sequences occur once, twice, three times etc. In a diverse library most sequences should occur only once. A highly enriched library may have some duplication, but higher levels of duplication may indicate a problem, such as PCR overamplification.
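The idea behind duplication levels can be sketched in a few lines. This is an illustration only, not FastQC's implementation: the function name `duplication_levels` is hypothetical, and it assumes the proportion is taken over all reads (the description could also be read as a proportion of distinct sequences).

```python
from collections import Counter


def duplication_levels(sequences):
    """Map duplication level -> fraction of all reads at that level.

    Level 1 means the sequence was seen once, level 2 twice, and so on.
    """
    per_sequence = Counter(sequences)  # sequence -> times seen
    total_reads = sum(per_sequence.values())
    reads_at_level = Counter()
    for times_seen in per_sequence.values():
        # A sequence seen k times contributes k reads to level k.
        reads_at_level[times_seen] += times_seen
    return {level: n / total_reads for level, n in sorted(reads_at_level.items())}
```

For a diverse library nearly all of the mass should sit at level 1; a large share at high levels is the pattern that would point to a problem.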

                  In response to several requests we've also now introduced a new output file into the report. This is a text-based, tab-delimited file which includes all of the data shown in the graphs in the graphical report. This would allow people running pipelines to store the data generated by FastQC and analyse it systematically rather than just taking the pass/fail/warn summary, or reviewing the reports manually.
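For pipeline use, a report like this could be read along the following lines. The layout assumed here (module blocks opened by `>>Name<TAB>status`, a `#`-prefixed header row, and a closing `>>END_MODULE`) is taken from later FastQC releases and is an assumption for v0.4; `parse_fastqc_data` is a hypothetical helper, not something shipped with the program.

```python
def parse_fastqc_data(lines):
    """Parse module blocks from a FastQC-style tab-delimited report.

    Returns {module_name: {"status": str, "header": list, "rows": list}}.
    The block markers are an assumed layout (see the note above).
    """
    modules = {}
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith(">>END_MODULE"):
            current = None
        elif line.startswith(">>"):
            name, _, status = line[2:].partition("\t")
            current = {"status": status, "header": [], "rows": []}
            modules[name] = current
        elif current is None or not line:
            continue  # file-level comment lines and blank lines
        elif line.startswith("#"):
            current["header"] = line[1:].split("\t")
        else:
            current["rows"].append(line.split("\t"))
    return modules
```

Values are kept as strings so that nothing is lost; a pipeline would convert the columns it cares about.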

                  Finally, if you're running fastqc from the command line we've now included a 'fastqc' wrapper script which you can launch directly rather than having to construct a java launch command. You can still pass -Dxxx options through to the program, but for simple analyses you can now simply run:

                  fastqc [some files]

                  ...once you have added the FastQC install directory to your path. More details are in the install document.

                  You can get the new version from:

                  http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/

                  [If you don't see the new version of any page hit control+refresh to force our cache to update]



                  • #54
                    Fantastic! I really like the command line ability - really good for pipelines.

                    Also nice that you display the Quality score type (Illumina v#/Sanger) in your output - helps to sort out confusion quickly when going through older data, especially after all of Illumina's schizophrenic quality score changes.



                    • #55
                      I'd like to run FastQC on SOLiD reads. I saw that someone did this using solid2fastq. Is it possible to do it without running solid2fastq? I.e., would it work with only the SOLiD 'quals' file?

                      EDIT: After running FastQC on SOLiD files converted to fastq files via solid2fastq, the results file says (under basic statistics):
                      File type Conventional base calls

                      Should it have recognized it as colorspace?

                      Thanks!
                      Last edited by agc; 06-20-2010, 03:59 AM.



                      • #56
                        Hi Simon,

                        Thanks for the new features in FastQC v0.4.
                        I just installed v0.4 but got the error below when running it on a fastq file (I had previously run v0.3 on this file with no issues.)

                        Processing sequence.fastq
                        Approx 5% complete for sequence.fastq
                        Exception in thread "AWT-AppKit" Exception in thread "Thread-3" java.lang.OutOfMemoryError: Java heap space



                        • #57
                          Originally posted by mard
                          Processing sequence.fastq
                          Approx 5% complete for sequence.fastq
                          Exception in thread "AWT-AppKit" Exception in thread "Thread-3" java.lang.OutOfMemoryError: Java heap space
                          The error is because the program ran out of memory. The new version will use a bit more memory than the previous version since it looks at more sequences for the overrepresented sequence module. I've tested it with up to four 20million+ files open at the same time though and it was OK.

                          Can you let me know the exact command you are using to launch the program? If you're using the full java command you need to ensure that you add the -Xmx250m option to allocate a larger-than-default memory block to the program. If you use the fastqc wrapper then this should be added automatically.



                          • #58
                            Originally posted by agc
                            I'd like to run FastQC on SOLiD reads. I saw that someone did this using solid2fastq. Is it possible to do it without running solid2fastq? I.e., would it work with only the SOLiD 'quals' file?
                            It will work with colorspace fastq files - you don't need to convert to base calls. I don't work with SOLiD data directly so I'm not sure whether this is produced directly by the pipeline or not. I'm happy to look at other alternatives for SOLiD data, but the program is fairly tied to fastq format (i.e. it needs to work with a sequence and an encoded quality string).

                            Originally posted by agc
                            EDIT: After running FastQC on SOLiD files converted to fastq files via solid2fastq, the results file says (under basic statistics):
                            File type Conventional base calls

                            Should it have recognized it as colorspace?
                            It depends on the conversion. If you look in the file you'll either see conventional base calls (something like GATCTCTAGATCTCT) or colorspace calls (something like G1324132431432434312). If you see colorspace calls and the report says conventional calls then can you send me the top few lines of the file and I can see why it's going wrong. It may be that your conversion program converted to base calls already though.

                            If FastQC gets the file type wrong it's normally pretty obvious since most of the graphs will show very weird results.
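The check described above (look at the top few sequence lines and see whether they hold letters or digits) can be scripted. This is a rough heuristic, not FastQC's own detection logic; the function names are hypothetical, and it assumes plain four-line fastq records. SOLiD colour calls normally use the digits 0 to 3, but any digit is accepted here so that the schematic example from the post also matches.

```python
from itertools import islice

BASES = set("ACGTN.")
COLOR_CHARS = set("0123456789.")


def guess_call_type(sequence):
    """Classify one sequence line as 'colorspace', 'base calls' or 'unknown'."""
    seq = sequence.strip().upper()
    if not seq:
        return "unknown"
    rest = seq[1:]
    # Colorspace: one leading base, then digits ('.' marks a missing call).
    if (seq[0] in "ACGTN" and rest and set(rest) <= COLOR_CHARS
            and any(c.isdigit() for c in rest)):
        return "colorspace"
    if set(seq) <= BASES:
        return "base calls"
    return "unknown"


def top_sequences(lines, n_records=3):
    """Return the sequence line (2nd of every 4 lines) of the first records."""
    seq_lines = (line.strip() for i, line in enumerate(lines) if i % 4 == 1)
    return list(islice(seq_lines, n_records))
```

Running `guess_call_type` over the output of `top_sequences(open(path))` for a converted file would show quickly whether the conversion already produced base calls.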



                            • #59
                              Originally posted by simonandrews
                              The error is because the program ran out of memory. The new version will use a bit more memory than the previous version since it looks at more sequences for the overrepresented sequence module. I've tested it with up to four 20million+ files open at the same time though and it was OK.

                              Can you let me know the exact command you are using to launch the program? If you're using the full java command you need to ensure that you add the -Xmx250m option to allocate a larger-than-default memory block to the program. If you use the fastqc wrapper then this should be added automatically.

                              Thanks for the quick reply Simon.

                              The command I'm using is:

                              Code:
                              java -Xmx250m -classpath /Tools/FastQC/ uk.ac.bbsrc.babraham.FastQC.FastQCApplication sequence.fastq
                              and the sequence.fastq file I'm running it on is 2.9 GB (~17 million 75 bp reads)



                              • #60
                                Originally posted by mard
                                Thanks for the quick reply Simon.

                                The command I'm using is:

                                Code:
                                java -Xmx250m -classpath /Tools/FastQC/ uk.ac.bbsrc.babraham.FastQC.FastQCApplication sequence.fastq
                                and the sequence.fastq file I'm running it on is 2.9 GB (~17 million 75 bp reads)
                                Maybe it's the longer sequence length which is causing the problem. Can you try changing the -Xmx250m to -Xmx500m and see if that works?
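If the launch command is built by a script, the heap setting can be swapped without retyping the rest of the line. A small sketch, with `with_heap` as a hypothetical helper; it only rewrites the argument list and makes no claim about how much memory a given file needs.

```python
def with_heap(argv, size):
    """Return a copy of a java command with its -Xmx option set to `size`."""
    option = "-Xmx" + size
    replaced = False
    result = []
    for arg in argv:
        if arg.startswith("-Xmx"):
            result.append(option)
            replaced = True
        else:
            result.append(arg)
    if not replaced:
        # JVM options must precede the main class, so insert right after "java".
        result.insert(1, option)
    return result
```

Applied to the command quoted above, `with_heap(cmd, "500m")` gives the same argument list with `-Xmx500m` in place of `-Xmx250m`.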
