  • #91
    I've just put up v0.4.3 on our website which fixes the sequence count problem.



    • #92
      find contaminant sequence

      hello Simon,

      I use FastQC to evaluate my sequence data.
      The last part of the report is the contaminant (overrepresented sequences) section.

      Total Sequences 9265299
      Sequence length 42

      It looks like this:
      {
      >>Overrepresented sequences fail
      #Sequence Count Percentage Possible Source
      GATCGGAAGAGCTCGTATGCCGTCTTCTGCTTAGATCGGAAG 119288 1.2874705932317998 Illumina Single End Apapter 2 (96% over 32bp)
      GATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGAAAAAAAAA 112538 1.2146181143209733 Illumina Single End Apapter 2 (100% over 33bp)
      AATTCGAATATCTGCCGAATGCCGTGTGGACGTAAGCGTGAA 29127 0.3143665412200945 No Hit
      GATCGGAAGAGCTGTATGCCGTCTTCTGCTTAGATCGGAAGA 24460 0.2639957976531572 No Hit
      AATTCACAGGTGTTCTCCCGTATTGTTGACATGCCAGCGGGT 20305 0.21915104952360417 No Hit
      AATTCCCCTTGATTGCAAGGGGAACGAAATAGACAGATCGCT 17190 0.18553097962623763 No Hit
      }

      How can I find these contaminant sequences in the full data set?
      Should I use FastQC, a BioPerl module, or some other algorithm?

      Is the quality of this data too poor to use for analysis?


      Thank you very much



      • #93
        Originally posted by flower6991 View Post
        hello Simon,

        I use FastQC to evaluate my sequence data.
        The last part of the report is the contaminant (overrepresented sequences) section.

        Total Sequences 9265299
        Sequence length 42

        It looks like this:
        {
        >>Overrepresented sequences fail
        #Sequence Count Percentage Possible Source
        GATCGGAAGAGCTCGTATGCCGTCTTCTGCTTAGATCGGAAG 119288 1.2874705932317998 Illumina Single End Apapter 2 (96% over 32bp)
        GATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGAAAAAAAAA 112538 1.2146181143209733 Illumina Single End Apapter 2 (100% over 33bp)
        AATTCGAATATCTGCCGAATGCCGTGTGGACGTAAGCGTGAA 29127 0.3143665412200945 No Hit
        GATCGGAAGAGCTGTATGCCGTCTTCTGCTTAGATCGGAAGA 24460 0.2639957976531572 No Hit
        AATTCACAGGTGTTCTCCCGTATTGTTGACATGCCAGCGGGT 20305 0.21915104952360417 No Hit
        AATTCCCCTTGATTGCAAGGGGAACGAAATAGACAGATCGCT 17190 0.18553097962623763 No Hit
        }
        So this is saying that you have some adapter contamination in your sample. You've probably lost 5-10% of your sequences to this contamination, but there's no reason to think that the rest of it won't be usable.
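As a rough cross-check of the figures in the quoted report: the percentage column is simply the count divided by the total number of sequences. The sketch below recomputes it for the two rows that FastQC matched to an adapter. The variable and function names are my own, and it only covers the rows pasted above, not whatever else the full report may list.

```python
# Figures copied from the quoted FastQC report.
TOTAL_SEQUENCES = 9265299

# The two rows that FastQC matched to an adapter in the pasted output.
adapter_hits = {
    "GATCGGAAGAGCTCGTATGCCGTCTTCTGCTTAGATCGGAAG": 119288,
    "GATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGAAAAAAAAA": 112538,
}


def percentage(count, total=TOTAL_SEQUENCES):
    """Share of all reads, in percent, represented by `count` reads."""
    return 100.0 * count / total


per_sequence = {seq: percentage(n) for seq, n in adapter_hits.items()}
combined = percentage(sum(adapter_hits.values()))

for seq, pct in per_sequence.items():
    print(f"{seq}\t{pct:.4f}%")
print(f"adapter-matched rows combined\t{combined:.4f}%")
```

These two rows alone come to about 2.5% of the reads. Reads that carry the adapter at a different offset, or with sequencing errors, are not counted in these rows, so the overall loss can be higher than this figure.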

        Originally posted by flower6991 View Post
        How can I find these contaminant sequences in the full data set?
        Should I use FastQC, a BioPerl module, or some other algorithm?
        FastQC is not intended to be a filter; it only reports on the state of your data. There are plenty of other tools out there which you can use to remove these contaminants if you need to do that before running the rest of your analyses.
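To give an idea of what such a filtering step does, here is a minimal sketch in Python. It is not part of FastQC and not a substitute for a dedicated trimming tool: it uses exact matching only, so reads with sequencing errors in the adapter will be missed. The adapter string is the 32-base prefix shared by the two adapter hits in the report above; the function names, `min_match` and `min_length` are illustrative assumptions.

```python
import io

# 32-base prefix shared by the two adapter-matched rows in the report.
ADAPTER = "GATCGGAAGAGCTCGTATGCCGTCTTCTGCTT"


def trim_adapter(seq, qual, adapter=ADAPTER, min_match=10):
    """Cut the read at the first exact adapter occurrence.

    Also catches an adapter that runs off the 3' end of the read, as long
    as at least `min_match` adapter bases are present. Exact matches only.
    """
    pos = seq.find(adapter)
    if pos == -1:
        first = max(0, len(seq) - len(adapter) + 1)
        last = len(seq) - min_match
        for start in range(first, last + 1):
            if adapter.startswith(seq[start:]):
                pos = start
                break
    if pos == -1:
        return seq, qual
    return seq[:pos], qual[:pos]


def read_fastq(handle):
    """Yield (header, sequence, quality) from a four-line-per-record FASTQ."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


def filter_reads(records, min_length=20):
    """Trim every read and drop those shorter than `min_length` afterwards."""
    kept, dropped = [], 0
    for header, seq, qual in records:
        seq, qual = trim_adapter(seq, qual)
        if len(seq) >= min_length:
            kept.append((header, seq, qual))
        else:
            dropped += 1
    return kept, dropped


# Made-up 42 bp read: 20 bases of insert followed by 22 bases of adapter.
demo = io.StringIO("@read1\n" + "ACGT" * 5 + ADAPTER[:22] + "\n+\n" + "I" * 42 + "\n")
kept, dropped = filter_reads(read_fastq(demo))
print(kept, dropped)
```

A read that is adapter from its first base, like the top two sequences in the report, trims to length zero and is dropped.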

        Originally posted by flower6991 View Post
        Is the quality of this data too poor to use for analysis?
        There's nothing in this result to suggest that - it simply shows that the data is contaminated. You need to look at the rest of the results as well to assess the overall quality of your data.

        FastQC output shouldn't be taken too literally. A red cross against one or more tests doesn't necessarily mean that you should throw your data away. I can think of legitimate reasons why some data sets would fail every single one of the tests, and that's OK. What the program aims to do is to point things out to you ("Did you know that 3 sequences make up 50% of your data?" etc). Beyond that it's really up to you to decide what to do: conclude that the data is too poor to use, go ahead but bear the FastQC results in mind in your interpretation, or decide that the warning is spurious for the type of data you're analysing.

        For example - every one of our PhiX control lanes now fails QC as assessed by FastQC because the degree of sequence duplication is ridiculously high. This is both a correct and irrelevant result. In a supposedly diverse library this would indicate a real problem, but in a PhiX lane we expect that. You have to judge the results based on your knowledge of the experiment.



        • #94
          Fastqc: Version 0.5.0

          When I run fastqc in its home directory, ~/bin/FastQC, I get this error:

          java -Xmx250m -classpath . uk.ac.bbsrc.babraham.FastQC.FastQCApplication


          Exception in thread "main" java.lang.NoClassDefFoundError: uk/ac/bbsrc/babraham/FastQC/FastQCApplication

          java version "1.5.0_17"
          Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_17-b04)
          Java HotSpot(TM) 64-Bit Server VM (build 1.5.0_17-b04, mixed mode)



          • #95
            Originally posted by fabrice View Post
            Fastqc: Version 0.5.0

            When I run fastqc in its home directory, ~/bin/FastQC, I get this error:

            java -Xmx250m -classpath . uk.ac.bbsrc.babraham.FastQC.FastQCApplication


            Exception in thread "main" java.lang.NoClassDefFoundError: uk/ac/bbsrc/babraham/FastQC/FastQCApplication
            This will be because you have an existing classpath defined and you need to add the new directory to it, rather than replacing it.

            If you're running fastqc on a unix system from the command line it's much easier to use the wrapper script which is included in the distribution.

            In your case you'd initially need to do:

            chmod 755 ~/bin/FastQC/fastqc

            ..then in future you can do:

            ~/bin/FastQC/fastqc [your list of files]
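For the classpath point at the top of this reply, "add the new directory rather than replacing it" can be sketched as below. This is only an illustration of that idea in Python, not what the wrapper script does internally. The helper names and the example paths are made up; the java options and class name are the ones from the quoted command.

```python
import os


def extended_classpath(new_dir, env=None):
    """Return a classpath with `new_dir` appended to any existing CLASSPATH.

    `env` defaults to the real environment; pass a dict to experiment.
    """
    env = os.environ if env is None else env
    existing = env.get("CLASSPATH", "")
    parts = [p for p in existing.split(os.pathsep) if p]
    if new_dir not in parts:
        parts.append(new_dir)
    return os.pathsep.join(parts)


def java_command(fastqc_dir, env=None):
    """Build the launch command from the post, with the extended classpath."""
    return [
        "java",
        "-Xmx250m",
        "-classpath",
        extended_classpath(fastqc_dir, env),
        "uk.ac.bbsrc.babraham.FastQC.FastQCApplication",
    ]


# Hypothetical existing classpath and install directory.
print(java_command("/opt/FastQC", {"CLASSPATH": "/usr/share/java"}))
```

The resulting list can be handed to `subprocess.run` if you want to launch java from a script instead of using the wrapper.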



            • #96
              The fastqc script does not work from the command line.
              On mac:
              java -version
              java version "1.6.0_20"
              Java(TM) SE Runtime Environment (build 1.6.0_20-b02-279-10M3065)
              Java HotSpot(TM) 64-Bit Server VM (build 16.3-b01-279, mixed mode)

              ./fastqc aa.txt
              Exception in thread "main" java.lang.NoClassDefFoundError: uk/ac/bbsrc/babraham/FastQC/FastQCApplication
              Caused by: java.lang.ClassNotFoundException: uk.ac.bbsrc.babraham.FastQC.FastQCApplication
              at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
              at java.security.AccessController.doPrivileged(Native Method)
              at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
              at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
              at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
              at java.lang.ClassLoader.loadClass(ClassLoader.java:248)

              On debian:

              java -version
              java version "1.5.0"
              gij (GNU libgcj) version 4.3.2

              Copyright (C) 2007 Free Software Foundation, Inc.
              This is free software; see the source for copying conditions. There is NO
              warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

              ./fastqc a.txt
              Exception in thread "main" java.lang.NoClassDefFoundError: uk.ac.bbsrc.babraham.FastQC.FastQCApplication
              at gnu.java.lang.MainThread.run(libgcj.so.90)
              Caused by: java.lang.ClassNotFoundException: uk.ac.bbsrc.babraham.FastQC.FastQCApplication not found in gnu.gcj.runtime.SystemClassLoader{urls=[file:./,file:~/bin/FastQC/,file:~/bin/FastQC/], parent=gnu.gcj.runtime.ExtensionClassLoader{urls=[], parent=null}}
              at java.net.URLClassLoader.findClass(libgcj.so.90)
              at java.lang.ClassLoader.loadClass(libgcj.so.90)
              at java.lang.ClassLoader.loadClass(libgcj.so.90)
              at gnu.java.lang.MainThread.run(libgcj.so.90)


              Originally posted by simonandrews View Post
              This will be because you have an existing classpath defined and you need to add the new directory to it, rather than replacing it.

              If you're running fastqc on a unix system from the command line it's much easier to use the wrapper script which is included in the distribution.

              In your case you'd initially need to do:

              chmod 755 ~/bin/FastQC/fastqc

              ..then in future you can do:

              ~/bin/FastQC/fastqc [your list of files]



              • #97
                On Ubuntu:

                java -version
                java version "1.6.0_18"
                Java(TM) SE Runtime Environment (build 1.6.0_18-b07)
                Java HotSpot(TM) 64-Bit Server VM (build 16.0-b13, mixed mode)

                Exception in thread "main" java.lang.NoClassDefFoundError: uk/ac/bbsrc/babraham/FastQC/FastQCApplication
                Caused by: java.lang.ClassNotFoundException: uk.ac.bbsrc.babraham.FastQC.FastQCApplication
                at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
                at java.security.AccessController.doPrivileged(Native Method)
                at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
                at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
                at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
                at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
                Could not find the main class: uk.ac.bbsrc.babraham.FastQC.FastQCApplication. Program will exit.



                • #98
                  Originally posted by fabrice View Post
                  Exception in thread "main" java.lang.NoClassDefFoundError: uk/ac/bbsrc/babraham/FastQC/FastQCApplication
                  Could you by any chance have downloaded the source distribution instead of the compiled version? The errors are all saying that java can't find the initial class file, which it should be able to if the classpath is set correctly.

                  Can you look in uk/ac/bbsrc/babraham/FastQC/ and check whether there is a file called FastQCApplication.class? If you see a file called FastQCApplication.java instead, then you've got the source files rather than the binaries.



                  • #99
                    The files are:

                    Analysis FastQCApplication.java Graphs Modules Resources Sequence
                    Dialogs FastQCMenuBar.java Help Report Results Statistics

                    Originally posted by simonandrews View Post
                    Could you by any chance have downloaded the source distribution instead of the compiled version? The errors are all saying that java can't find the initial class file, which it should be able to if the classpath is set correctly.

                    Can you look in uk/ac/bbsrc/babraham/FastQC/ and check whether there is a file called FastQCApplication.class? If you see a file called FastQCApplication.java instead, then you've got the source files rather than the binaries.



                    • Originally posted by fabrice View Post
                      The files are:
                      FastQCApplication.java
                      Those are the source code files (which is why it won't run). You need to download the compiled version, either the generic zip file or the Mac application bundle.
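If you want to script the check described above, a small sketch along these lines would do it. The package path and file names are the ones mentioned in this thread; the function name and the returned labels are just for illustration.

```python
import tempfile
from pathlib import Path

# Package directory and file names as given in the posts above.
PACKAGE_DIR = Path("uk/ac/bbsrc/babraham/FastQC")


def distribution_kind(install_dir):
    """Guess whether `install_dir` holds the compiled or the source download."""
    pkg = Path(install_dir) / PACKAGE_DIR
    if (pkg / "FastQCApplication.class").is_file():
        return "compiled"
    if (pkg / "FastQCApplication.java").is_file():
        return "source"
    return "unknown"


# Demo on a throwaway directory that mimics the source distribution.
with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / PACKAGE_DIR
    pkg.mkdir(parents=True)
    (pkg / "FastQCApplication.java").touch()
    demo_result = distribution_kind(tmp)

print(demo_result)
```

On a real install you would call `distribution_kind` with the directory you unpacked FastQC into.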



                      • FastQC v0.5.1 has been released. This fixes a formatting bug in the text output and a bug in the %GC profile for runs containing reads >100bp.

                        We've also improved the fitting of the modelled curve to the %GC profile and have added a load more oligos to the contaminants file (thanks to Aaron Statham for providing these).

                        You can get the new version from:

                        http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/


                        [If you don't see the new version of any page hit shift+refresh to force our cache to update]



                        • I should add that for announcements of new releases to all of the software from Babraham (FastQC, SeqMonk, Bismark etc) you can now follow us on twitter at @babraham_bioinf. I'll still post more complete announcements here though.



                          • Hi Simon,
                            May I know how to use the tool in a Linux environment? Thanks.



                            • Originally posted by seq_GA View Post
                              Hi Simon,
                              May I know how to use the tool in a Linux environment? Thanks.
                              Instructions for installing and running the program on a variety of platforms are in the INSTALL.txt document which comes in the distribution. On Linux, the easiest way to launch the program is probably the wrapper script included in the distribution.



                              • Hi Simon,
                                Thanks for your response. I am trying to use this as part of a pipeline, and hence didn't try accessing the Linux server through win32.

                                I tried the commands below and got the following error. Please let me know how to proceed.

                                Code:
                                FastQC]$ chmod 755 fastqc
                                FastQC]$ ./fastqc 
                                Exception in thread "main" java.awt.HeadlessException: 
                                No X11 DISPLAY variable was set, but this program performed an operation which requires it.
                                        at java.awt.GraphicsEnvironment.checkHeadless(GraphicsEnvironment.java:173)
                                        at java.awt.Window.<init>(Window.java:437)
                                        at java.awt.Frame.<init>(Frame.java:419)
                                        at java.awt.Frame.<init>(Frame.java:384)
                                        at javax.swing.JFrame.<init>(JFrame.java:180)
                                        at uk.ac.bbsrc.babraham.FastQC.FastQCApplication.<init>(FastQCApplication.java:256)
                                        at uk.ac.bbsrc.babraham.FastQC.FastQCApplication.main(FastQCApplication.java:91)
                                Thanks.
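The exception above says that the program tried to open a window while no X11 DISPLAY variable was set. A pipeline script can at least detect that situation before launching anything graphical. The check below is a generic sketch with a hypothetical function name; it says nothing about how FastQC itself should be invoked without a display.

```python
import os


def has_x11_display(env=None):
    """Return True if a non-empty DISPLAY variable is set.

    `env` defaults to the real environment; pass a dict to test other cases.
    """
    env = os.environ if env is None else env
    return bool(env.get("DISPLAY", "").strip())


if has_x11_display():
    print("DISPLAY is set; a GUI program can try to open a window.")
else:
    print("No DISPLAY set; GUI start-up will fail with a HeadlessException.")
```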

