
  • FastQC Report

    I sequenced environmental samples on a HiSeq. The purpose of the run was to BLAST my reads against the NCBI nr database to see which species they match. I am not doing a de novo assembly or genome assembly.

    My FastQC report passes every module except per base sequence content, per sequence GC content, and Kmer content.
    Should I be worried? How much should I rely on a FastQC report?

    Thank you in advance!

  • #2
    I would worry most about the per base sequence content, depending on what the plot looks like and why it didn't pass.


    • #3
      Originally posted by mastal
      I would worry most about the per base sequence content, depending on what it looks like, why it didn't pass.
      Below are the plots from the FastQC modules that failed:
      Attached Files


      • #4
        The per base sequence content looks OK, though you might need to trim the last few bases at the 3' ends of the reads. The Kmer plot suggests there might also be adapters at the 3' ends of the reads.
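
        To make the per base sequence content check concrete, here is a minimal Python sketch that computes the fraction of each base at every read position and flags positions where A and T (or G and C) diverge. The function names are hypothetical, and the 20% threshold is an assumption modeled on the failure criterion described in the FastQC documentation; it is not FastQC's actual code.

```python
from collections import Counter


def per_base_content(reads):
    """Return, for each read position, the fraction of A, C, G and T."""
    counts = []  # one Counter per read position
    for seq in reads:
        for i, base in enumerate(seq.upper()):
            if i == len(counts):
                counts.append(Counter())
            counts[i][base] += 1
    fractions = []
    for c in counts:
        total = sum(c[b] for b in "ACGT")  # N and other symbols are ignored
        fractions.append({b: (c[b] / total if total else 0.0) for b in "ACGT"})
    return fractions


def flag_biased_positions(fractions, threshold=0.20):
    """Return 0-based positions where |A - T| or |G - C| exceeds threshold.

    The default of 0.20 is an assumed cutoff, chosen for illustration.
    """
    return [
        i
        for i, f in enumerate(fractions)
        if abs(f["A"] - f["T"]) > threshold or abs(f["G"] - f["C"]) > threshold
    ]
```

        Flagged positions only show where the composition is skewed. They do not say whether the cause is a base-calling problem, leftover adapter, or a library prep bias, so the plot still has to be read in context.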


        • #5
          I would trim the first 19 bp at the 5' end (which are probably adapters) and the last 50 bp at the 3' end.

          I would also suggest increasing the Kmer size to k=10 in FastQC to get a better idea of how much to trim at the 3' end.

          All the best with your project.

          -Zapages


          • #6
            Originally posted by Zapages
            I would trim up the first 19 bps at the 5' end (which probably are the adapters) and trim the last 50 bps at the 3' end.
            I think not; Nextera libraries normally look like that at the beginning due to shearing bias, but the bases are correct. The 3' end looks like adapter sequence, though, and should be adapter-trimmed.


            • #7
              Originally posted by Zapages
              I would trim up the first 19 bps at the 5' end (which probably are the adapters) and trim the last 50 bps at the 3' end.

              Also I would suggest increasing the kmer count to k 10 in FastQC to get a better idea of things for the 3' end for how much to trim.

              All the best with your project.

              -Zapages
              No trimming necessary. Refer to this post by Dr. Simon Andrews, author of FastQC.


              • #8
                Originally posted by GenoMax
                No trimming necessary. Refer to this post by Dr. Simon Andrews, author of FastQC.
                Very interesting development, and something I often wondered about when I was working on my own data sets.

                Since the biased composition is created by the selection of sequencing fragments and not by base call errors the only effect of trimming would be to change from having a library which starts over biased positions, to having a library which starts slightly downstream of biased positions.

                Thank you for sharing.

                I did a lot of RNA-Seq analysis last year and earlier this year, and this was not known at the time. When I have free time, I will definitely go back and check some of my old results to see if there is any improvement in my differential expression results.

                Whilst the warnings generated by this problem reflect a real issue it’s not something which can be fixed, and doesn’t seem to have any serious consequences for downstream analysis. Ironically if you are producing RNA-Seq libraries it would make for better QC if you were to focus on libraries which didn’t have this artefact in them, as they would be the ones which were truly suspicious.
                I guess we should go with more expensive PCR-free approaches: https://konradpaszkiewicz.wordpress....biased-genome/

                Thoughts?

                Would you recommend this approach for older data generated with TruSeq library prep kits, or for data with really messy 5' ends? Thinking back, I remember dealing with some pretty messy RNA-Seq data from Illumina HiSeq 2500 machines that had to be cleaned up. I will give my old results another look when I am free.


                • #9
                  Originally posted by Brian Bushnell
                  I think not; Nextera libraries normally look like that at the beginning due to shearing bias, but the bases are correct. The 3' end looks like adapter sequence, though, and should be adapter-trimmed.
                  Hey, it was adapter-trimmed at the 3' end! So I'm not sure what is going on. Any suggestions?


                  • #10
                    Originally posted by GenoMax
                    No trimming necessary. Refer to this post by Dr. Simon Andrews, author of FastQC.
                    This would explain the 5' end, but how about the 3' end?


                    • #11
                      You may have used the wrong adapter sequences, or simply had incomplete trimming. I suggest starting with the raw reads and performing adapter-trimming as in the post I linked, then looking at the results.
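
                      As a rough illustration of why the wrong adapter sequence or incomplete trimming leaves residue at the 3' end, here is a minimal Python sketch of exact-match 3' adapter trimming. The function name, the `min_overlap` parameter, and the choice of the Nextera transposase adapter sequence are assumptions for illustration; this is not the method from the linked post, and real trimmers also tolerate mismatches and use read pairing.

```python
# Assumed adapter for illustration: the Nextera transposase sequence.
NEXTERA_ADAPTER = "CTGTCTCTTATACACATCT"


def trim_3prime_adapter(seq, adapter=NEXTERA_ADAPTER, min_overlap=5):
    """Remove the adapter, and everything after it, from the 3' end of seq.

    Handles a full adapter anywhere in the read, and a partial adapter
    (a prefix of at least min_overlap bases) at the very end of the read.
    Only exact matches are detected.
    """
    idx = seq.find(adapter)
    if idx != -1:
        return seq[:idx]
    # Look for the longest adapter prefix that the read ends with.
    longest = min(len(adapter) - 1, len(seq))
    for k in range(longest, min_overlap - 1, -1):
        if seq.endswith(adapter[:k]):
            return seq[:-k]
    return seq
```

                      A read ending in fewer than `min_overlap` adapter bases, or carrying a different adapter than the one supplied, passes through untouched, which is one way adapter signal can survive a trimming step and still show up in the Kmer plot.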
