  • standonn
    Member
    • Nov 2014
    • 14

    FastQC: 2 peak per sequence GC content

    Dear all,

    I have some genomic paired-end data from a nematode. I ran FastQC to get an overview of the data.

    I was surprised to see a "Per sequence GC content" graph with 2 peaks (see image attached).
    I ran Trimmomatic, but the per sequence GC content graph remained the same.
    Do you know why I get this profile?

    Best,
    Sophie
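For anyone who wants to reproduce the two peaks outside FastQC, here is a minimal sketch of a per-read GC histogram. It assumes plain 4-line FASTQ records; the function names are made up for illustration, and it does not try to replicate FastQC's exact binning or its fitted theoretical curve.

```python
from collections import Counter


def gc_percent(seq):
    """Return the GC percentage (0-100) of one read; empty reads count as 0."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def fastq_sequences(lines):
    """Yield the sequence line of each record, assuming 4 lines per record."""
    for i, line in enumerate(lines):
        if i % 4 == 1:
            yield line.strip()


def gc_histogram(sequences):
    """Count reads per GC percentage, rounded to the nearest integer."""
    return Counter(round(gc_percent(s)) for s in sequences)
```

Usage would be along the lines of `with open("reads_1.fastq") as fh: hist = gc_histogram(fastq_sequences(fh))` (the file name is a placeholder). Two well-separated maxima in `hist` correspond to the bimodal profile described above.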
  • GenoMax
    Senior Member
    • Feb 2008
    • 7142

    #2
    It *may* be indicative of contamination from an unrelated species/source. Have you tried to analyze the data? Is this a simple WGS experiment?

    • bastianwur
      Member
      • Feb 2014
      • 98

      #3
      What should the normal GC content be? 41%? Is there anything in the genome that could account for the other GC content?

      I also once had 2 peaks in some samples.
      It was a low-GC bacterium (30%). The second peak (50%) turned out to come entirely from the rRNA operons of this bacterium. Our guess was that the GC bias of the adapter ligation somehow kicked in and ruined the dataset. The supplier doesn't know what happened.
      I'm not sure whether that could be the case here, because I don't know if there are biological differences within the DNA in your sample, but it is probably worth checking.

      • standonn
        Member
        • Nov 2014
        • 14

        #4
        Hello GenoMax and bastianwur,

        Thanks a lot for your answers.

        We don't know the GC content of this species, but we think it is around 35-40%, as in other worms.

        After talking to the people in my lab, the second peak around 70% could very well be due to a bacterium present in the gut of the worm.

        Otherwise, the strain used is inbred, but I believe it still shows some biological variation. I wouldn't say that explains the 2nd peak, though.

        Do you think it is still possible to do a genome assembly with this data?

        Anyhow, thanks for your answers,
        Sophie

        • bastianwur
          Member
          • Feb 2014
          • 98

          #5
          We normally assemble metagenomes and metatranscriptomes here, and haven't encountered many problems with the different species.
          One of my colleagues has a paper in submission in which they investigated this and found very few false assemblies.
          -> assembling 2 totally different organisms from this dataset shouldn't be a problem.
          You might have to do some QA, though, to ensure that everything gets correctly assigned/separated.

          • HESmith
            Senior Member
            • Oct 2009
            • 512

            #6
            Hi Sophie,

            We observed a similar bimodal distribution from C. elegans samples contaminated with Streptomyces (and the relative height of the high-GC peak varied with the degree of contamination). You could BLAST a sampling of the GC-rich reads and see if they match any known species.
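A minimal sketch of the sampling step, assuming the reads are already parsed into (name, sequence) pairs. The 0.60 cutoff, the sample size, and the function names are illustrative assumptions, not values from this thread; the cutoff should sit between the two peaks in your own plot.

```python
import random


def gc_fraction(seq):
    """Return the GC fraction (0.0-1.0) of a read; empty reads count as 0."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def sample_gc_rich(records, min_gc=0.60, n=100, seed=0):
    """Pick up to n reads at or above min_gc from (name, sequence) pairs.

    min_gc=0.60 is an assumed cutoff between the two peaks.
    """
    rich = [(name, seq) for name, seq in records if gc_fraction(seq) >= min_gc]
    if len(rich) <= n:
        return rich
    return random.Random(seed).sample(rich, n)


def to_fasta(records):
    """Format (name, sequence) pairs as FASTA text, ready to paste into BLAST."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records)
```

A few hundred reads is usually plenty to paste into a BLAST search; if most hits point to the same bacterial genus, that is a good candidate for the contaminant.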

            • GenoMax
              Senior Member
              • Feb 2008
              • 7142

              #7
              If you know what that bacterium (present in the gut) is (and if a genome is available for that species or a close relative) you could try to separate your reads into two pools before trying assembly.

              You can do that easily with BBSplit.
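As a rough sketch, a BBSplit call could be assembled like this. The file names are placeholders, and the parameter names (`in`, `in2`, `ref`, `basename`, `outu`) are written as I understand them from the BBSplit help text, so please check them against `bbsplit.sh --help` for your BBTools version, in particular how paired output files are named. With only the bacterial genome as reference, the unmapped reads would be the putative nematode pool.

```python
import shlex


def bbsplit_command(reads1, reads2, references,
                    out_pattern="split_%.fq", unmapped="unmapped.fq"):
    """Build (but do not run) a bbsplit.sh command line as a list of arguments.

    out_pattern must contain '%', which BBSplit replaces with the reference name.
    All file names here are hypothetical placeholders.
    """
    if "%" not in out_pattern:
        raise ValueError("out_pattern must contain '%'")
    return [
        "bbsplit.sh",
        f"in={reads1}",
        f"in2={reads2}",
        "ref=" + ",".join(references),
        f"basename={out_pattern}",
        f"outu={unmapped}",
    ]


cmd = bbsplit_command("reads_1.fq", "reads_2.fq", ["gut_bacterium.fa"])
# Show the command for inspection; run it with subprocess.run(cmd, check=True).
print(shlex.join(cmd))
```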

              • standonn
                Member
                • Nov 2014
                • 14

                #8
                Dear all,

                Sorry for the late reply.
                Thanks a lot for your answers! They were much appreciated.

                Unfortunately, I don't know the gut bacterium of this nematode. But I'll try what HESmith suggested, and if it turns out to be sequenced, I'll do what GenoMax suggested.

                To GenoMax: thanks for telling me about BBSplit! I didn't know about that tool.

                To bastianwur: Your message made me very happy! It is very good to know that there shouldn't be problems assembling this peculiar data. Good luck with the publication!

                Cheers,
                Sophie
