**GenoMax** · 11-11-2014, 03:59 AM

It *may* be indicative of contamination from an unrelated species/source. Have you tried to analyze the data? Is this a simple WGS experiment?

**bastianwur** · 11-12-2014, 01:49 AM

What should the normal GC content be? 41%? Is there anything within the genome that could have the other GC content?

I once also had 2 peaks in some samples.
It was a low-GC bacterium (30%). The second peak (50%) turned out to come entirely from the rRNA operons within this bacterium. Our guess was that the GC bias of the adapter ligation somehow kicked in and ruined the dataset. The supplier doesn't know what happened.
I'm not sure if that could be the case here, because I don't know if you have biological differences within the DNA in your sample, but it is probably worth checking.

**standonn** · 11-12-2014, 04:51 AM

Hello GenoMax and bastianwur,

Thanks a lot for your answers.

We don't know what the GC content is for this species. We think it is around 35-40%, as in other worms.

After talking to the people in my lab, the second peak around 70% could very well be due to a bacterium present in the gut of the worm.

Otherwise, the strain used is inbred, but I believe it still shows biological differences. I wouldn't say that would explain the 2nd peak though.

Do you think it is still possible to do a genome assembly on this data?

Anyhow, thanks for your answers,
Sophie

**bastianwur** · 11-12-2014, 05:03 AM

We normally assemble metagenomes and metatranscriptomes here, and haven't encountered many problems with the different species.
One of my colleagues has a paper in submission where they investigated that and got very few false assemblies.
-> Assembling 2 totally different organisms from this dataset shouldn't be a problem.
You might have to do some QA though, to ensure that everything gets correctly assigned/separated.
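
One rough way to start that QA is to bin the assembled contigs by GC content. The sketch below is only an illustration: the function names and the 0.55 cutoff are assumptions (roughly midway between the ~35-40% and ~70% peaks mentioned in this thread), and GC alone is a crude signal, so borderline contigs would still need checking by alignment or coverage.

```python
import io

def read_fasta(handle):
    """Yield (name, sequence) pairs from a FASTA file handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].strip(), []
        elif line and name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)

def gc_fraction(seq):
    """GC fraction over called bases (A/C/G/T); N and other symbols are ignored."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / called

def bin_contigs_by_gc(handle, threshold=0.55):
    """Split contigs into low-GC and high-GC bins.

    threshold is a hypothetical cutoff; choose it from your own GC plot.
    """
    bins = {"low_gc": [], "high_gc": []}
    for name, seq in read_fasta(handle):
        gc = gc_fraction(seq)
        key = "high_gc" if gc >= threshold else "low_gc"
        bins[key].append((name, gc))
    return bins

# Tiny made-up example; replace with open("contigs.fa") for real data.
demo = io.StringIO(">c1\nATATATATGC\n>c2\nGGCC\nGCGC\n")
bins = bin_contigs_by_gc(demo)
```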

**HESmith** · 11-12-2014, 12:50 PM

Hi Sophie,

We observed a similar bimodal distribution from C. elegans samples contaminated with Streptomyces (and the relative height of the high-GC peak varied with the degree of contamination). You could BLAST a sampling of the GC-rich reads and see if they match any known species.
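
Pulling out a sample of GC-rich reads for BLAST could look something like the sketch below. It assumes a plain four-line-per-record FASTQ; the 0.60 GC cutoff, the sample size, and all function names are illustrative choices, not anything prescribed in this thread.

```python
import io
import random

def gc_fraction(seq):
    """GC fraction over called bases (A/C/G/T); N is ignored."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / called

def read_fastq(handle):
    """Yield (name, sequence) from a four-line-per-record FASTQ handle."""
    while True:
        header = handle.readline()
        if not header:
            return
        seq = handle.readline().strip()
        handle.readline()  # '+' separator line
        handle.readline()  # quality line (not needed here)
        yield header.strip()[1:], seq

def sample_gc_rich(handle, min_gc=0.60, n=100, seed=0):
    """Reservoir-sample up to n reads whose GC fraction is >= min_gc."""
    rng = random.Random(seed)
    sample, seen = [], 0
    for name, seq in read_fastq(handle):
        if gc_fraction(seq) < min_gc:
            continue
        seen += 1
        if len(sample) < n:
            sample.append((name, seq))
        else:
            j = rng.randrange(seen)
            if j < n:
                sample[j] = (name, seq)
    return sample

def to_fasta(records):
    """Format (name, sequence) pairs as FASTA text, ready to paste into BLAST."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records)

# Tiny made-up example; replace with open("reads.fastq") for real data.
demo = io.StringIO(
    "@r1\nATATATAT\n+\nIIIIIIII\n"
    "@r2\nGCGCGCGC\n+\nIIIIIIII\n"
    "@r3\nGGCCGGAT\n+\nIIIIIIII\n"
)
hits = sample_gc_rich(demo, min_gc=0.60, n=10)
fasta_text = to_fasta(hits)
```

Reservoir sampling keeps memory use bounded by the sample size, which matters when the FASTQ holds millions of reads.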

**GenoMax** · 11-12-2014, 05:14 PM

If you know what that bacterium (present in the gut) is (and if a genome is available for that species or a close relative) you could try to separate your reads into two pools before trying assembly.

You can do that easily with BBSplit.
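
As a rough sketch, a BBSplit run could be assembled from Python as below. The file names are hypothetical, and the parameters shown (`in=`, `ref=`, `basename=`, `outu=`) should be checked against the BBSplit help text for your BBTools version. With only the bacterial reference supplied, reads that map to it go to the `basename` file and everything else (presumably the worm) goes to `outu`.

```python
import shlex

def bbsplit_command(reads, references, basename="split_%.fq", unmapped="unassigned.fq"):
    """Build a bbsplit.sh argument list (the command is not executed here).

    reads      -- input FASTQ (hypothetical file name)
    references -- list of reference FASTA files, one output pool per reference
    basename   -- output pattern; '%' is replaced by each reference name
    unmapped   -- file for reads that match none of the references
    """
    if "%" not in basename:
        raise ValueError("basename needs a '%' placeholder for the reference name")
    return [
        "bbsplit.sh",
        f"in={reads}",
        "ref=" + ",".join(references),
        f"basename={basename}",
        f"outu={unmapped}",
    ]

# Hypothetical inputs: raw reads plus the genome of the suspected gut bacterium.
cmd = bbsplit_command("reads.fq", ["gut_bacterium.fa"], unmapped="putative_worm.fq")
print(shlex.join(cmd))
```

To actually run it, the list can be passed to `subprocess.run(cmd, check=True)` on a machine where BBTools is installed and `bbsplit.sh` is on the PATH.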

**standonn** · 11-21-2014, 10:32 AM

Dear all,

Sorry for the late reply.
Thanks a lot for your answers! They were much appreciated.

Unfortunately, I don't know the gut bacterium of this nematode. But I'll try what HESmith suggested, and if it turns out to be sequenced, I'll do what GenoMax suggested.

To GenoMax: thanks for telling me about BBSplit! I didn't know about that tool.

To bastianwur: Your message made me very happy! It is very good to know that there shouldn't be problems assembling this peculiar data. Good luck with the publication!

Cheers,
Sophie

FastQC: 2 peak per sequence GC content