**Brian Bushnell** · 12-10-2016, 01:50 PM

You might try removing human sequence, then assembling the rest and BLASTing the contigs against nt/nr/RefSeq microbial. Assuming the contigs are longer than read length, they will give you more reliable hits. What kind of depth do you have for the bacteria? You can find that out with a kmer-frequency histogram, after human reads are removed.
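As a rough illustration of the kmer-frequency histogram idea, here is a minimal sketch in plain Python. The function names, the choice of k, and the "ignore singletons" heuristic are assumptions for the example, not part of any particular tool; real datasets need a dedicated k-mer counter, since this keeps every k-mer in memory and counts the forward strand only.

```python
from collections import Counter


def kmer_counts(reads, k=21):
    """Count every k-mer across all reads (forward strand only, for simplicity)."""
    counts = Counter()
    for read in reads:
        seq = read.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if "N" not in kmer:  # skip k-mers containing ambiguous bases
                counts[kmer] += 1
    return counts


def kmer_histogram(counts):
    """Map each multiplicity to the number of distinct k-mers seen that many times."""
    return Counter(counts.values())


def peak_depth(hist, min_count=2):
    """Return the multiplicity with the most distinct k-mers, ignoring
    multiplicities below min_count (assumed here to be mostly sequencing errors).
    Returns None if nothing qualifies."""
    candidates = {mult: n for mult, n in hist.items() if mult >= min_count}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


if __name__ == "__main__":
    # Toy reads; in practice these would be the reads left after human removal.
    reads = ["ACGTACGT", "ACGTACGT", "TTTT"]
    hist = kmer_histogram(kmer_counts(reads, k=4))
    for mult in sorted(hist):
        print(mult, hist[mult])
    print("approximate k-mer depth:", peak_depth(hist))
```

The position of the main peak gives an approximate k-mer depth for the dominant organism; in a metagenome there may be several peaks, or none if coverage is too low.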

**bloosnail** · 12-10-2016, 10:59 PM

Thank you for the quick response. The idea of assembling the reads into contigs before alignment makes sense; I will let my supervisor know. Do you know of good software to do this? I have tried Velvet in the past but did not use it extensively.

I forgot to mention that we have removed human sequences, although the revised reference genome that you created seems like it would be especially useful for us.

Could you give more information on how to estimate the depth of the bacteria? There are generally fewer than 100,000 bacterial reads per sample out of 20-30 million initial reads (before any trimming/contaminant removal).

**Brian Bushnell** · 12-11-2016, 09:42 AM

I suggest SPAdes or MEGAHIT for metagenome assembly. 100k is not many reads; you might not have sufficient depth for assembly. But in that case, you may get a better assembly by combining all bacterial reads from all samples and assembling them together. Then you can quantify each sample by mapping its reads to the combined assembly.
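To put rough numbers on the depth concern, and to show one way the "quantify by mapping" step might look, here is a small sketch. The read length (150 bp), genome size (5 Mbp), and the per-contig mapped-read counts are all assumed values for illustration; the thread does not give them, and the function names are hypothetical.

```python
def expected_depth(num_reads, read_length, genome_size):
    """Average fold coverage if all reads came from a single genome."""
    return num_reads * read_length / genome_size


def relative_abundance(mapped_counts, contig_lengths, read_length):
    """Length-normalized coverage per contig, scaled so the values sum to 1.

    mapped_counts:  {contig_name: reads from one sample mapped to that contig}
    contig_lengths: {contig_name: contig length in bp}
    """
    coverage = {
        name: mapped_counts.get(name, 0) * read_length / length
        for name, length in contig_lengths.items()
    }
    total = sum(coverage.values())
    if total == 0:
        return {name: 0.0 for name in coverage}
    return {name: cov / total for name, cov in coverage.items()}


if __name__ == "__main__":
    # Assumed: 100,000 reads of 150 bp from one 5 Mbp bacterial genome.
    print(expected_depth(100_000, 150, 5_000_000))  # 3.0

    # Hypothetical mapped-read counts for one sample against a combined assembly.
    counts = {"contig_1": 100, "contig_2": 300}
    lengths = {"contig_1": 1000, "contig_2": 3000}
    print(relative_abundance(counts, lengths, read_length=100))
```

Under these assumptions a single genome would get about 3x coverage, and less if the reads are spread across several species, which is why pooling reads from all samples before assembly can help.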

For human removal, the raw human genome is fine in your case (bacteria). The masked version is mainly to allow decontamination of eukaryotes, which have shared sequence with human; bacteria basically don't.

**gringer** · 12-11-2016, 10:27 AM

You can create rarefaction curves to see if what you have is likely sufficient to describe the metagenomic profile.

The basic process is to subsample the reads to progressively smaller fractions and check whether the estimated species diversity stays similar. A low-complexity sample will plateau at low coverage, while the diversity of a high-complexity sample will keep increasing substantially with more reads.
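A minimal sketch of that subsampling procedure, assuming each read has already been assigned a taxon label by some classifier (the labels, depths, and function name below are made up for the example):

```python
import random


def rarefaction_curve(labels, depths, reps=10, seed=0):
    """Mean number of distinct taxa observed at each subsampling depth.

    labels: one taxon label per classified read
    depths: subsample sizes to evaluate (capped at len(labels))
    reps:   random subsamples drawn per depth
    """
    rng = random.Random(seed)  # fixed seed so the curve is reproducible
    curve = []
    for depth in depths:
        depth = min(depth, len(labels))
        richness = [
            len(set(rng.sample(labels, depth)))  # sample without replacement
            for _ in range(reps)
        ]
        curve.append((depth, sum(richness) / reps))
    return curve


if __name__ == "__main__":
    # Toy sample: two abundant taxa and a handful of rare ones.
    labels = ["E. coli"] * 500 + ["B. subtilis"] * 400 + [f"rare_{i}" for i in range(100)]
    for depth, mean_taxa in rarefaction_curve(labels, [10, 50, 100, 500, 1000]):
        print(depth, round(mean_taxa, 1))
```

If the mean number of taxa has flattened out well before the full read count, the sample is probably deep enough to describe its diversity; if it is still climbing at the last point, more reads would likely reveal more taxa.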

**dhtaft** · 12-12-2016, 04:30 PM

I had some luck using IMSA in a situation similar to the one you describe, but only after human read removal.

Minimum amount of data needed for reliable results?
