
**kmcarr** · 09-26-2014, 01:45 PM

Originally posted by MercuryMan

Hello everyone! I realize this question has been asked and answered, but even after reading quite a bit I can't decide what I have or don't have...dang!

My data is in fastq format and was downloaded from BaseSpace for use in third-party analysis. I have run "make.contigs" in Mothur 1.31.2 on my first sample, using both the R1 and R2 files, which I assume are my forward and reverse paired seqs. It ran fine with no errors, but as I read the sequences it really looks like they must have barcodes and/or primers still attached. I keep seeing that Illumina fastq files which have been de-multiplexed should already be trimmed of the barcodes and primers. Is this correct, or do I need to somehow come up with an oligos file?

Below is a sample of the first few reads. Thank you very much for any help you can provide. This is my first run through with Illumina data and my mentor is no help as he is apparently swamped right now.

Code:

>M02146_10_000000000-A51MH_1_1101_13422_1525
TACGGAGGGAGCTAGCGTTGTTCGGAATTACTGGGCGTAAAGCGTACGTAGGCGGCTATTCAAGTCAGAGGTGAAAGCCCGGGGCTCAACCCCGGAACTGCCTTTGAAACTAGGTAGCTAGAGTCTTGGAGAGGTTAGTGGAATTCCGAGTGTAGAGGTGAAATTCGTAGATATTCGGAAGAACACCAGTGGCGAAGGCGACTAACTGGACAAGTACTGACGCTGAGGTACGAAAGCGTGGGGAGCAAACAGG
>M02146_10_000000000-A51MH_1_1101_13396_1529
TACGGAGGGAGCTAGCGTTGTTCGGAATTACTGGGCGTAAAGCGTACGTAGGCGGCTATTCAAGTCAGAGGTGAAAGCCCGGGGCTCAACCCCGGAACTGCCTTTGAAACTAGGTAGCTAGAGTCTTGGAGAGGTTAGTGGAATTCCGAGTGTAGAGGTGAAATTCGTAGATATTCGGAAGAACACCAGTGGCGAAGGCGACTAACTGGACAAGTACTGACGCTGAGGTACGAAAGCGTGGGGAGCAAACAGG
>M02146_10_000000000-A51MH_1_1101_13412_1540
TACGGAGGGAGCTAGCGTTGTTCGGAATTACTGGGCGTAAAGCGTACGTAGGCGGCTATTCAAGTCAGAGGTGAAAGCCCGGGGCTCAACCCCGGAACTGCCTTTGAAACTAGGTAGCTAGAGTCTTGGAGAGGTTAGTGGAATTCCGAGTGTAGAGGTGAAATTCGTAGATATTCGGAAGAACACCAGTGGCGAAGGCGACTAACTGGACAAGTACTGACGCTGAGGTACGAAAGCGTGGGGAGCAAACAGG

MM,

Why do you believe that your reads still have barcodes? Because you see sequences in common in all your reads? The fact that you are processing this data set with Mothur tells me that this is a 16S data set. BLASTing these three reads confirms that they are (nearly) perfect matches to the V4 region of the 16S rRNA gene, no Illumina barcodes or adapters in sight.

If you're worried because they all look identical to each other, don't be; that is the expected outcome of an amplicon sequencing experiment.
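If you would rather check for leftover primers directly than eyeball the reads, here is a minimal Python sketch. It assumes the forward primer was the commonly published V4 primer 515F (`GTGCCAGCMGCCGCGGTAA`); that is an assumption on my part, so substitute whatever primer your sequencing facility actually used. The function names are just illustrative, not part of Mothur or any other package.

```python
import re

# IUPAC nucleotide codes expanded to regex character classes,
# so degenerate primer positions (e.g. M = A or C) are handled.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

# Assumed forward primer (515F); replace with your own.
PRIMER_515F = "GTGCCAGCMGCCGCGGTAA"


def primer_regex(primer):
    """Compile a degenerate primer into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, seq = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(seq)
            name, seq = line[1:], []
        else:
            seq.append(line.upper())
    if name is not None:
        yield name, "".join(seq)


def count_primer_starts(records, primer):
    """Return (reads starting with the primer, total reads)."""
    pattern = primer_regex(primer)
    hits = total = 0
    for _, seq in records:
        total += 1
        if pattern.match(seq):  # match() anchors at the read start
            hits += 1
    return hits, total


if __name__ == "__main__":
    # First read from the post (truncated), plus a made-up read that
    # has the assumed primer prepended, for comparison.
    demo = [
        ">M02146_10_000000000-A51MH_1_1101_13422_1525",
        "TACGGAGGGAGCTAGCGTTGTTCGGAATTACTGGGCGTAAAGCG",
        ">hypothetical_read_with_primer",
        "GTGCCAGCAGCCGCGGTAATACGGAGGGAGCTAGCGTTGTTCGG",
    ]
    print(count_primer_starts(read_fasta(demo), PRIMER_515F))
```

To run it on a real file, pass an open file handle to `read_fasta` in place of the `demo` list. If essentially no reads start with the primer, there is nothing to trim and no oligos file is needed for that purpose.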

**MercuryMan** · 09-28-2014, 01:00 PM

Thanks kmcarr! You confirmed this for me. I had BLASTed as well and was comforted by the V4 report, but the MiSeq administrator on my campus is notoriously vague and hard to get an answer from. I think he intentionally sabotaged some 454 data he gave me to determine whether or not I could sort out the problem on my own (which I did, thank god!).

Just two days ago my major professor decided (since he's been using it to assess my data as well, without telling me) that I should use USEARCH to analyse my data. This would put me back at square one after I've spent the last 2 months getting familiar with Mothur.

So my question is...what software would you use to analyse 16S microbial metagenomic data? I have 16 separate samples from 8 sites (2 extractions per site), and I hope to do a full analysis including alpha and beta diversity, differential abundance and probably a few other statistical tests.

I welcome any comments/advice from anyone who has used more than one pipeline and conducted such an analysis.

Barcodes and Primers