02-07-2017, 07:48 AM   #1
Location: Montpellier (France)

Join Date: May 2008
Posts: 93
How to interpret partial Stacks results?

I'm posting here, but I'm not sure this is the correct section of the forum. If it isn't, I apologize in advance.

We are performing RADseq experiments.
The DNA comes from different organisms and is extracted (by the users of our facility) with various methods and, of course, with varying results. Most of the time, the integrity of the DNA is not checked and its purity is "so-so" (pigments or precipitates are not rare).

We build the libraries according to the Baird et al. protocol with minor modifications (AMPure XP purification, Qubit quantification after the first ligation, Pippin HT sizing).

We sequence in SR100 mode (single read, 100 nt; usually a rapid run on a HiSeq 2500).
We don't perform the analysis in-house; we only use Stacks for demultiplexing.

Usually, 75 to 89% of the sequences contain both the index and the enzyme cutting site, which seems OK for us and our users.

But from time to time, the results are not that good.

For example, in one of our experiments, we generated 171,000,000 sequences and, after running Stacks' process_radtags module (no mismatch allowed), we ended up with:
- 58.71% of correct sequences.
- 38% of "ambiguous barcode" sequences.
- 3% of "ambiguous RADtag" sequences.
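
To put these percentages in terms of read counts, here is a minimal arithmetic sketch. It only uses the figures quoted above; the variable and function names are illustrative, and the percentages are the rounded values as reported, so they do not sum to exactly 100%.

```python
# Total number of sequences reported for this run.
total_reads = 171_000_000

# Percentages reported by process_radtags with no mismatch allowed
# (rounded values as quoted in the post).
percentages = {
    "retained": 58.71,
    "ambiguous barcode": 38.0,
    "ambiguous RADtag": 3.0,
}


def to_counts(total, pct_by_label):
    """Convert percentages of `total` into approximate read counts."""
    return {label: round(total * pct / 100) for label, pct in pct_by_label.items()}


counts = to_counts(total_reads, percentages)
for label, n in counts.items():
    print(f"{label:>18}: {n:>12,d} reads")
```

With these numbers, the "ambiguous barcode" category alone corresponds to roughly 65 million reads.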

We also performed demultiplexing allowing 1 mismatch, and it did not improve the results much (63% of correct sequences).
This was more or less expected, as we found only a few over-represented indexes among our "ambiguous barcode" sequences, and most of them can be explained by a small drop in sequence quality at cycle 3.
But that leaves us with 30,000,000 sequences with an ambiguous barcode that we cannot explain.
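
The search for over-represented indexes described above can be sketched roughly as follows. This is not our actual script, only an illustration: it assumes the rejected reads are available as FASTQ text, that the barcode occupies the first `barcode_len` bases of each read, and it uses made-up toy reads and a hypothetical two-barcode list instead of real data.

```python
from collections import Counter


def read_sequences(fastq_lines):
    """Yield the sequence line of each 4-line FASTQ record."""
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:
            yield line.strip()


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def tally_barcodes(sequences, barcode_len):
    """Count how often each read prefix of length `barcode_len` occurs."""
    return Counter(s[:barcode_len] for s in sequences if len(s) >= barcode_len)


def nearest_expected(observed, expected):
    """Return (closest expected barcode, Hamming distance to it)."""
    return min(((bc, hamming(observed, bc)) for bc in expected), key=lambda t: t[1])


# Hypothetical barcode list and toy FASTQ records (NOT real data).
expected = ["ACGTA", "TGCAT"]
toy_fastq = [
    "@r1", "ACGTATGCAGGAAA", "+", "IIIIIIIIIIIIII",
    "@r2", "ACCTATGCAGGAAA", "+", "IIIIIIIIIIIIII",
    "@r3", "ACCTATGCAGGTTT", "+", "IIIIIIIIIIIIII",
    "@r4", "GGGGGTGCAGGAAA", "+", "IIIIIIIIIIIIII",
]

tally = tally_barcodes(read_sequences(toy_fastq), barcode_len=5)
for prefix, n in tally.most_common():
    closest, dist = nearest_expected(prefix, expected)
    print(f"{prefix}\t{n}\tclosest={closest}\tmismatches={dist}")
```

Prefixes that sit one mismatch away from an expected barcode point to sequencing errors, whereas frequent prefixes far from every expected barcode are the ones that remain unexplained.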

Does anyone know what causes:
- High percentage (15 to 40%) of "ambiguous barcode"?
- High percentage (10 to 30%) of "ambiguous RADtag"?

Is it due to DNA quality/integrity? To an issue with our adaptors?
Is it possible to diagnose problems in the DNA generation/library construction steps just by looking at these partial Stacks results?
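
One check that might help with such a diagnosis is to look at where the cut-site remnant actually sits relative to the end of the barcode in the rejected reads. The sketch below is only an assumption about how this could be done: the function name, the barcode length, the `TGCAGG` remnant and the toy reads are all placeholders and do not come from our data or from Stacks itself.

```python
from collections import Counter


def cut_site_offset(seq, barcode_len, site, window=2):
    """Return the shift (in bases) of `site` relative to the expected
    position right after the barcode, or None if it is not found
    within +/- `window` bases. Shifts closest to 0 are tried first."""
    for shift in sorted(range(-window, window + 1), key=abs):
        start = barcode_len + shift
        if start >= 0 and seq[start:start + len(site)] == site:
            return shift
    return None


# Placeholder values and toy reads (NOT real data).
barcode_len = 5
site = "TGCAGG"  # hypothetical cut-site remnant
toy_reads = [
    "ACGTATGCAGGAAA",  # remnant at the expected position
    "ACGTTGCAGGAAAA",  # remnant one base early (barcode one base short)
    "ACGTAATGCAGGAA",  # remnant one base late (extra base inserted)
    "ACGTACCCCCCCCC",  # no remnant nearby
]

offsets = Counter(cut_site_offset(r, barcode_len, site) for r in toy_reads)
for shift, n in offsets.items():
    print(f"offset={shift}\treads={n}")
```

If many rejected reads showed a consistent non-zero offset, that would point towards the adaptors rather than the DNA; if the remnant were simply absent, other explanations would need to be considered.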

Thanks in advance for your answer.

huguesparri

All times are GMT -8.

Powered by vBulletin® Version 3.8.9
Copyright ©2000 - 2021, vBulletin Solutions, Inc.
Single Sign On provided by vBSSO