STACKS and PE Illumina Data (ddRAD)

Carcharodon


07-13-2017, 05:09 PM

I have a dataset from a ddRAD library (Illumina HiSeq4K, 150PE). I'm playing around with STACKS, and noticed that it has some limitations when dealing with paired-end data from ddRAD datasets.

Namely, it treats the two reads of a pair as separate, independent loci.

This seems like a problem from a population genomic point of view, since there is a base assumption (admittedly often violated) of independence between loci. With paired data (single-end ("SE") reads and their paired-end ("PE") mates), both reads come from the same fragment, so the two resulting loci are physically linked and very unlikely to be independent.

What I have done so far is demultiplex each individual's data based on the inline barcode sequence(s) and apply a rough quality filter (sliding window = 15% of read length, min quality score = 10). So I'm left with four files for each individual:

One file of SE reads and one file of PE reads, in phase (both reads from the same fragment/cluster were kept, and the two files list them in the same order).

One file of SE reads, whose PE counterpart has been discarded.

One file of PE reads, whose SE counterpart has been discarded.
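
To make the four-file outcome concrete, here is a minimal Python sketch of how a pair might get sorted. It assumes the filter rejects a read when the mean quality in any sliding window falls below the threshold; that is my reading of the settings above, not a description of the demultiplexer's actual code. The function names and category labels are made up for illustration.

```python
def passes_sliding_window(quals, window_frac=0.15, min_mean=10):
    """Return True if no window's mean Phred score drops below min_mean.

    quals is a list of per-base Phred scores; the window length is a
    fraction of the read length (assumed behaviour, see note above).
    """
    win = max(1, int(len(quals) * window_frac))
    for start in range(0, len(quals) - win + 1):
        window = quals[start:start + win]
        if sum(window) / win < min_mean:
            return False
    return True


def sort_pair(se_quals, pe_quals):
    """Decide which output category a read pair falls into."""
    se_ok = passes_sliding_window(se_quals)
    pe_ok = passes_sliding_window(pe_quals)
    if se_ok and pe_ok:
        return "both_kept"       # goes to the two in-phase files
    if se_ok:
        return "se_remainder"    # SE read kept, PE mate discarded
    if pe_ok:
        return "pe_remainder"    # PE read kept, SE mate discarded
    return "both_discarded"
```

The two "remainder" categories are the ones where the link between mates is no longer visible from file order alone.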

Only one read of each pair, SE or PE, needs to carry a SNP; the other can be discarded. Can STACKS keep track of this? Can it go through assembly and SNP calling while keeping track of the read headers, and use that information to work out which sequences belong to the same pair?

Or is that information lost? (I suspect that for the remainders - the latter two files described - it would be quite difficult to recover that information...)
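
For what it's worth, the kind of bookkeeping I have in mind could be sketched like this in Python. It assumes I can somehow get a table of read header to locus ID for the SE loci and another for the PE loci; I don't know whether STACKS exposes that, so both input dicts are hypothetical, as are the function names. It also assumes the two mates share a header apart from a trailing /1 or /2 or a space-separated comment field.

```python
from collections import Counter


def fragment_id(header):
    """Reduce a FASTQ header to a key shared by both mates of a fragment."""
    name = header.lstrip("@").split()[0]   # drop '@' and any comment field
    if name.endswith("/1") or name.endswith("/2"):
        name = name[:-2]                   # drop old-style mate suffix
    return name


def link_loci(se_assignments, pe_assignments):
    """Count fragments supporting each (SE locus, PE locus) pairing.

    Both arguments map read header -> locus ID (hypothetical inputs).
    """
    pe_by_fragment = {fragment_id(h): locus
                      for h, locus in pe_assignments.items()}
    links = Counter()
    for header, se_locus in se_assignments.items():
        pe_locus = pe_by_fragment.get(fragment_id(header))
        if pe_locus is not None:
            links[(se_locus, pe_locus)] += 1
    return links
```

Locus pairs with many supporting fragments could then be treated as one linked unit, keeping only one SNP-bearing read from each.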

Alternatively, is it worth throwing caution to the wind, analysing the SE and PE reads as if they were independent, and pooling all of those data at the end? Or is there another approach here?

Many thanks,
Sean
Tags: paired end, paired reads, stacks
