Best strategy/tools to align poor quality reads on distant/degenerate short reference

denisDDS

Junior Member

Join Date: Feb 2013

Posts: 6

04-23-2018, 02:28 AM

Hello everybody,

Here is my problem:
I have about 50 samples that I sequenced to examine SNPs at key positions. I know where my key positions are.
I have 4 reference sequences (about 2000 bp each). Two of them are encoded with the IUPAC nucleotide code and two with the plain "ATGC" code. These references could be relatively distant from the reads I obtained.
I have paired-end reads from large (1000 bp) amplicons. As the amplicon is far longer than my reads, I will work on my read files separately.
My read quality is really poor: base quality drops rapidly below 20 at around 90 bp (on 250 bp reads).
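
Because two of the references use IUPAC ambiguity codes, a read base can be treated as matching the reference when it belongs to the set of bases the reference symbol stands for. The Python sketch below illustrates that idea only; the names `IUPAC`, `is_compatible` and `count_mismatches` are hypothetical helpers, not part of any aligner, and how a given aligner actually treats ambiguity codes should be checked in its own documentation.

```python
# Standard IUPAC nucleotide codes mapped to the bases they can represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def is_compatible(ref_symbol, read_base):
    """Return True if read_base is one of the bases encoded by ref_symbol."""
    return read_base.upper() in IUPAC.get(ref_symbol.upper(), "")


def count_mismatches(ref, read):
    """Count positions where the read base is not allowed by the reference.

    Assumes ref and read are already aligned without gaps (illustration only).
    """
    return sum(not is_compatible(r, b) for r, b in zip(ref, read))
```

With this definition, a read carrying G at a position where the reference says R (A or G) is not counted as a mismatch, whereas a strict character comparison would count it.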

What I did:
I started by aligning my reads directly to my references with bwa mem, adjusting the seed and mismatch score parameters...
Then I wanted to perform SNP calling with freebayes. Unfortunately, I don't think this is a good idea (I don't know the ploidy of my samples).
Then I discovered that my reads had really poor quality, so I decided to add a first read-cleaning step with trimmomatic (SE -threads 8 -phred33 -trimlog trim_${readsname}.log ${file} ${OUTDATADIRECTORY}/${readsname}_001.trimmed.fastq.gz ILLUMINACLIP:./adapters/NexteraPE-PE.fa:2:30:10 LEADING:10 TRAILING:10 SLIDINGWINDOW:4:15 AVGQUAL:30 MINLEN:36).
After talking with some friends, and seeing the read length drop after trimming, I decided to use bwa aln and perform the SNP calling with samtools/bcftools. But the alignment parameters are harder to set correctly.
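
To make the effect of the SLIDINGWINDOW:4:15 setting more concrete, here is a simplified Python sketch of sliding-window quality trimming on Phred+33 qualities. It is an illustration in the spirit of that option, not a reimplementation of trimmomatic, and the function names are hypothetical.

```python
def phred33_scores(quality_string):
    """Decode a Phred+33 FASTQ quality string into integer scores."""
    return [ord(ch) - 33 for ch in quality_string]


def sliding_window_trim(seq, quality_string, window=4, min_mean=15):
    """Cut the read at the start of the first window whose mean quality
    falls below min_mean; reads shorter than the window are left unchanged.
    """
    quals = phred33_scores(quality_string)
    for start in range(len(quals) - window + 1):
        win = quals[start:start + window]
        if sum(win) / window < min_mean:
            return seq[:start], quality_string[:start]
    return seq, quality_string


# Example call: six high-quality bases followed by four very poor ones.
trimmed_seq, trimmed_qual = sliding_window_trim("ACGTACGTAC", "IIIIII####")
```

On reads whose quality drops below 20 at around 90 bp, a filter of this kind would be expected to shorten most 250 bp reads considerably, which is consistent with the read length dropping after cleaning.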

Then, my questions:
- Are my pipeline steps good? Must I clean my reads before aligning them, or will the aligner take base quality into account, making this step useless?
- Which programs should I use for these different steps? I saw that some people use Mosaik, ssaha, bfast, novoalign, etc. Which one is best for my particular problem?
- Which SNP calling method/program should I use? As I am focusing on specific known positions, is a Bayesian haplotype-based caller useful, or is mpileup enough?
- Do you have any advice for me? (This is my first experiment with this kind of data; before this I worked on high-quality human data.)
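
As an illustration of what looking only at known positions could mean when the ploidy is unknown, the Python sketch below tallies the bases observed at each key position and reports their fractions, ignoring low-quality bases. It is a toy pileup under strong assumptions (gapless alignments given as 0-based start, read sequence and per-base qualities); the names `tally_key_positions` and `allele_fractions` are hypothetical, and it is not a substitute for mpileup or freebayes.

```python
from collections import Counter


def tally_key_positions(alignments, key_positions, min_base_quality=20):
    """Count bases seen at each key reference position.

    alignments: iterable of (ref_start, read_seq, base_qualities), where
    ref_start is 0-based and the read is assumed to align without indels.
    """
    counts = {pos: Counter() for pos in key_positions}
    for ref_start, read_seq, quals in alignments:
        for pos in key_positions:
            offset = pos - ref_start
            # Skip reads that do not cover the position or cover it with a poor base.
            if 0 <= offset < len(read_seq) and quals[offset] >= min_base_quality:
                counts[pos][read_seq[offset]] += 1
    return counts


def allele_fractions(counter):
    """Convert base counts at one position into fractions (empty if no coverage)."""
    total = sum(counter.values())
    return {base: n / total for base, n in counter.items()} if total else {}
```

Observed fractions at a position do not require a ploidy assumption, which is the main reason to look at them before choosing a caller.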

As a famous person once said: "Help me, Obi-Wan Kenobi. You're my only hope."

Thx in advance.

Last edited by denisDDS; 04-23-2018, 03:31 AM.