**jjohnson** · 01-27-2012, 10:22 AM

I would bin the valid pairs and the singletons (reads whose mates were removed by quality trimming/filtering) into two separate FASTQ files. Velvet accepts multiple input files, so you can then set parameters per file (such as specifying the insert size for the paired file).

e.g.

velveth Assem 35 -shortPaired -fasta pe_lib1.fasta -short3 se_lib1.fa
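
The binning described above can be sketched in Python. This is a minimal sketch, not an existing tool: `split_pairs`, `pair_key`, and the output file names are hypothetical, and it assumes 4-line FASTQ records, mates that share a read name apart from a trailing `/1` or `/2`, and a reverse file small enough to hold in memory. Pairs are written interleaved, the layout Velvet reads with `-shortPaired`.

```python
import re


def read_fastq(handle):
    """Yield (header, seq, plus, qual) tuples from a 4-line-per-record FASTQ handle."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break
        seq = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        yield header, seq, plus, qual


def pair_key(header):
    """Read name without '@', anything after whitespace, or a trailing /1 or /2."""
    name = header[1:].split()[0]
    return re.sub(r"/[12]$", "", name)


def split_pairs(fwd_path, rev_path, out_prefix):
    """Write interleaved pairs and leftover singletons; return (pairs, singletons)."""
    # Assumption: the reverse reads fit in memory, keyed by read name.
    with open(rev_path) as handle:
        reverse = {pair_key(rec[0]): rec for rec in read_fastq(handle)}

    n_pairs = n_single = 0
    with open(fwd_path) as fwd, \
            open(out_prefix + "_paired.fastq", "w") as paired, \
            open(out_prefix + "_single.fastq", "w") as single:
        for rec in read_fastq(fwd):
            mate = reverse.pop(pair_key(rec[0]), None)
            if mate is None:
                # Forward read whose mate was filtered out.
                single.write("\n".join(rec) + "\n")
                n_single += 1
            else:
                # Forward read immediately followed by its mate.
                paired.write("\n".join(rec) + "\n" + "\n".join(mate) + "\n")
                n_pairs += 1
        # Whatever is left in the dictionary never found a forward mate.
        for rec in reverse.values():
            single.write("\n".join(rec) + "\n")
            n_single += 1
    return n_pairs, n_single
```

Calling `split_pairs("lib1_R1.fastq", "lib1_R2.fastq", "lib1")` (file names are placeholders) would produce `lib1_paired.fastq` and `lib1_single.fastq`, which can then be passed to velveth as the paired and unpaired inputs.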

**nposnien** · 03-05-2012, 01:29 PM

Hi,
we are facing the same problem at the moment. After trimming/filtering we will have uneven files (one for each mate of the pair).

My question is whether there is a script or program that finds the mates across two different files (or within one file, if I merge/shuffle the files prior to trimming/filtering) and bins the unpaired reads into an extra file.

Any help is highly appreciated!

**rahularjun86** · 03-05-2012, 01:42 PM

Dear nposnien,
you can use the Sickle tool (https://github.com/najoshi/sickle). You only need to supply the paired FASTQ files and a few other parameters (the quality scoring system used, the quality score to keep, the length cutoff, etc.), and it will generate the paired and singleton files.
If you want to filter out reads with N's, just replace the whole sequence with N's and the quality string with #'s, then set Sickle's length and quality values. This way the reads containing N's are filtered out.
Best wishes,
Rahul

**LizBent** · 03-06-2012, 12:21 AM

This script may be useful for interleaving pairs for Velvet (and generating non-paired singleton files):

https://github.com/lexnederbragt/denovo-assembly-tutorial/blob/master/scripts/interleave_pairs.py


**nposnien** · 03-06-2012, 05:13 AM

First of all, thanks for the answers!

@ LizBent: Can I use the script for data that has been processed using CASAVA 1.8? In the discussion you added a link to, it is proposed to replace

f_suffix = "/1"
r_suffix = "/2"

with

f_suffix = ""
r_suffix = ""

My question is: How are the pairs identified then?

**LizBent** · 03-06-2012, 05:25 AM

No idea, you might want to ask the original script writer, who is cited in the comments at the top of the script (and there is also a reference to another SeqAnswers thread there that might answer your question). Sorry I can't help.
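
For what it's worth, one likely explanation rests on the header format rather than on anything confirmed from the script itself: in CASAVA 1.8 output both mates carry the same identifier before the first space, and the read number sits in the comment field after it (`1:N:0:...` versus `2:N:0:...`) instead of in a `/1` or `/2` suffix. With empty suffixes, comparing the identifiers up to the first whitespace would then be enough to match mates. A small sketch of that idea, where `mate_info` is a hypothetical helper and not part of interleave_pairs.py:

```python
def mate_info(header):
    """Return (read_id, read_number) for old-style and CASAVA 1.8 FASTQ headers.

    read_number is None when the header carries no recognisable mate marker.
    """
    fields = header.lstrip("@").split()
    name = fields[0]
    # Older Illumina style: the mate number is a /1 or /2 suffix on the name.
    if name.endswith(("/1", "/2")):
        return name[:-2], int(name[-1])
    # CASAVA 1.8 style: the mate number leads the comment after the space.
    if len(fields) > 1 and fields[1][:2] in ("1:", "2:"):
        return name, int(fields[1][0])
    return name, None


# Both mates of a CASAVA 1.8 pair reduce to the same read_id.
fwd = mate_info("@EAS139:136:FC706VJ:2:2104:15343:197393 1:N:0:ATCACG")
rev = mate_info("@EAS139:136:FC706VJ:2:2104:15343:197393 2:N:0:ATCACG")
print(fwd[0] == rev[0], fwd[1], rev[1])
```

If the trimming tool drops the comment field, the two mates end up with identical headers, so it is worth checking a few records by eye before relying on this.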

Velvet paired end after some sequences removed?
