SEQanswers

Old 11-08-2019, 02:53 AM   #1
gspirito
Junior Member
 
Location: Italy

Join Date: Nov 2019
Posts: 2
Extract reads from paired-end fastq based on specific adapters with bbduk

Hello everyone, I am using bbduk.sh (from the bbmap toolkit) to extract reads from paired-end fastq files based on the presence of specific adapters at the 5' end of the sequence in the "_1" fastq file.

I am using this command:

Code:
./bbmap/bbduk.sh -Xmx1g in1=reads_1.fastq.gz in2=reads_2.fastq.gz outm1=matched1.fastq.gz outm2=matched2.fastq.gz literal=AAACCTGAGAAACCTA k=16 hdist=0 -rcomp=f
The problem is that, besides the correct reads, the output files also contain reads that do not include the adapter sequence, e.g.:

# from reads_1.fastq.gz
@SRR9262917.232075 232075/1
GCATGCGAGTAGCGGTGGTTCTTATA
+
FFFFFFFFFFFFFFFFFFFFFFFFFF

# from reads_2.fastq.gz
@SRR9262917.232075 232075/2
AAGCAGTGGTATCAACGCAGAGTACATGGGATTCCATAGCCCTGTGGTTTTTATAGATCTTGTAAACCCCAAACCTGGGAAACCTAGTGGC
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFF,FFFFFFFFF

Does anyone know why this may be happening and how to avoid this?

Thanks in advance.
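
A quick way to inspect the two example reads is to scan them for the literal directly. The sketch below is plain Python and does not reproduce BBDuk's k-mer matching; `find_hits` is a hypothetical helper written for this illustration. It only reports where the literal occurs in each read, exactly or within a given number of mismatches, on the forward strand.

```python
LITERAL = "AAACCTGAGAAACCTA"  # adapter passed as literal= in the command above

# The two example reads quoted in the post.
READ_1 = "GCATGCGAGTAGCGGTGGTTCTTATA"
READ_2 = "AAGCAGTGGTATCAACGCAGAGTACATGGGATTCCATAGCCCTGTGGTTTTTATAGATCTTGTAAACCCCAAACCTGGGAAACCTAGTGGC"


def find_hits(read, query, max_mismatches=0):
    """Return (offset, mismatches) for every window of `read` that is
    within `max_mismatches` substitutions of `query` (forward strand only)."""
    hits = []
    for start in range(len(read) - len(query) + 1):
        window = read[start:start + len(query)]
        mismatches = sum(a != b for a, b in zip(window, query))
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits


for name, read in (("read 1", READ_1), ("read 2", READ_2)):
    print(name, "exact hits:", find_hits(read, LITERAL))
    print(name, "hits with <=1 mismatch:", find_hits(read, LITERAL, max_mismatches=1))
```

Neither read contains the literal exactly, but read 2 carries a one-base variant (AAACCTGGGAAACCTA) near its 3' end. If that variant is what BBDuk matched, it would be worth checking the `maskmiddle` setting, which as far as I know treats the middle base of each k-mer as a wildcard; that is an assumption about this run, not something confirmed in the thread.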
Old 11-08-2019, 06:32 AM   #2
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,992

You could add "restrictleft=N" (where N is a number of bases) so that only the leftmost N bases of each read are searched. Adding "minlength=N" will also exclude short reads like the first example. Also try setting k to something smaller (e.g. 8) so it has a better chance of matching correctly.

I hope "-rcomp=f" is a typo. There should be no "-" at the beginning.
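
The idea behind the "restrictleft" suggestion above can be sketched in a few lines of Python. This only illustrates restricting a search to the leftmost bases of a read; it is not BBDuk's implementation, `has_adapter_in_left` is a hypothetical helper, and the two reads are synthetic, made up for the example.

```python
ADAPTER = "AAACCTGAGAAACCTA"  # adapter from the original command


def has_adapter_in_left(read, adapter, left_bases):
    """True if `adapter` occurs entirely within the first `left_bases` bases."""
    return adapter in read[:left_bases]


# Synthetic reads (not from the thread's data).
starts_with_adapter = ADAPTER + "GCATGCGAGT"
adapter_downstream = "GCATGCGAGT" + ADAPTER

for label, read in (("adapter at 5' end", starts_with_adapter),
                    ("adapter downstream", adapter_downstream)):
    # Searching only the first len(ADAPTER) bases requires the read to start with it.
    print(label, has_adapter_in_left(read, ADAPTER, left_bases=len(ADAPTER)))
```

With the window set to the adapter length, only reads that begin with the adapter pass; widening the window tolerates a small offset at the cost of admitting matches further into the read.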

Last edited by GenoMax; 11-08-2019 at 06:44 AM.
Old 11-12-2019, 07:37 AM   #3
gspirito
Junior Member
 
Location: Italy

Join Date: Nov 2019
Posts: 2

Thank you! That worked.