11-08-2019, 02:53 AM   #1
gspirito
Junior Member
 
Location: Italy

Join Date: Nov 2019
Posts: 2
Extract reads from paired-end fastq based on specific adapters with bbduk

Hello everyone, I am using bbduk.sh (from the bbmap toolkit) to extract reads from paired-end fastq files based on the presence of specific adapters at the 5' end of the sequence in the "_1" fastq file.

I am using this command:

Code:
./bbmap/bbduk.sh -Xmx1g in1=reads_1.fastq.gz in2=reads_2.fastq.gz outm1=matched1.fastq.gz outm2=matched2.fastq.gz literal=AAACCTGAGAAACCTA k=16 hdist=0 -rcomp=f
The problem is that, in addition to the correct reads, the output files also contain reads that do not include the adapter sequence, e.g.:

# from reads_1.fastq.gz
@SRR9262917.232075 232075/1
GCATGCGAGTAGCGGTGGTTCTTATA
+
FFFFFFFFFFFFFFFFFFFFFFFFFF

# from reads_2.fastq.gz
@SRR9262917.232075 232075/2
AAGCAGTGGTATCAACGCAGAGTACATGGGATTCCATAGCCCTGTGGTTTTTATAGATCTTGTAAACCCCAAACCTGGGAAACCTAGTGGC
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFF,FFFFFFFFF
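
The example pair can be checked independently of bbduk with a short Python sketch. It is only a sanity check under simple assumptions (plain string comparison, no quality filtering) and does not reproduce bbduk's k-mer matching; the helper names `hamming`, `closest_window` and `reverse_complement` are made up for illustration. It tests each read for an exact occurrence of the literal (forward and reverse complement) and reports the closest 16-base window.

```python
# Sanity check on the example pair; plain Python, independent of bbduk.
LITERAL = "AAACCTGAGAAACCTA"  # sequence passed to bbduk via literal=
READ_1 = "GCATGCGAGTAGCGGTGGTTCTTATA"
READ_2 = "AAGCAGTGGTATCAACGCAGAGTACATGGGATTCCATAGCCCTGTGGTTTTTATAGATCTTGTAAACCCCAAACCTGGGAAACCTAGTGGC"


def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def closest_window(read, query):
    """Return (offset, mismatches) of the window of len(query) closest to query.

    Returns None when the read is shorter than the query.
    """
    k = len(query)
    if len(read) < k:
        return None
    candidates = ((i, hamming(read[i:i + k], query))
                  for i in range(len(read) - k + 1))
    return min(candidates, key=lambda pair: pair[1])


def reverse_complement(seq):
    """Reverse-complement a DNA string made of A, C, G, T, N."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


for name, read in (("read 1", READ_1), ("read 2", READ_2)):
    exact_fwd = LITERAL in read
    exact_rev = reverse_complement(LITERAL) in read
    offset, mismatches = closest_window(read, LITERAL)
    print(name, "exact forward:", exact_fwd, "exact revcomp:", exact_rev,
          "closest window at", offset, "with", mismatches, "mismatch(es):",
          read[offset:offset + len(LITERAL)])
```

Neither read contains the literal exactly, but the "_2" read contains AAACCTGGGAAACCTA, which differs from the literal at a single base (position 8 of 16, A versus G). Whether this near-match in the mate is what causes bbduk to report the pair is not confirmed here.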

Does anyone know why this may be happening and how to avoid this?

Thanks in advance.