SEQanswers





Old 10-13-2021, 08:57 AM   #1
reliscu
Junior Member
 
Location: USA

Join Date: May 2021
Posts: 7
BBDuk quality filtering not producing expected result

I'm trying to trim adapters from, and filter low-quality reads out of, paired-end exome-seq data using BBDuk.

I used the command:

```
for ea in $files; do
    R1="$ea"
    # Derive the R2 file name from the R1 file name.
    R2="$(echo "$R1" | sed 's/R1/R2/')"
    # Adapter-trim the pair, then drop reads below average quality 10 or shorter than 60 bp.
    /home/shared/programs/bbmap/bbduk.sh -Xmx1g in1="$R1" in2="$R2" \
        out1="$(echo "$R1" | sed 's/\.fastq\.gz$/_trimmed_filtered.fastq.gz/')" \
        out2="$(echo "$R2" | sed 's/\.fastq\.gz$/_trimmed_filtered.fastq.gz/')" \
        ref=/home/shared/programs/bbmap/resources/adapters.fa \
        t=10 ktrim=r k=23 kmin=11 hdist=1 maq=10 minlen=60 tpe tbo
done
```

After running FastQC on the output, I see that the R2 files still contain some reads with very low quality scores (visible in the "Per sequence quality scores" module), and FastQC reports "NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN" as an overrepresented sequence.
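A quick way to see how many such reads are left in an output file is to count the records whose sequence line contains nothing but N. The sketch below is only a rough check, assuming standard 4-line FASTQ records; the function name `count_all_n_reads` and the file name in the usage comment are made up for illustration.

```shell
# Count FASTQ records whose sequence line consists only of N.
# Assumes 4-line FASTQ records on stdin (no wrapped sequence lines).
count_all_n_reads() {
  awk 'NR % 4 == 2 && /^N+$/ { n++ } END { print n + 0 }'
}

# Hypothetical usage on one gzipped output file:
#   gzip -dc sample_R2_trimmed_filtered.fastq.gz | count_all_n_reads
```

Running it on both the R1 and R2 outputs would show whether the all-N reads occur only in R2.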

This is what these reads look like in the R2 FASTQ file:
```
@HISEQ:525:HMFYNBCXX:1:1101:1380:2167 2:N:0:CAGATC
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
+
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
@HISEQ:525:HMFYNBCXX:1:1101:1276:2219 2:N:0:CAGATC
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
+
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
@HISEQ:525:HMFYNBCXX:1:1101:1238:2328 2:N:0:CAGATC
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
```

Every base in these reads has quality "!" (Phred 0), so shouldn't they have been filtered out by maq=10?
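To check independently of FastQC, the number of reads whose mean quality falls below the threshold can be counted directly. This is a minimal sketch, assuming Phred+33 encoding (where "!" is Q0) and 4-line FASTQ records. It uses a plain arithmetic mean of the Phred scores, which may not match how BBDuk averages quality for `maq`, and the name `count_low_quality_reads` is made up for illustration.

```shell
# Count reads whose arithmetic-mean Phred score is below a threshold.
# Assumes Phred+33 encoding and 4-line FASTQ records on stdin.
# Usage: count_low_quality_reads [min_mean_quality]   (default: 10)
count_low_quality_reads() {
  awk -v min="${1:-10}" '
    # Build a character -> ASCII code lookup table for printable characters.
    BEGIN { for (i = 33; i < 127; i++) ord[sprintf("%c", i)] = i }
    # Every fourth line of a record is the quality string.
    NR % 4 == 0 {
      len = length($0)
      if (len == 0) next
      sum = 0
      for (j = 1; j <= len; j++) sum += ord[substr($0, j, 1)] - 33
      if (sum / len < min) low++
    }
    END { print low + 0 }
  '
}

# Hypothetical usage on one gzipped output file:
#   gzip -dc sample_R2_trimmed_filtered.fastq.gz | count_low_quality_reads 10
```

A non-zero count on the filtered output would confirm that reads below the threshold are passing through, at least by this simple definition of average quality.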


Any help here would be much appreciated.

Last edited by reliscu; 10-13-2021 at 09:15 AM.

Tags
bbduk, exome sequencing, filtering reads
