Hi, you've been so helpful, thanks.
So, the adapter thing was what I suspected, and I think it's sorted. However, the barcodes are an unexpected problem. When I search the trimmed reads for the adapter sequences, even using substrings, I find none. But barcodes are still present in millions of reads (5687733).
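A minimal sketch of the kind of check I mean, assuming plain four-line FASTQ records. The function name and the query sequence are placeholders for illustration, not real adapter or barcode sequences:

```python
import io

def count_reads_containing(fastq_handle, query):
    """Return (hits, total): reads whose sequence contains `query`, and all reads."""
    hits = 0
    total = 0
    for i, line in enumerate(fastq_handle):
        if i % 4 == 1:  # second line of each 4-line FASTQ record is the sequence
            total += 1
            if query in line.strip().upper():
                hits += 1
    return hits, total

# Tiny in-memory example with made-up reads (hypothetical data)
example = io.StringIO(
    "@r1\nACGTTTGACCA\n+\nIIIIIIIIIII\n"
    "@r2\nGGGGCCCCAAA\n+\nIIIIIIIIIII\n"
)
hits, total = count_reads_containing(example, "TTGACC")
print(hits, total)
```

For a real file, the handle would be something like open("forward_paired.fastq") instead of the in-memory example.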
The command line for trimmomatic is a mess, but here's what I used:
java -classpath trimmomatic-0.22.jar org.usadellab.trimmomatic.TrimmomaticPE readsR1.fastq readsR2.fastq forward_paired.fastq forward_unpaired.fastq reverse_paired.fastq reverse_unpaired.fastq CROP:90 MINLEN:90
I didn't use ILLUMINACLIP because I first used IlluQC from NGS QC Toolkit, which was supposed to get rid of the adapters. From what I understand, I need to run trimmomatic again with other settings, but I'm worried about something. Apparently, cropping the last 10 bases is not enough for 5687733 reads. So, if I keep the minimum read length at 90, I'll lose all of these reads once the barcodes are cut off, because they will be shorter than MINLEN:90. Maybe I should use a lower minimum length? It's a bit of a dilemma, because my coverage isn't that great...
(Here's a really basic question: how do I create a fasta file with the adapter and barcode sequences to use with ILLUMINACLIP? Or is there a file I can download with TruSeq adapters and barcodes?)
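As far as I understand, a FASTA file is just plain text with a ">name" header line followed by the sequence on the next line, so I assume something like the sketch below would produce one. The record names, the sequences, and the output file name are placeholders I made up, not real TruSeq adapters or barcodes, and I don't know whether ILLUMINACLIP expects particular header names:

```python
def write_fasta(records, path):
    """Write (name, sequence) pairs to `path` in FASTA format."""
    with open(path, "w") as out:
        for name, seq in records:
            out.write(f">{name}\n{seq.upper()}\n")

# Placeholder entries only; replace with the real adapter/barcode sequences
records = [
    ("adapter_placeholder_1", "ACGTACGTACGT"),
    ("barcode_placeholder_1", "ttgacc"),
]
write_fasta(records, "adapters_and_barcodes.fa")
```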
Thanks
Sandra