I downloaded some raw RNA-seq reads from NCBI using the SRA Toolkit (accession numbers SRR2053159-64). The reads were generated on ABI SOLiD and are in colour space. I downloaded them in base space, in FASTQ format, using
Code:
fastq-dump -B SRR2053159
However, when I look at the data in FastQC, the forward reads are 50 bases long while the reverse reads are 35 bases long. A previous thread here on SEQanswers says it is normal for ABI SOLiD to generate paired-end reads with different lengths for the forward and reverse reads. Because every reverse read is shorter than the MINLEN:36 threshold, all of them fail the filtering criteria I set in Trimmomatic:
Code:
nohup java -jar ~/tools/Trimmomatic-0.39/trimmomatic-0.39.jar PE $1 $2 $1.trim.fil.pair.gz $1.trim.fil.unpair.gz $2.trim.fil.pair.gz $2.trim.fil.unpair.gz LEADING:20 TRAILING:20 AVGQUAL:25 SLIDINGWINDOW:10:30 MINLEN:36 > script.trimmo.PE.sh
What is the correct way to work around this?
For context, I am trying to compare the transcriptomes of different bacteria under a certain stress using raw reads from NCBI. The other bacteria in the set were sequenced on either Illumina or Solexa; only this one was sequenced on ABI SOLiD. Is it a good idea to include the ABI SOLiD data with the others in the analysis?
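For reference, the read-length mismatch described above can be checked outside FastQC. This is only a minimal sketch: it assumes plain four-line FASTQ records (no wrapped sequence lines), and the function names and the file names in the usage comments are hypothetical, not part of the SRA Toolkit or Trimmomatic.

```shell
# Print a table of "read length <TAB> number of reads" for a FASTQ file.
# Assumption: every record is exactly four lines, so the sequence is line 2 of each group.
read_length_table() {
    awk 'NR % 4 == 2 { n[length($0)]++ } END { for (l in n) print l "\t" n[l] }' "$1" | sort -n
}

# Count the reads shorter than a given minimum length (for example the MINLEN value).
count_shorter_than() {
    awk -v min="$1" 'NR % 4 == 2 && length($0) < min { c++ } END { print c + 0 }' "$2"
}

# Hypothetical usage; substitute the files that fastq-dump actually wrote:
# read_length_table forward.fastq
# read_length_table reverse.fastq
# count_shorter_than 36 reverse.fastq
```

If the second function reports every read in the reverse file for a threshold of 36, that matches the behaviour seen with MINLEN:36.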