I've looked at about a hundred RNA-seq and ChIP-seq samples from HiSeq machines (from our lab, GEO, or ENCODE) over the last few years, and within each sample every read has the same length (i.e. they're all 50 bp, or all 100 bp, or whatever). I just looked at some data from a sequencing facility that uses a NextSeq. It's single-end 151 bp RNA-seq, but FastQC shows read lengths ranging from 35 to 151 bp (the vast majority are 151). I asked the facility and they said this is normal: the shorter reads come from a combination of some fragments being too small and adapter trimming. Does this sound right, and is it anything to be concerned about? Are they doing (or not doing) some filtering or read processing that hasn't been applied to most other sequencing data? Can most modern aligners deal with variable read lengths (I'm currently using Bowtie2 and STAR)?
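In case anyone wants to check the length distribution without FastQC, here is a minimal Python sketch of how it can be tallied straight from the FASTQ file. It assumes standard four-line FASTQ records (no wrapped sequence lines). The function names `read_length_histogram` and `summarize` and the file name `sample.fastq.gz` are made up for illustration; they are not part of any tool mentioned above.

```python
import gzip
from collections import Counter


def read_length_histogram(fastq_path):
    """Count reads per sequence length.

    Assumes plain four-line FASTQ records; gzip is detected by the
    ".gz" file extension only.
    """
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    lengths = Counter()
    with opener(fastq_path, "rt") as handle:
        for line_number, line in enumerate(handle):
            # The sequence is the second line of every four-line record.
            if line_number % 4 == 1:
                lengths[len(line.rstrip("\r\n"))] += 1
    return lengths


def summarize(lengths):
    """Return min/max read length and the fraction of reads at the max."""
    total = sum(lengths.values())
    longest = max(lengths)
    return {
        "min": min(lengths),
        "max": longest,
        "total": total,
        "fraction_at_max": lengths[longest] / total,
    }


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python read_lengths.py sample.fastq.gz
    histogram = read_length_histogram(sys.argv[1])
    print(summarize(histogram))
    for length in sorted(histogram):
        print(length, histogram[length], sep="\t")
```

If the minimum comes out at exactly 35 and nearly all reads sit at the maximum, that would be consistent with a fixed minimum-length cutoff applied after trimming, though the script itself cannot tell which step produced it.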
I've found some related posts; they both mention 35 as the minimum length, so I'm guessing that's not a coincidence: