02-24-2016, 04:45 AM   #6
Michael.Ante
Senior Member
 
Location: Vienna

Join Date: Oct 2011
Posts: 121

Hi Alex,

The ERCC spike-ins do not contain any splice junctions, so running TopHat2 against the ERCC reference alone will cause some trouble. Either combine your "host" annotation with the ERCC spike-in annotation, or first run e.g. Bowtie2 against the ERCC sequences and use the unmapped reads for the further analysis.
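
A rough sketch of the second option (Bowtie2 on the ERCCs first, then TopHat2 on whatever did not map), assuming single-end reads. All file and index names here are hypothetical placeholders, and the function only builds the two command lines; it does not run anything. Bowtie2's `--un` option writes the unpaired reads that failed to align; for paired-end data you would need `--un-conc` instead.

```python
import shlex


def ercc_first_commands(reads, ercc_index, host_index, host_gtf,
                        unmapped="non_ercc.fastq", out_dir="tophat_out"):
    """Build (bowtie2_cmd, tophat2_cmd) as argument lists; nothing is executed.

    All paths are hypothetical placeholders chosen for illustration.
    """
    bowtie2_cmd = [
        "bowtie2",
        "-x", ercc_index,   # Bowtie2 index built from the ERCC FASTA
        "-U", reads,        # single-end reads
        "--un", unmapped,   # reads that did not align to any ERCC
        "-S", "ercc.sam",   # alignments against the ERCC sequences
    ]
    tophat2_cmd = [
        "tophat2",
        "-G", host_gtf,     # host annotation only, no ERCC entries needed
        "-o", out_dir,
        host_index,         # Bowtie2 index of the host genome
        unmapped,           # only the reads left over from the ERCC step
    ]
    return bowtie2_cmd, tophat2_cmd


if __name__ == "__main__":
    cmds = ercc_first_commands("reads.fastq", "ercc_idx", "host_idx", "host.gtf")
    for cmd in cmds:
        # Print shell-safe command lines for inspection before running them.
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

The point of the design is simply that the file passed to `--un` in the first step is the read input of the second step, so TopHat2 never sees the spike-in reads.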

Moreover, I'd suggest using the ERCC-Dashboard to get an overview of how the ERCCs behave in your experiment.
IMHO, the ERCC transcripts do not reflect the complexity of the transcriptome. They can be useful for controlling coverage, strandedness, and input/gene-read correlation, but they are not designed to control for differential junction/PAS usage, overlapping genes, SNP detection, etc.
You might have a look at https://www.biostars.org/p/170234/. The 5' ends are not described correctly in the provided annotation files, whereas the polyA sequence is included in the FASTA.
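
To make the input/gene-read correlation check mentioned above a bit more concrete, here is a minimal sketch that correlates known spike-in input amounts with observed read counts on a log2 scale. The spike-in IDs and numbers are made up for illustration, and the pseudocount of 1 is my own assumption to avoid log2(0); this is not what the ERCC-Dashboard does internally.

```python
import math


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def ercc_log_correlation(input_amount, read_count, pseudocount=1.0):
    """Correlate log2(input amount) with log2(read count + pseudocount).

    Only spike-ins present in both dictionaries are used.
    """
    ids = sorted(set(input_amount) & set(read_count))
    xs = [math.log2(input_amount[i]) for i in ids]
    ys = [math.log2(read_count[i] + pseudocount) for i in ids]
    return pearson(xs, ys)


if __name__ == "__main__":
    # Hypothetical spike-in IDs and values, for illustration only.
    amounts = {"spike_A": 1.0, "spike_B": 2.0, "spike_C": 4.0, "spike_D": 8.0}
    counts = {"spike_A": 12, "spike_B": 30, "spike_C": 70, "spike_D": 125}
    print(ercc_log_correlation(amounts, counts))
```

A high correlation here only tells you that the spike-ins themselves scale with input; as said above, it says nothing about junction usage, overlapping genes, and so on.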

tl;dr: The ERCCs were designed for microarrays and control nicely for a limited set of quality parameters. I would not use them for normalising data in a more complex sample space.