Hi, we are new to STAR alignment and have a general question about how one could evaluate the alignment quality.
We have been trying out STAR with the default settings on some published data sets from SRA. These are paired-end reads generated by Smart-seq from FACS-sorted mouse cells, and the goal is simply to look at differential gene expression across cell subtypes. The total percentage of unmapped reads (mismatches + too short + other) seems reasonable, typically ~3%. However, the count matrix we produced with --quantMode does not match the corresponding one on GEO. Specifically, we failed to detect quite a number of transcripts. We understand that there are various STAR parameters we could change (which unfortunately are usually not reported in publications), but we have no idea how to pick settings that make biological and technical sense. More importantly, which number(s) in the result report should we care about when comparing two different STAR settings? Is there any reference you could point us to?
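To make the comparison concrete, this is roughly how we imagine putting two runs side by side. It is only a minimal sketch: it assumes each run's Log.final.out has lines of the form "metric name | value", and the function names (parse_star_log, diff_metrics, compare_runs) are our own, not part of STAR.

```python
from pathlib import Path


def parse_star_log(text):
    """Parse the text of a STAR Log.final.out into {metric name: value string}.

    Assumes each metric line looks like 'name | value'; lines without a '|'
    (section headers such as 'UNIQUE READS:') are skipped.
    """
    metrics = {}
    for line in text.splitlines():
        if "|" not in line:
            continue
        key, _, value = line.partition("|")
        metrics[key.strip()] = value.strip()
    return metrics


def as_number(value):
    """Convert '93.45%' or '1234' to a float; return None for dates and other text."""
    try:
        return float(value.rstrip("%"))
    except ValueError:
        return None


def diff_metrics(a, b):
    """Return {metric: (a_value, b_value, b - a)} for numeric metrics present in both runs."""
    out = {}
    for key in a:
        if key in b:
            x, y = as_number(a[key]), as_number(b[key])
            if x is not None and y is not None:
                out[key] = (x, y, y - x)
    return out


def compare_runs(log_a, log_b):
    """Print every numeric metric that differs between two Log.final.out files."""
    a = parse_star_log(Path(log_a).read_text())
    b = parse_star_log(Path(log_b).read_text())
    for key, (x, y, d) in diff_metrics(a, b).items():
        if d != 0:
            print(f"{key}: {x} -> {y} ({d:+.2f})")
```

Calling compare_runs("runA/Log.final.out", "runB/Log.final.out") (hypothetical paths) would list the metrics that changed between the two settings. What we do not know is which of those differences actually matter.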
Thanks!