Typically you would normalize all of the samples together, and you would fit a single model to the entire dataset rather than subsetting it (by year or treatment). There are certainly occasions when this works poorly (generally when there's a large difference in read counts that lines up with one of the factors), but it's usually the best course.
BTW, I hope you don't plan to implement your own GLM. There are many prewritten tools, such as DESeq2 or edgeR, that have additional features, and there's no point in reinventing the wheel.
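To make the "normalize everything together" point concrete, here is a rough Python sketch of a median-of-ratios size factor calculation, similar in spirit to what DESeq2 does. It is only an illustration and not a substitute for those tools. The function name `size_factors`, the sample names, and the counts are all made up for the example.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: dict mapping sample name -> list of raw counts,
            with genes in the same order for every sample.
    """
    samples = list(counts)
    n_genes = len(counts[samples[0]])

    # Log geometric mean of each gene across the samples passed in.
    # Genes with a zero in any sample are skipped.
    log_geo_mean = {}
    for g in range(n_genes):
        values = [counts[s][g] for s in samples]
        if all(v > 0 for v in values):
            log_geo_mean[g] = sum(math.log(v) for v in values) / len(values)

    # Size factor = median ratio of a sample's counts to the gene-wise reference.
    return {
        s: math.exp(
            median(math.log(counts[s][g]) - ref for g, ref in log_geo_mean.items())
        )
        for s in samples
    }


# Hypothetical toy data: four samples, four genes, differing only in depth.
counts = {
    "ctrl_2019": [80, 160, 40, 320],
    "trt_2019": [160, 320, 80, 640],
    "ctrl_2020": [20, 40, 10, 80],
    "trt_2020": [10, 20, 5, 40],
}

# All samples normalized together: one shared reference.
joint = size_factors(counts)

# Only the 2019 samples: the reference is built from that subset alone.
only_2019 = size_factors({s: c for s, c in counts.items() if s.endswith("2019")})

for s in only_2019:
    print(s, round(joint[s], 3), round(only_2019[s], 3))
```

The same 2019 samples end up with different size factors depending on whether the 2020 samples were included, because the reference changes. Counts normalized within separate subsets are therefore not on a common scale, which is one reason to normalize (and model) everything in one go.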