You could run quality checks, remove duplicates, and recalibrate, but if you are just starting out, go ahead and assume the data is good and see whether you can get a variant report out of a commonly used tool. Once that works, you can look into the other steps and re-run with a new workflow. In order of importance they would be: QC, duplicate marking or removal, then recalibration. Many people here at work say recalibration is not worth the effort. In theory, marking or removing duplicates can matter upstream of a variant caller, and, also in theory, a variant caller could do that work for you, so read the docs for your tools and follow their advice.
Do try to run a QC tool on your FASTQ files early.
The absolute easiest QC on BAM files is "samtools flagstat" (google it).
Study the results and see if you can make sense of them.
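If it helps with making sense of the numbers, here is a rough Python sketch that runs "samtools flagstat" and pulls out the mapped and duplicate fractions. It assumes samtools is on your PATH and that the output uses the usual "passed + failed label" line layout; the function names (run_flagstat, parse_flagstat, summarize) and the file name in the usage comment are made up for illustration, and the exact set of lines varies between samtools versions, so treat it as a starting point rather than a finished tool.

```python
import re
import subprocess

# Each flagstat line is assumed to look like:
#   "950 + 0 mapped (95.00% : N/A)"
LINE_RE = re.compile(r"^(\d+) \+ (\d+) (.+)$")
# Trailing "(QC-passed ...)" or "(xx% : yy)" notes are dropped from the label;
# other parentheticals such as "(mapQ>=5)" are kept so labels stay unique.
TRAILER_RE = re.compile(r" \((?:QC-passed[^)]*|[^)]* : [^)]*)\)$")


def run_flagstat(bam_path):
    """Run `samtools flagstat <bam>` and return its text output."""
    result = subprocess.run(
        ["samtools", "flagstat", bam_path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_flagstat(text):
    """Map each label to a (qc_passed, qc_failed) tuple of read counts."""
    counts = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip anything that does not fit the assumed layout
        passed, failed, rest = match.groups()
        label = TRAILER_RE.sub("", rest)
        counts[label] = (int(passed), int(failed))
    return counts


def summarize(counts):
    """Return mapped and duplicate fractions of QC-passed reads."""
    total = counts["in total"][0]
    if total == 0:
        return {"mapped_fraction": 0.0, "duplicate_fraction": 0.0}
    return {
        "mapped_fraction": counts["mapped"][0] / total,
        "duplicate_fraction": counts["duplicates"][0] / total,
    }

# Usage (hypothetical file name):
#   print(summarize(parse_flagstat(run_flagstat("sample.bam"))))
```

A low mapped fraction or a very high duplicate fraction is the kind of thing worth chasing down before you trust any variant calls.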