I'm new to using featureCounts for RNA-seq analysis. My data are polyA-enriched, stranded, single-end Illumina reads.
My goal is to do differential expression analysis between control and case groups. I plan to use DESeq2 for the DE analysis after featureCounts.
I have a few questions:
1. I'm wondering whether it's best to use the -M --fraction options, the --primary option, or neither. I understand that in ChIP-seq people often keep only uniquely mapped reads, but I'm not sure whether the same applies to RNA-seq, or whether I should keep only primary alignments. My feeling is that it's best to use the --primary option.
2. I've read in many sources that it's normal to observe a high level of duplicated reads in RNA-seq. So is it best not to use the --ignoreDup option?
I'd also appreciate pointers to any other options I should be using.
-M
If specified, multi-mapping reads/fragments will be counted. A multi-mapping read will be counted up to N times if it has N reported mapping locations. The program uses the ‘NH’ tag to find multi-mapping reads.
--fraction
If specified, a fractional count 1/n will be generated for each multi-mapping read, where n is the number of alignments (indicated by ‘NH’ tag) reported for the read. This option must be used together with the ‘-M’ option.
--primary
If specified, only primary alignments will be counted. Primary and secondary alignments are identified using bit 0x100 in the Flag field of SAM/BAM files. All primary alignments in a dataset will be counted no matter they are from multi-mapping reads or not (i.e. ‘-M’ is ignored).
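To make sure I'm reading these descriptions correctly, here is a toy Python sketch of how I understand the counting modes. It is only my interpretation of the quoted text, not featureCounts' actual code: the `Alignment` record, the `count_weight` function, and the three toy alignments are made up for illustration, and the sketch ignores feature overlap and strandedness entirely.

```python
from collections import namedtuple

# Toy alignment record: read name, SAM FLAG, and NH tag
# (NH = number of reported alignments for the read).
Alignment = namedtuple("Alignment", ["read", "flag", "nh"])

SECONDARY = 0x100  # SAM FLAG bit marking a secondary (non-primary) alignment


def count_weight(aln, multi=False, fraction=False, primary=False):
    """Contribution of one alignment under a simplified reading of the
    quoted option descriptions (an assumption, not the real implementation)."""
    if primary:
        # --primary: keep primary alignments only; -M is ignored
        return 0.0 if aln.flag & SECONDARY else 1.0
    if aln.nh > 1 and not multi:
        # default: alignments of multi-mapping reads are not counted
        return 0.0
    if multi and fraction:
        # -M --fraction: each alignment carries a weight of 1/NH
        return 1.0 / aln.nh
    return 1.0


# Hypothetical data: r1 maps uniquely; r2 has two reported alignments,
# one primary (FLAG 0) and one secondary (FLAG 256).
toy = [
    Alignment("r1", 0, 1),
    Alignment("r2", 0, 2),
    Alignment("r2", 256, 2),
]


def total(**opts):
    """Sum the weights of all toy alignments for a given option set."""
    return sum(count_weight(a, **opts) for a in toy)


print("default      :", total())
print("-M           :", total(multi=True))
print("-M --fraction:", total(multi=True, fraction=True))
print("--primary    :", total(primary=True))
```

If this reading is right, the difference between the modes comes down to how the two alignments of the multi-mapping read r2 are treated: dropped by default, both counted in full with -M, split with -M --fraction, and only the primary one kept with --primary.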
3. My current command line looks like this:
Code:
featureCounts -t exon -g gene_id -a genes.gtf -F GTF -o outfile.txt -s 1 --primary input.bam
Thanks!