  • featureCounts option question

    I'm new to using featureCounts for RNA-seq analysis. My data are polyA-enriched, stranded, single-end Illumina reads.

    My goal is to do differential expression analysis between control and case groups. I plan to use DESeq2 for the DE analysis after featureCounts.

    I have a few questions:

    1. Is it best to use the -M --fraction options, the --primary option, or neither? I understand that in ChIP-seq people often keep only uniquely mapped reads, but I'm not sure about RNA-seq, or whether to keep only primary alignments. My feeling is that it's best to use the --primary option.

    -M
    If specified, multi-mapping reads/fragments will be counted. A multi-mapping read will be counted up to N times if it has N reported mapping locations. The program uses the ‘NH’ tag to find multi-mapping reads.

    --fraction
    If specified, a fractional count 1/n will be generated for each multi-mapping read, where n is the number of alignments (indicated by ‘NH’ tag) reported for the read. This option must be used together with the ‘-M’ option.

    --primary
    If specified, only primary alignments will be counted. Primary and secondary alignments are identified using bit 0x100 in the Flag field of SAM/BAM files. All primary alignments in a dataset will be counted no matter they are from multi-mapping reads or not (i.e. ‘-M’ is ignored).

    2. I have read in many sources that it's normal to observe a high level of duplicated reads in RNA-seq. So is it best not to use the --ignoreDup option?

    3. My current command line looks like this:

    Code:
    featureCounts -t exon -g gene_id -a genes.gtf -F GTF -o outfile.txt -s 1 --primary input.bam
    Please let me know if there are other options that I should use.
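
To make the difference between the quoted options concrete, here is a minimal Python sketch. It is a simplified model built only from the manual text quoted above, not featureCounts' actual implementation; the toy alignment records and the `count` function are hypothetical.

```python
from collections import defaultdict

SECONDARY = 0x100  # SAM flag bit that marks a secondary alignment

# Hypothetical toy alignments: (read name, gene hit, SAM flag, NH tag)
ALIGNMENTS = [
    ("r1", "geneA", 0, 1),          # uniquely mapped
    ("r2", "geneA", 0, 2),          # multi-mapper, primary alignment
    ("r2", "geneB", SECONDARY, 2),  # multi-mapper, secondary alignment
    ("r3", "geneB", 0, 1),          # uniquely mapped
]


def count(alignments, multi=False, fraction=False, primary=False):
    """Tally per-gene counts under a simplified model of -M, --fraction, --primary."""
    if fraction and not multi:
        raise ValueError("--fraction must be used together with -M")
    counts = defaultdict(float)
    for _read, gene, flag, nh in alignments:
        if primary:
            # --primary: count every primary alignment; -M is ignored
            if not flag & SECONDARY:
                counts[gene] += 1
        elif nh > 1:
            if multi:
                # -M: one count per reported location, or 1/NH with --fraction
                counts[gene] += 1 / nh if fraction else 1
            # without -M, multi-mapping reads are skipped
        else:
            counts[gene] += 1
    return dict(counts)


for label, kwargs in [
    ("default", {}),
    ("-M", {"multi": True}),
    ("-M --fraction", {"multi": True, "fraction": True}),
    ("--primary", {"primary": True}),
]:
    print(label, count(ALIGNMENTS, **kwargs))
```

In this toy data the multi-mapping read r2 adds nothing by default, a full count to both genes with -M, half a count to each with -M --fraction, and a single count to geneA with --primary.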

    Thanks!
    Last edited by gene_x; 08-09-2016, 09:44 AM.

  • #2
    How did you handle the multimappers in your alignment program? Did you use one of these options (for example this is what BBMap allows)

    Code:
    best    (use the first best site)
    toss    (consider unmapped)
    random  (select one top-scoring site randomly)
    all     (retain all top-scoring sites)
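
As a rough illustration of what these four policies mean for a read with several equally good sites, here is a small Python sketch. It models only the one-line descriptions above and is not BBMap's code; the function name `resolve_ambiguous` and the example sites are hypothetical.

```python
import random


def resolve_ambiguous(sites, mode, rng=None):
    """Return the sites kept for one read; sites is a list of (position, score)."""
    if not sites:
        return []
    top = max(score for _pos, score in sites)
    best = [site for site in sites if site[1] == top]
    if len(best) == 1:
        return best  # a single top-scoring site is not ambiguous
    if mode == "best":
        return best[:1]  # use the first best site
    if mode == "toss":
        return []  # consider the read unmapped
    if mode == "random":
        rng = rng or random.Random()
        return [rng.choice(best)]  # select one top-scoring site randomly
    if mode == "all":
        return best  # retain all top-scoring sites
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical read with two equally good sites and one weaker site
sites = [("chr1:100", 40), ("chr2:500", 40), ("chr3:900", 12)]
for mode in ("best", "toss", "random", "all"):
    print(mode, resolve_ambiguous(sites, mode, rng=random.Random(0)))
```

Only "all" can hand more than one alignment per read to a downstream counter, which is why the choice made at alignment time matters for the featureCounts options.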



    • #3
      Good point.

      I used HISAT2 to do the alignment, and I think the relevant setting is the -k option, left at its default:

      -k <int>
      It searches for at most <int> distinct, primary alignments for each read. Primary alignments mean alignments whose alignment score is equal or higher than any other alignments.

      Default: 5 (HFM)
      Then I guess I don't really need the --primary option here, because all the reported alignments are primary.

      But I'm still not sure whether I should keep these multi-mapping reads at all. I read in a best-practices paper that tools including featureCounts often discard multi-mapping reads, whereas newer ones (Sailfish/Salmon, Kallisto, RSEM) keep them.




      • #4
        Having -k set to 5 means that at most that many positions are reported and counted per read, even if there are more. Using the "random" option with BBMap does not throw information away, and at the same time it does not overcount.

        If "mapping" the reads (less precise) is acceptable instead of full alignment, then the newer tools you mention are a fast option.
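
The overcounting point can be sketched with a toy simulation in Python. The assumptions are mine: a hypothetical repeat family with 8 identical copies, 100 reads that match all copies equally well, and an aligner that reports the first k copies when capped (real aligners may choose differently).

```python
import random
from collections import Counter

# Hypothetical repeat family with 8 identical copies
COPIES = [f"repeat_copy_{i}" for i in range(8)]


def count_all(n_reads, copies, k):
    """Count every reported alignment when at most k are reported per read."""
    reported = copies[:k]  # assumption: the first k copies are the ones reported
    counts = Counter()
    for _ in range(n_reads):
        counts.update(reported)
    return counts


def count_random(n_reads, copies, seed=0):
    """Assign each read to a single copy chosen at random."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_reads):
        counts[rng.choice(copies)] += 1
    return counts


all_counts = count_all(100, COPIES, k=5)
random_counts = count_random(100, COPIES)

# Capped reporting: 5 counts per read, and 3 of the 8 copies never get any
print(sum(all_counts.values()), len(all_counts))
# Random assignment: exactly one count per read, spread over the copies
print(sum(random_counts.values()), len(random_counts))
```

Under these assumptions, counting all reported alignments inflates the total to five times the number of reads and still misses three copies, while random assignment keeps the total equal to the number of reads.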



        • #5
          One clarification: in (classical) RNAseq, multimappers are excluded (I'm counting Salmon/Kallisto/et al. as non-classical). In ChIPseq, primary alignments from multimappers are typically included.



          • #6
            Really? Could you provide a reference for the treatment of multimappers in ChIP-seq? On the contrary, I believe they are discarded and only uniquely mapped reads are kept.



            • #7
              I'll see if I can find a reference when I'm in the office tomorrow. Using only "unique alignments" prevents finding peaks in genes with upstream repeats (there are a number of them) and expressed repeats (we have a large group working on them).
