  • Countdata from samtools idxstats

    Hi,

    I have sorted and indexed .bam files for my samples, and I used samtools idxstats to get the number of mapped and unmapped reads per reference sequence.

    Can I take the reference name column and the mapped-reads column from the tab-delimited text files that samtools idxstats generates for each sample, combine them into a count file, and use that for subsequent analysis?

    Or is there any other way to create a count data file from samtools idxstats output and use it in DESeq2?
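
The merge described in this question could be sketched roughly as follows. This is only an illustration: it assumes the four tab-separated idxstats columns (reference name, sequence length, mapped reads, unmapped reads), and the function names and sample data are hypothetical. The resulting counts still include multimapped reads, so they are not a substitute for properly estimated counts.

```python
import csv
import io


def parse_idxstats(text):
    """Return {reference_name: mapped_reads} from `samtools idxstats` text output."""
    counts = {}
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        if len(row) < 4:
            continue  # skip blank or malformed lines
        name, _length, mapped, _unmapped = row[:4]
        if name == "*":
            continue  # summary line for reads without a reference
        counts[name] = int(mapped)
    return counts


def build_count_matrix(per_sample):
    """per_sample: {sample: {reference: count}} -> (header, rows)."""
    samples = sorted(per_sample)
    refs = sorted(set().union(*(per_sample[s].keys() for s in samples)))
    header = ["reference"] + samples
    rows = [[ref] + [per_sample[s].get(ref, 0) for s in samples] for ref in refs]
    return header, rows


# Hypothetical idxstats output for two samples (in practice, read from files).
sample_a = "tx1\t1000\t10\t0\ntx2\t500\t3\t1\n*\t0\t0\t7\n"
sample_b = "tx1\t1000\t4\t0\ntx2\t500\t0\t0\n*\t0\t0\t2\n"

header, rows = build_count_matrix(
    {"A": parse_idxstats(sample_a), "B": parse_idxstats(sample_b)}
)
```
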

  • #2
    Did you map to the transcriptome? In any case, DESeq2 needs counts of uniquely assignable mappings, so you'll want to prefilter the alignments to remove multimappers.

    BTW, if you really did map to the transcriptome, then you'll probably want to use something like RSEM or eXpress to get estimated counts (these can then be used with limma after processing with voom()).
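
One possible way to prefilter multimappers from SAM-format text is sketched below. It is a heuristic, not a definitive rule: it assumes bowtie2-style `AS:i` (best alignment score) and `XS:i` (next-best alignment score) tags, and the `min_mapq` threshold of 2 is an arbitrary illustrative choice. The function name and example reads are hypothetical.

```python
def is_unique_alignment(sam_line, min_mapq=2):
    """Heuristically decide whether a SAM record looks uniquely mapped."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])
    mapq = int(fields[4])
    if flag & 0x4:    # read is unmapped
        return False
    if flag & 0x100:  # secondary alignment
        return False
    # Optional tags start after the 11 mandatory SAM fields.
    tags = {}
    for tag in fields[11:]:
        key, _type, value = tag.split(":", 2)
        tags[key] = value
    # An alternative alignment scoring at least as well means the read is ambiguous.
    if "AS" in tags and "XS" in tags and int(tags["XS"]) >= int(tags["AS"]):
        return False
    return mapq >= min_mapq


# Hypothetical records: one unique, one multimapped, one unmapped.
records = [
    "r1\t0\ttx1\t100\t42\t4M\t*\t0\t0\tACGT\tIIII\tAS:i:0\tYT:Z:UU",
    "r2\t0\ttx1\t100\t1\t4M\t*\t0\t0\tACGT\tIIII\tAS:i:0\tXS:i:0",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\tYT:Z:UP",
]
kept = [line.split("\t")[0] for line in records if is_unique_alignment(line)]
```
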


    • #3
      Hi,

      Yes, I mapped to the transcriptome. I used the Trinity Perl scripts to map the reads and estimate counts, and I also found differentially expressed transcripts using both edgeR and DESeq2.

      Now I want to do the analysis the standard way: I converted the SRA files to FASTQ, mapped them to the transcriptome using bowtie2, converted the SAM files to BAM, and sorted them. Now I would like to generate a count table for all the samples.

      As you suggested, can I use RSEM or eXpress to generate the counts? Also, how do I find the multimappers, and how can I remove them?

      Why do we have to use voom() for estimated counts? I am new to RNA-Seq, please guide me!


      • #4
        Yes, you can use RSEM or eXpress to generate the counts. These both deal with multimappers in a proper way so you don't need to worry about that issue.

        Estimated counts aren't integers, and their variance doesn't follow what is expected under a negative binomial distribution (it would be unsurprising if the variance of rounded counts also didn't behave like that of unique count data for many genes). voom() can handle such data since it makes no negative-binomial assumption.

        BTW, in the likely event that the Trinity pipeline produced fractional counts and you read that you should simply round them, please redo the analysis with limma and voom(). One should never round counts for edgeR/DESeq2.
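
To illustrate why fractional counts are not a problem for this route, here is a minimal Python sketch of the log2 counts-per-million transform that voom() applies before modelling. It covers only that transform, not the precision weights or the linear model that limma fits afterwards, and it uses raw column sums as library sizes, which is a simplifying assumption. The function name and data are hypothetical.

```python
import math


def voom_style_log_cpm(counts):
    """counts: list of rows (one per transcript) of per-sample counts.

    Returns log2 counts-per-million with the 0.5 / 1.0 offsets that keep
    zero counts finite. Fractional (estimated) counts are accepted as-is.
    """
    n_samples = len(counts[0])
    lib_sizes = [sum(row[j] for row in counts) for j in range(n_samples)]
    return [
        [
            math.log2((row[j] + 0.5) / (lib_sizes[j] + 1.0) * 1e6)
            for j in range(n_samples)
        ]
        for row in counts
    ]


# Hypothetical estimated counts: 2 transcripts x 2 samples, not rounded.
estimated = [[10.5, 0.0], [89.5, 100.0]]
log_cpm = voom_style_log_cpm(estimated)
```
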


        • #5
          So I can run rsem-calculate-expression on the BAM files and later input the .isoforms.results files into rsem-generate-data-matrix to get count.matrix (raw fragment counts) and TMM.matrix (normalised FPKM expression values). Then I use the voom() transformation from the limma package to convert them into log-counts, and later feed those into DESeq2. Did I get that right?


          • #6
            Close. You will (or at least should) never use DESeq2 (or edgeR or DESeq) with this data. You will (or "should", if you prefer) use limma instead.
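
For the matrix-building step mentioned in #5, a rough sketch of collecting estimated counts without rounding them might look like this. It assumes each RSEM results file has a header row with `transcript_id` and `expected_count` columns; the helper name and the example text are hypothetical, and in practice rsem-generate-data-matrix does this job.

```python
import csv
import io


def read_expected_counts(text):
    """Return {transcript_id: expected_count} as floats (no rounding)."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {row["transcript_id"]: float(row["expected_count"]) for row in reader}


# Hypothetical excerpt of one sample's results file.
results_text = (
    "transcript_id\tgene_id\tlength\texpected_count\n"
    "tx1\tg1\t1000\t10.37\n"
    "tx2\tg1\t500\t0.00\n"
)
expected = read_expected_counts(results_text)
```

The values stay fractional so that they can be passed on to limma with voom() as recommended above.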


            • #7
              I got your point. I will never use DESeq2 or edgeR when I use RSEM; instead I will use the classical limma approach.

              Just for information, the Trinity package uses "align_and_estimate_abundance.pl", which preps the reference and then aligns the FASTQ files to the transcriptome. The BAM files it generates can be fed directly into RSEM or eXpress to produce the genes.results and isoforms.results files. Then "estimate_abundance.pl" is used to generate the raw counts matrix and the TMM-normalized FPKM matrix. The raw counts are then passed to "run_DE_analysis.pl", choosing either "DESeq" or "edgeR" as the method. It still produced a list of differentially expressed transcripts.

              I tried once converting the estimated counts for the samples into integers in R and then feeding them into DESeq2, but I was not sure whether I was doing it the right way or not.

              Thanks mate, I will try the limma method on the RSEM count data.
