  • miRDeep2 cannot count reads accurately

    Hi all,
    I am using miRDeep2 to analyse small RNA sequencing data. I noticed that when one read matches two or more known miRNAs, the software reports it for all of them. For example, miR-146a and miR-146b are in the same family and have similar sequences, so many reads map to both of them. This inflates the total read count.
    Can anyone help me?
    Many thanks.
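To make the inflation described above concrete, here is a minimal Python sketch. It does not call miRDeep2; the read-to-miRNA assignments are hypothetical toy data, and `count_all_hits` is an illustrative helper that credits a read to every miRNA it matches.

```python
from collections import Counter

# Hypothetical toy data: each read ID with the known miRNAs it matches.
read_hits = {
    "read_1": ["miR-146a"],
    "read_2": ["miR-146a", "miR-146b"],
    "read_3": ["miR-146a", "miR-146b"],
    "read_4": ["miR-146b"],
}


def count_all_hits(hits):
    """Credit each read to every miRNA it matches."""
    counts = Counter()
    for mirnas in hits.values():
        counts.update(mirnas)
    return counts


counts = count_all_hits(read_hits)
total_reported = sum(counts.values())
print(dict(counts))
print(f"reads sequenced: {len(read_hits)}, reads reported: {total_reported}")
```

In this toy case the per-miRNA counts add up to more than the number of reads, because the two shared reads are counted twice.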

  • #2
    Can anyone give suggestions?
    Thank you.


    • #3
      What software

      Which software are you using? Is it commercial?
      Honey


      • #4
        It is not commercial software. I am using miRDeep2, created by the MDC, on Illumina sequencing data.
        guanzhidao


        • #5
          Hi guan,

          Which step of the pipeline are you running?

          For example, with the quantifier.pl step you could increase the stringency of the mapping requirements with the -g, -e, and -f options. Dropping -g to zero might be enough. However, there will be a cost to that: reads with a single sequencing error will no longer map to a miRNA.
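The trade-off above can be sketched in a few lines of Python. This is a simplified stand-in, not how quantifier.pl maps reads internally: `mismatches` and `maps_to` are illustrative helpers, the sequences are made up, and `max_mismatches` only plays the role that the post assigns to -g.

```python
def mismatches(read, reference):
    """Hamming distance between a read and an equal-length reference window."""
    if len(read) != len(reference):
        raise ValueError("read and reference window must have equal length")
    return sum(a != b for a, b in zip(read, reference))


def maps_to(read, mirna, max_mismatches):
    """True if the read fits some window of the miRNA within the mismatch limit."""
    windows = range(len(mirna) - len(read) + 1)
    return any(
        mismatches(read, mirna[i:i + len(read)]) <= max_mismatches
        for i in windows
    )


# Made-up toy sequences for illustration only.
mirna = "ACGTTGCAAGGCTTAACCGGTA"
clean_read = "ACGTTGCAAGGCTTAACCGGTA"
read_with_error = "ACGTTGCAATGCTTAACCGGTA"  # one miscalled base

lenient = maps_to(read_with_error, mirna, max_mismatches=1)
strict = maps_to(read_with_error, mirna, max_mismatches=0)
print(f"one mismatch allowed: {lenient}, zero mismatches allowed: {strict}")
```

With zero mismatches allowed, the error-free read still maps, but the read carrying one miscalled base is lost.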


          • #6
            Hi Wallysb01,
            Thanks for the reply.
            Did you notice that even if we set the -g option to zero, some reads can still be mapped to different miRNAs? I often check the PDF files that miRDeep2 produces. For example, among cattle miRNAs, miR-103 and miR-107 differ by only one mismatch at the 5' end, and many reads can be mapped to both of them. If we derive expression levels from this result, they will not be accurate. What do you think?
            Thanks so much.
            guanzhidao
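One way a read can match two miRNAs even with zero mismatches is when it does not cover the base that distinguishes them. The sketch below illustrates only that idea: the two sequences are invented toys that differ at the first base (they are not the cattle miR-103/miR-107 sequences), and `exact_hits` is a hypothetical helper, not part of miRDeep2.

```python
def exact_hits(read, mirnas):
    """Names of miRNAs that contain the read as an exact substring."""
    return sorted(name for name, seq in mirnas.items() if read in seq)


# Invented toy sequences that differ only at the first base.
mirnas = {
    "miR-toyA": "ACGTTGCAAGGCTTAACCGGTA",
    "miR-toyB": "TCGTTGCAAGGCTTAACCGGTA",
}

full_read = "ACGTTGCAAGGCTTAACCGGTA"
trimmed_read = full_read[1:]  # read that lacks the distinguishing base

print(exact_hits(full_read, mirnas))
print(exact_hits(trimmed_read, mirnas))
```

The full-length read is assigned to one toy miRNA, while the read missing the first base matches both perfectly, so no mismatch setting can separate them.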


            • #7
              Hi Guan,

              Yes, I have noticed that the -g option will not work for all miRNAs, because the mature miRNA sequences are identical in some cases. I suppose you could use the -W option, which gives a weight of 0.5 to reads mapping twice. But for input into something like DESeq that will not help, since counts have to be whole numbers and you'll certainly end up with some fractions.

              I don't think there is much else you can do. It's just what happens with things that are 18 bp, I guess... The only other thing I can think of to avoid this is to create a gff3/gtf file of novel/known miRNAs from the wrapper, then align to the whole genome, excluding multi-mapping reads and requiring identical matches, and quantify expression with cufflinks. I don't know if anyone has tried this with miRNAs though, nor how valid it might be, given that many of those short 18-22 bp reads may multi-map to the genome even if they map to only one miRNA.

              I think this is just what we have to live with given how short the sequences are.
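The weighting idea mentioned for -W can be sketched as follows. This is an illustration of splitting a read equally among its matches, under the assumption that a read with n matches contributes 1/n to each; it is not miRDeep2's implementation, and the read assignments are hypothetical toy data.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical toy data: each read ID with the miRNAs it matches.
read_hits = {
    "read_1": ["miR-146a"],
    "read_2": ["miR-146a", "miR-146b"],
    "read_3": ["miR-146b"],
}


def weighted_counts(hits):
    """Split each read equally among the miRNAs it matches."""
    counts = defaultdict(Fraction)
    for mirnas in hits.values():
        if not mirnas:
            continue  # unmapped read contributes nothing
        share = Fraction(1, len(mirnas))
        for name in mirnas:
            counts[name] += share
    return dict(counts)


counts = weighted_counts(read_hits)
for name, value in sorted(counts.items()):
    print(name, float(value))
print("total:", float(sum(counts.values())))
```

The weighted total equals the number of reads, so nothing is double counted, but the per-miRNA values are fractional, which is the problem for tools that expect integer counts.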
              Last edited by Wallysb01; 10-22-2012, 10:22 PM.


              • #8
                Hi Wallysb01,
                I am so happy to talk with you. You are so smart about this.
                Thanks a lot.
                I have not updated my miRDeep2, so I had not noticed the -W option in the new version.
                I will try that.
                I am also trying another tool, "miRanalyzer", to see what kind of result it gives.
                Anyway, miRDeep is quite an amazing tool.
                Guanzhidao
