  • problem with miRTRAP

    I have a simple question about using miRTRAP. I couldn't find anyone else on the internet facing the same problem, so I hope someone here can help.

    I am currently trying to discover novel mouse miRNAs with miRTRAP. I've checked the "Usage Table of Contents" on the miRTRAP website, but I still have problems with the input data. According to the website, "config.txt" and "reads.txt" should be prepared, and "config.txt" should include the following input data:

    1. "readListFile" - the aligned data in gff format (I've changed mine from Soap2 output to gff format)
    2. "genomeFile" - the whole mouse genome in fasta format
    3. "repeatRegionsFile" - What's the difference from genomeFile? (With mask?)

    On my first attempt I left out the "repeatRegionsFile", but the output files of the command "printReadRegions.pl config.txt" are all 0 KB.

    I guess there might be some mistakes in my understanding.
    Could anyone help me?

    Thanks a lot!

  • #2
    Hi Yushan Hsiao,
    I would like to ask whether the problem has been solved and whether the process went smoothly. If you are studying miRNA, do you know of any good software for finding mirtrons (one type of miRNA)? I have encountered some difficulties in this process and hope for your help. Thanks very much.

    Chong Chen

    • #3
      up

      I have the same question.

      • #4
        miRTRAP question

        Hi everyone, this is Dave Hendrix. miRTRAP is my software, and I am happy to answer any questions. You can email me directly (my email is in the manuscript as a corresponding author). A description of the steps of miRTRAP is at:



        These instructions have been updated to add more clarity. You can also download a more up-to-date version of the software.

        In general, there should be error messages printed out if things don't work with the program. You can post those messages to this thread for more detail. I will attempt to answer these questions one-by-one.

        1. "readListFile" - the aligned data in gff format (I've changed mine from Soap2 output to gff format)

        The readListFile is a tab-separated list of files, one per line, giving a label and the file name, like this:

        tissue1 tissue1_reads.gff
        tissue2 tissue2_reads.gff
        tissue3 tissue3_reads.gff

        where the reads are size-selected (around 17-25 nt) sequencing data in GFF format. Each file name needs a full path if the file is not in the directory that you are running the scripts from.
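The readListFile layout described above can be generated and sanity-checked with a short script. This is only a sketch in Python (miRTRAP itself is Perl); the function names `write_read_list` and `check_read_list` and the sample file names are hypothetical, and the only format assumed is the two-column, tab-separated "label, file name" layout given in this post.

```python
import os


def write_read_list(samples, out_path):
    """Write one 'label<TAB>path' line per (label, gff_path) pair."""
    with open(out_path, "w") as out:
        for label, gff_path in samples:
            # Absolute paths avoid problems when the scripts are run
            # from a directory other than the one holding the GFF files.
            out.write(f"{label}\t{os.path.abspath(gff_path)}\n")


def check_read_list(path):
    """Return a list of human-readable problems found in a readListFile."""
    problems = []
    with open(path) as fh:
        for n, line in enumerate(fh, 1):
            line = line.rstrip("\n")
            if not line:
                continue
            fields = line.split("\t")
            if len(fields) != 2:
                problems.append(
                    f"line {n}: expected 2 tab-separated fields, got {len(fields)}"
                )
            elif not os.path.isfile(fields[1]):
                problems.append(f"line {n}: file not found: {fields[1]}")
    return problems


# Example (hypothetical file names):
# write_read_list([("tissue1", "tissue1_reads.gff")], "reads.txt")
# print(check_read_list("reads.txt"))
```

A common cause of silent failures with this kind of file is spaces where tabs are expected, which `check_read_list` reports as a field-count problem.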

        3. "repeatRegionsFile" - What's the difference from genomeFile? (With mask?)

        The genome file is the actual fasta file of the genome. Each chromosome/scaffold should be a separate entry of the fasta file. The repeatRegionsFile is a list of the genomic coordinates in the form (chrom start stop) separated by tabs as in:

        Scaffold_1631 1739 1818
        Scaffold_1631 2189 2258
        Scaffold_1631 4125 4178
        Scaffold_1631 4369 4415
        Scaffold_1631 4505 4588


        Please send any other questions my way, as I am interested in improving the explanation of the software. Also, in general it doesn't hurt to read through the main Perl module, miRTRAP.pm, to become more familiar with how it reads in files and processes them. Best wishes and good luck on your search for microRNAs.

        Dave

        • #5
          There are several tools for predicting miRNAs, such as miRDeep and MIReNA. What are the advantages of the different miRNA prediction tools?

          Yu

          • #6
            Originally posted by jay2008
            There are several tools for predicting miRNAs, such as miRDeep and MIReNA. What are the advantages of the different miRNA prediction tools?

            Yu
            There is a new updated version of miRDeep called miRDeep2 that you should try. This is probably the most recent piece of software of this type.

            I will say that miRTRAP takes into account a lot of information. You need to align the reads in a way that allows many hits to the genome for each read, because the program uses this information in its prediction. Loci with reads that hit many other places in the genome (more than the maxHit parameter) are excluded. Furthermore, loci that are surrounded by such repetitive small RNAs are also filtered out. In my experience, miRTRAP has very few false negatives, but some false positives. miRDeep has very few false positives, but some false negatives. Depending on your purposes and your available data, either could work.

            Another drawback is that miRTRAP takes a lot of RAM, and for large genomes it may require you to split it up into chromosomes.
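The multi-hit exclusion described above can be illustrated with a small sketch. This is not miRTRAP's actual code: the function name `filter_by_max_hit` and the tuple layout are assumptions made for illustration, and it only covers the first filter (reads with more than maxHit genomic hits), not the filtering of loci surrounded by repetitive small RNAs.

```python
from collections import Counter


def filter_by_max_hit(alignments, max_hit):
    """Keep only alignments of reads that map to at most max_hit locations.

    alignments: iterable of (read_id, chrom, start, end) tuples,
    one tuple per reported genomic hit of a read.
    """
    alignments = list(alignments)
    # Count how many genomic hits the aligner reported for each read.
    hits_per_read = Counter(read_id for read_id, _, _, _ in alignments)
    return [a for a in alignments if hits_per_read[a[0]] <= max_hit]


# Example: r2 maps to three places and is dropped when max_hit is 2.
alns = [
    ("r1", "chr1", 10, 31),
    ("r2", "chr1", 50, 71),
    ("r2", "chr2", 5, 26),
    ("r2", "chr3", 7, 28),
]
kept = filter_by_max_hit(alns, max_hit=2)
```

The sketch also shows why the aligner must report many hits per read: if only one hit per read is written out, every count is 1 and a filter of this kind cannot recognise repetitive reads.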

            • #7
              I am trying to use miRTRAP. When I set "repeatRegionsFile" to an empty file, I got this error:
              could not open 16714.
              Is "repeatRegionsFile" necessary? For the human genome, how can I get a repeatRegionsFile?

              thanks
              Yu

              • #8
                It looks like for some reason, it thinks your repeat regions file is given by the number "16714". Can you paste some of your config file?

                It isn't 100% necessary to filter out repeat regions, but I would strongly recommend it to avoid false positives. You can get the data for this at UCSC for example here:



                or whatever works best for your preferred version of the genome. You may look to filter out simple repeats and transposon-associated repeats. The format for the repeat region file is just a simple tab delimited file of chrom start stop:

                <chrom> <start> <stop>

                so you could convert the repeat data from UCSC to this format with a simple Perl script.
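The conversion mentioned above could look roughly like the following. It is written in Python rather than Perl, and it assumes the repeat annotation was exported from UCSC in BED format (chrom, start, end in the first three tab-separated columns); the function names are hypothetical. BED coordinates are 0-based and half-open, and this thread does not say which convention miRTRAP expects, so the sketch copies the coordinates unchanged.

```python
def bed_to_repeat_regions(bed_lines):
    """Yield 'chrom<TAB>start<TAB>stop' strings from BED-like input lines."""
    for line in bed_lines:
        line = line.rstrip("\n")
        # Skip blank lines and the header lines a UCSC export may contain.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            continue
        try:
            start, stop = int(fields[1]), int(fields[2])
        except ValueError:
            # Not a coordinate line (for example a column-name header).
            continue
        yield f"{fields[0]}\t{start}\t{stop}"


def convert(bed_path, out_path):
    """Write the three-column repeat regions file for a BED input file."""
    with open(bed_path) as src, open(out_path, "w") as dst:
        for row in bed_to_repeat_regions(src):
            dst.write(row + "\n")


# Example (hypothetical file names):
# convert("rmsk_hg19.bed", "repeatRegions.txt")
```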

                • #9
                  Hi,
                  I don't understand how to convert my aligned reads into GFF format, so I cannot proceed to the downstream scripts. My alignments were made with bowtie; which output format should I use?

                  Also, I cannot produce SOAP2 output, because it skips all reads shorter than 28 nt.

                  Would anyone give some help?

                  Thanks very much!


                  Franklin
                  Last edited by cwn5810; 12-21-2012, 01:07 AM.
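One possible starting point for the bowtie question above is sketched below. It assumes bowtie (version 1) default tab-separated output, where the columns are read name, strand, reference name, 0-based leftmost offset, read sequence, qualities, and so on. The GFF source, feature, and attribute values used here are placeholders: this thread does not state which GFF fields miRTRAP actually reads, so they should be checked against miRTRAP.pm before use.

```python
def bowtie_line_to_gff(line, source="bowtie", feature="read"):
    """Convert one line of bowtie (v1) default output to a GFF line.

    The source, feature and ID= attribute are illustrative placeholders,
    not values confirmed to be what miRTRAP expects.
    """
    fields = line.rstrip("\n").split("\t")
    read_name, strand, chrom, offset, seq = fields[:5]
    start = int(offset) + 1      # bowtie offsets are 0-based; GFF is 1-based
    end = start + len(seq) - 1   # GFF end coordinates are inclusive
    attributes = f"ID={read_name}"
    return "\t".join(
        [chrom, source, feature, str(start), str(end), ".", strand, ".", attributes]
    )


def convert_bowtie_file(bowtie_path, gff_path):
    """Convert a whole bowtie output file, skipping empty lines."""
    with open(bowtie_path) as src, open(gff_path, "w") as dst:
        for line in src:
            if line.strip():
                dst.write(bowtie_line_to_gff(line) + "\n")
```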
