  • RepeatSeq for accurate genotyping of microsatellite repeats

    Published in NAR today, RepeatSeq is a tool for genotyping tandem repeats from sequencing data. RepeatSeq is stable and very easy to install and use, and its results are very accurate compared with existing tools, including in Sanger validation of our genotype calls.

    Open Article from NAR:


    GitHub Download:


    I am very happy to answer any questions in this thread or by private message/email ([email protected]). Any feedback is very welcome and is a critical part of RepeatSeq's development.

  • #2
    I am able to install and run RepeatSeq. However - how does one generate a regions file for input? It does not appear to be a simple BED file of repeat regions.



    • #3
      I received a reply from Gareth Highnam via another channel and will repost here.

      Quote:

      chr:start-stop<tab>2.3_3_100_0_14_0_57_42_0_0.99_GCC

      The 2nd-column string doesn't affect the computation, so use that.

      End quote



      • #4
        Hello,

        I have followed all the instructions on the RepeatSeq website, but I cannot run RepeatSeq when I type the repeatseq command.

        Also, how can I prepare the regions file? Is it a FASTA file containing the regions? There is an example above, but I don't understand what 2.3_3_100_0_14_0_57_42_0_0.99_GCC means.

        Thanks

        BG



        • #5
          more info needed

          The regions file is the reformatted output from Gary Benson's Tandem Repeats Finder (TRF). You can find it here:



          You want to run trf using the -ngs and -h options and then parse the output. It gives you an output formatted like this:

          @chr1
          2 20 some other stuff
          100 120 some other stuff
          @chr2
          56 69 some other stuff
          etc.

          RepeatSeq only uses the chromosome, start, and end positions. So you need to reformat the file to look like this:

          [chrom]:[start]-[end]<tab>some_other_stuff

          example:
          chr1:2-20<tab>some_other_stuff
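The reformatting described in this post can be sketched with awk. This is only a sketch under stated assumptions: the sample input reuses the toy layout shown above (it is not real TRF output), the file names trf_sample.txt and regions.txt are hypothetical, and joining the remaining fields with underscores is a guess modelled on the example string in post #3, not a documented RepeatSeq requirement.

```shell
# Work in a scratch directory so nothing in the real project is touched.
workdir=$(mktemp -d)

# Hypothetical sample in the layout described in the post (not real TRF output).
cat > "$workdir/trf_sample.txt" <<'EOF'
@chr1
2 20 some other stuff
100 120 some other stuff
@chr2
56 69 some other stuff
EOF

# "@name" lines set the current chromosome; every other line with at least
# two fields becomes "chrom:start-end<TAB>rest_joined_with_underscores".
awk '
  /^@/ { chrom = substr($1, 2); next }
  NF >= 2 && chrom != "" {
      info = (NF > 2) ? $3 : "."          # "." is a placeholder when nothing follows
      for (i = 4; i <= NF; i++) info = info "_" $i
      printf "%s:%s-%s\t%s\n", chrom, $1, $2, info
  }
' "$workdir/trf_sample.txt" > "$workdir/regions.txt"

cat "$workdir/regions.txt"
```

On real data, the input would be the file written by trf, and it is worth checking a few output lines by eye before passing the regions file to RepeatSeq.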

          That should be easy; if it's not, PM me.

          I have no idea why you can't run repeatseq. If you post your command and the error it generates when you try and run it, perhaps I'll be able to help you.



          • #6
            Thank you for your help

            I have downloaded repeatseq, bamtools and fastahack from the website as zip files.

            I have built repeatseq, but when I type:

            [basalganglia@abc2 repeatseq-master]$ repeatseq
            -bash: repeatseq: command not found

            is printed.

            I think I have a problem with LD_LIBRARY_PATH. I used this command:

            export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/basalganglia/repeatseq-master/bamtools

            Is it right or not?

            Thanks again



            • #7
              You probably need to add execute permissions to the executable. You can do that by
              Code:
              $ chmod u+x repeatseq
              Then you can run the application by

              Code:
              $ ./repeatseq
              
              or
              
              $ /path_to/repeatseq



              • #8
                echo @GenoMax

                @GenoMax is right. repeatseq needs to be 1) executable and 2) on your $PATH when you try to run it.

                Code:
                chmod u+x
                makes a file executable.

                If you are in the same directory as repeatseq, you should then be able to run it by typing ./repeatseq (plain repeatseq only works if that directory is on your $PATH). If you're not, then you need to provide the /full/path/to/repeatseq when you run it.

                $PATH is the list of directories where your computer looks for programs to run. You can see what's currently in it by typing

                Code:
                echo $PATH
                To make your life easier you might want to make a directory where all your executables live. Then add that to your $PATH for all login shells - I think in ~/.bash_profile or some such place. Once you do that you don't need to remember/type full path names for your executables anymore.
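A small sketch of how $PATH lookup behaves, assuming a throwaway directory and a hypothetical script named hello_tool_demo (neither comes from this thread). It shows that an executable is found by name only once the directory that directly contains it is on $PATH.

```shell
# Throwaway directory standing in for a personal "bin" directory.
bindir=$(mktemp -d)

# Hypothetical executable: a two-line shell script.
printf '#!/bin/sh\necho "hello from the demo tool"\n' > "$bindir/hello_tool_demo"
chmod u+x "$bindir/hello_tool_demo"

# Not on PATH yet, so the shell cannot find it by name.
before=$(command -v hello_tool_demo || echo "not found")

# Put the directory at the front of PATH for this shell session only.
PATH="$bindir:$PATH"
export PATH

# Now the bare name resolves to the full path.
after=$(command -v hello_tool_demo)

echo "before: $before"
echo "after:  $after"
hello_tool_demo
```

The change lasts only for the current shell; putting the same PATH line in a startup file is what makes it persist across logins.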



                • #9
                  Thank you for your help, but I have the same problem again:

                  $ chmod u+x repeatseq
                  $ ./repeatseq
                  ./repeatseq: error while loading shared libraries: libbamtools.so.2.3.0: cannot open shared object file: No such file or directory



                  • #10
                    Can you post the contents of the repeatseq makefile? The start of mine looks like this:

                    # $* is prefix shared by target and dependent; $@ is name of target file
                    CFLAGS = -c -O3 -Ibamtools/src
                    OBJS= repeatseq.o structures.o CLParse.o
                    NAME= repeatseq

                    $(NAME): $(OBJS)
                    g++ -o $@ $(OBJS) fastahack/Fasta.cpp fastahack/split.cpp -lpthread -lbamtools -Lbamtools/lib

                    It looks like you are setting LD_LIBRARY_PATH the right way from above, but I can tell you mine's not set at all and repeatseq still runs.
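One thing worth checking, offered as a hedged aside rather than a confirmed fix: the dynamic loader searches LD_LIBRARY_PATH for directories that directly contain the .so file, not their parents. The sketch below builds a hypothetical stand-in for the repeatseq-master tree (an empty file under bamtools/lib, chosen because the Makefile links with -Lbamtools/lib), locates the library by name, and points LD_LIBRARY_PATH at its directory. Where the real library lives on any given system is an assumption to verify.

```shell
# Hypothetical stand-in for the source tree; the real layout may differ.
root=$(mktemp -d)
mkdir -p "$root/bamtools/lib"
: > "$root/bamtools/lib/libbamtools.so.2.3.0"   # empty placeholder, not a real library

# Find the library named in the error message and take its directory.
libfile=$(find "$root" -name 'libbamtools.so.2.3.0' | head -n 1)
libdir=$(dirname "$libfile")

# Prepend that directory; keep any existing value after a colon.
LD_LIBRARY_PATH="$libdir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
export LD_LIBRARY_PATH

echo "library directory: $libdir"
echo "LD_LIBRARY_PATH:   $LD_LIBRARY_PATH"
```

On Linux, running ldd ./repeatseq before and after such a change lists which shared libraries resolve and which are reported as "not found".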

                    Also, what happens when you try
                    Code:
                    which bamtools
                    ?
                    Last edited by deyler; 05-28-2015, 12:33 PM.



                    • #11
                      Dear deyler,

                      The Makefile contains:


                      # $* is prefix shared by target and dependent; $@ is name of target file
                      CFLAGS = -c -O3 -Ibamtools/src
                      OBJS= repeatseq.o structures.o CLParse.o
                      NAME= repeatseq

                      $(NAME): $(OBJS)
                      g++ -o $@ $(OBJS) fastahack/Fasta.cpp fastahack/split.cpp -lpthread -lbamtools -Lbamtools/lib

                      # Suffix rules: tell how to take file with first suffix and make it into
                      # file with second suffix

                      .cpp.o:
                      g++ $(CFLAGS) $*.cpp
                      clean:
                      rm *.o



                      When I ran which bamtools:

                      repeatseq-master]$ which bamtools
                      /usr/bin/which: no bamtools in (/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/opt/jdk1.7.0_71/bin:/home/basalganglia/bin)

                      Many thanks for your help
                      Last edited by basalganglia; 05-28-2015, 03:10 PM.



                      • #12
                        OK. This result means that your system cannot find your installation of bamtools. Let's fix that first, and hopefully the other problem goes away too.

                        The thing that has no bamtools in it is your PATH, which is the colon-separated list in parentheses in that output. Any program that you want to be able to run just by typing its name needs to be located in a directory listed in the PATH. So you need to add the directory where bamtools lives to your PATH variable.

                        There are several ways of doing this.
                        1) the stupid way

                        $> PATH="/full/path_to/bamtools:$PATH"
                        $> export PATH

                        If you do it the stupid way, you'll have to do it again every time you start a new login shell.

                        2) the smarter way
                        Go to your home directory. Usually cd ~ will do this.
                        Then open .bashrc or .bash_profile. Use whatever text editor you like. Note that neither file may exist before you do this as they are user-specific configuration files.
                        Now, put the two lines from #1 into the configuration file. Save it and go back to the terminal. Then run this command:
                        . ~/.bash_profile (or .bashrc, whichever one you used)
                        or just log out and log back in again.

                        If you did it right, you should be able to type "echo $PATH" and see your PATH that includes the directory where bamtools lives. You should also be able to type "which bamtools" and have it spit out the full path to bamtools.
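The "smarter way" can be sketched as a script that adds the two lines to a startup file only if they are not already there, which avoids the same directory piling up in PATH. This is a sketch under assumptions: a temporary file stands in for ~/.bash_profile so nothing real is modified, /full/path_to/bamtools is the placeholder from this post rather than a real location, and add_once is a hypothetical helper.

```shell
# Stand-in for ~/.bash_profile so the sketch never edits a real startup file.
profile=$(mktemp)

# Placeholder directory from the post; replace with the real location.
tooldir="/full/path_to/bamtools"

# Hypothetical helper: append a line only if the file lacks that exact line.
add_once() {
    grep -qxF -- "$1" "$2" || printf '%s\n' "$1" >> "$2"
}

path_line="PATH=\"$tooldir:\$PATH\""

# Running the additions twice still leaves exactly two lines in the file.
add_once "$path_line" "$profile"
add_once "export PATH" "$profile"
add_once "$path_line" "$profile"
add_once "export PATH" "$profile"

# Load the file into the current shell, as ". ~/.bash_profile" would.
. "$profile"

cat "$profile"
echo "$PATH"
```

With the real startup file, profile would be set to "$HOME/.bash_profile" (or "$HOME/.bashrc") and tooldir to the directory that directly contains the executable.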

                        Note that you can do this for repeatseq too, so that you can run it from any directory within your system. All of my programs live in a deyler/bin directory, which is in my PATH - so from wherever I am, I can say "myprogram3" and it runs.

                        I'm going to be offline for the next 12-14 hours but I'll check this thread when I get back on.



                        • #13
                          I have the same trouble again:

                          repeatseq-master]$ chmod u+x repeatseq
                          repeatseq-master]$ ./repeatseq
                          ./repeatseq: error while loading shared libraries: libbamtools.so.2.3.0: cannot open shared object file: No such file or directory
                          repeatseq-master]$ PATH="/home/bg/repeatseq-master/bamtools:$PATH"
                          repeatseq-master]$ export PATH
                          repeatseq-master]$ which bamtools
                          /usr/bin/which: no bamtools in (/home/bg/repeatseq-master/bamtools:/home/bg/repeatseq-master/bamtools:/home/bg/repeatseq-master/bamtools:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/opt/jdk1.7.0_71/bin:/home/bg/bin)



                          • #14
                            Do you see anything when you try the following command:

                            Code:
                            $ ls -lh /home/bg/repeatseq-master/b*
                            It sounds like you could use some help with basic unix. It may be useful to spend a couple of hours going through this part: http://korflab.ucdavis.edu/Unix_and_...ent.html#part1

                            Did you compile the repeatseq program yourself or download a pre-compiled binary?



                            • #15
                              /home/bg/repeatseq-master/bamtools:
                              total 36K
                              drwxrwsr-x 2 bg bg 4.0K May 28 00:08 bin
                              drwxrwsr-x 4 bg bg 4.0K May 28 00:06 build
                              -rw-rw-r-- 1 bg bg 1.8K Jul 30 2013 CMakeLists.txt
                              drwxrwsr-x 2 bg bg 4.0K Jul 30 2013 docs
                              drwxrwsr-x 4 bg bg 4.0K May 28 00:06 include
                              drwxrwsr-x 2 bg bg 4.0K May 28 00:07 lib
                              -rw-rw-r-- 1 bg bg 1.2K Jul 30 2013 LICENSE
                              -rw-rw-r-- 1 bg bg 2.0K Jul 30 2013 README
                              drwxrwsr-x 7 bg bg 4.0K Jul 30 2013 src

                              I downloaded it from the website and then transferred it to the server with FileZilla.
