  • sunhh
    Member
    • Jun 2012
    • 18

    How to construct a combined library for repeatmasker

    Thanks for your attention.
    I am constructing a repeat library for a genome of ~970 Mb.
    First, I used RepeatModeler to generate a de novo repeat consensus library (libA.fas).
    At the same time, I used ltr_struc and ltr_finder to generate an LTR sequence library (libB.fas).
    Then I concatenated libA.fas, libB.fas, the RepBase library and another library from MIPS into one file (LIB.fas).
    But I got a weird result.
    When I used LIB.fas as the input for the "-lib" option of RepeatMasker, 24.45 % of the genome was masked.
    When I used libA.fas (the output of RepeatModeler) as the input library, 47.78 % was masked.

    Can anyone tell me why a smaller library masks a larger fraction of the genome?
    Some parameters differ between the two runs, but I cannot tell which one could cause such a large difference.

    Thanks a lot!

    My commands for RepeatMasker are:
    For libA.fas:
    RepeatMasker -pa 10 genome.fa -no_is -nolow -norna -lib libA.fas

    For LIB.fas:
    RepeatMasker -lib database/LIB.fas -xsmall -no_is -nolow -pa 10 -frag 4000000 -a -gff genome.fa >Rmask_genome.out
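When libraries from several sources are concatenated into one file, one thing that may be worth checking is whether any sequence names collide. The following is only a minimal sketch, assuming plain FASTA input; the function name `dup_fasta_ids` is hypothetical, and nothing in this thread establishes that duplicate names explain the difference in masked fraction.

```shell
# Print FASTA sequence IDs that occur more than once across the given files.
# The ID is taken as the header text up to the first whitespace.
dup_fasta_ids() {
    awk '/^>/ { id = substr($1, 2); if (++seen[id] == 2) print id }' "$@"
}

# Hypothetical usage with the file names from the post:
#   cat libA.fas libB.fas > LIB.fas
#   dup_fasta_ids LIB.fas
```

An empty result means every header ID in the combined file is unique.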
  • sunhh
    Member
    • Jun 2012
    • 18

    #2
    Well, I have now found the reason.
    I ran two more test runs whose only difference was the "-frag" parameter.
    The run without "-frag 4000000" gave 45.60 % masked, close to the expected value.

    So in the future I will use the "-frag" option carefully!

    PS: I did not check the script to see how this effect arises, and I cannot find a reason in the help document, where "-frag" is described only as a maximum limit: "Maximum sequence length masked without fragmenting".
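To compare runs with and without "-frag" on equal footing, the masked fraction can also be recomputed directly from the masked sequence file. This is a rough sketch, assuming the run used "-xsmall" (so repeats are lowercase) and that the input genome was entirely uppercase to begin with; the function name `softmasked_percent` is hypothetical.

```shell
# Report the percentage of lowercase (soft-masked) bases in a FASTA file.
# Header lines (starting with ">") are skipped.
softmasked_percent() {
    LC_ALL=C awk '!/^>/ {
             total += length($0)            # count all sequence characters
             masked += gsub(/[a-z]/, "")    # gsub returns the number of lowercase characters
         }
         END {
             if (total > 0) printf "%.2f\n", 100 * masked / total
             else print "0.00"
         }' "$1"
}

# Hypothetical usage on the masked output of a run made with -xsmall:
#   softmasked_percent genome.fa.masked
```

Runs made without "-xsmall" replace repeats with N instead, so this count would not apply to them as written.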


    • sunhh
      Member
      • Jun 2012
      • 18

      #3
      But there is still a question: why does it not matter when I set "-frag 4000000" with a library as small as 940 KB?
      I might check it in the future.


      • amitbik
        Member
        • May 2013
        • 53

        #4
        Hi sunhh

        I have some problems with repeatmodeler and ltr_finder. Can you guide me on how you constructed the library with repeatmodeler, ltr_struc and ltr_finder? ltr_finder has been running for the last 3 days, but the output file size is not increasing. Please guide me...

        Thanks...


        • sunhh
          Member
          • Jun 2012
          • 18

          #5
          Originally posted by amitbik View Post
          Hi sunhh

          I have some problem in repeatmodeler and ltr_finder. Can you guide me how you construct library in repeatmodeler , ltr_struct and ltr_finder. From last 3 days ltr_finder is runnig but file size is not increasing. Plz guide me...

          Thanks...
          Hi amitbik,

          Could you show what problems you ran into? I simply followed the instructions for repeatmodeler and ltr_finder, and they work.
          I didn't use ltr_struc.

          Well, there is a small problem in repeatmodeler: you need to correct the path for RECON in one of its files. And after I changed the -num_threads parameter of blastn from 4 to 30, the run time was cut in half.
          I cannot access my computing server now, maybe I can post more details later.


          • amitbik
            Member
            • May 2013
            • 53

            #6
            Thank you, sunhh, for your reply.

            I have installed repeatmodeler, but when I build the database it shows an error:

            ./BuildDatabase -name test test.fa

            RepModelConfig.pm did not return a true value at ./BuildDatabase line 146.
            BEGIN failed--compilation aborted at ./BuildDatabase line 146.

            One more thing: the RepModelConfig.pm file is empty.

            For ltr_finder I am giving this command and getting output like this:

            ltr_finder -p 30 -w -C file.fa > ltr.fa

            Output:

            Predict protein Domains 0.000 second
            >Sequence: Contig2 Len:9055
            No LTR Retrotransposons Found


            Do I have to give my assembly file directly to repeatmodeler and ltr_finder, or do I have to filter it first?
            Last edited by amitbik; 02-05-2014, 10:21 PM.


            • sunhh
              Member
              • Jun 2012
              • 18

              #7
              Originally posted by amitbik View Post
              Thank you.. sunhh for your reply..

              Actually I have installed repeatmodeler. But when i am building database it is showing error

              ./BuildDatabase -name test test.fa

              RepModelConfig.pm did not return a true value at ./BuildDatabase line 146.
              BEGIN failed--compilation aborted at ./BuildDatabase line 146.

              And one more thing RepModelConfig.pm file is empty.

              Do i have give my assembly file directly in repeatmodeler and ltr_finder or have to process some filteration?
              Hi,
              For building the database, I think you might need to add "-engine ncbi" to the command if your alignment engine is BLAST, as mine is.

              And the error at "line 146" should come from the same RepModelConfig.pm problem.
              That file should not be empty. I advise you to re-download the package and install it again.
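Perl reports "did not return a true value" when a loaded module file does not end in a true value, which is always the case for an empty file. A quick check before re-running BuildDatabase could look like the sketch below; the function name `module_nonempty` is hypothetical, and passing this check only rules out the empty-file case, not other configuration problems.

```shell
# Succeed only if the given file exists and is not empty.
# An empty Perl module cannot end in a true value, which is what
# "did not return a true value" complains about.
module_nonempty() {
    if [ -s "$1" ]; then
        echo "OK: $1 is non-empty"
    else
        echo "PROBLEM: $1 is missing or empty" >&2
        return 1
    fi
}

# Hypothetical usage from the RepeatModeler directory:
#   module_nonempty RepModelConfig.pm
```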


              • sunhh
                Member
                • Jun 2012
                • 18

                #8
                Originally posted by amitbik View Post
                Thank you.. sunhh for your reply..

                Actually I have installed repeatmodeler. But when i am building database it is showing error

                ./BuildDatabase -name test test.fa

                RepModelConfig.pm did not return a true value at ./BuildDatabase line 146.
                BEGIN failed--compilation aborted at ./BuildDatabase line 146.

                And one more thing RepModelConfig.pm file is empty.

                In ltr_finder i am giving this command and i am getting output like this

                ltr_finder -p 30 -w -C file.fa > ltr.fa

                output-

                Predict protein Domains 0.000 second
                >Sequence: Contig2 Len:9055
                No LTR Retrotransposons Found


                Do i have give my assembly file directly in repeatmodeler and ltr_finder or have to process some filteration?
                And for ltr_finder, I used a command like this:
                ltr_finder -w 0 -s ref_tRNAs.fa -a /path/to/ps_scan in_genome.fa 1>in_genome.fa.ltrF 2>in_genome.fa.ltrF.err

                It looks different from yours, especially the "-w 0" parameter. I am not sure what "-C" means.

                Best


                • amitbik
                  Member
                  • May 2013
                  • 53

                  #9
                  Originally posted by sunhh View Post
                  Hi,
                  For building database, I think you might need to add "-engine ncbi" to the command, if your aligning engine is blast as me.

                  And the error "line 146" should be the same problem of RepModelConfig.pm.
                  That file should not be empty. I advise you to re-download the package and install it again.
                  Before I configured Repeatmodeler, the RepModelConfig.pm file was not empty; after I configured Repeatmodeler and the database, the RepModelConfig.pm file became empty. When I start building the database, it shows the error.


                  • sunhh
                    Member
                    • Jun 2012
                    • 18

                    #10
                    Originally posted by amitbik View Post
                    Before configure Repeatmodeler the RepModelConfig.pm file was not empty after i configure the Repeatemodeler and database the RepModelConfig.pm file became empty. When i start building the data base it is showing error.
                    Please redo the configuration of Repeatmodeler. And record everything this time.


                    • amitbik
                      Member
                      • May 2013
                      • 53

                      #11
                      Originally posted by sunhh View Post
                      And for ltr_finder, I used a command like this:
                      ltr_finder -w 0 -s ref_tRNAs.fa -a /path/to/ps_scan in_genome.fa 1>in_genome.fa.ltrF 2>in_genome.fa.ltrF.err

                      It looks different from yours, especially "-w 0" parameter. I am not sure what "-C" means.

                      Best
                      By mistake I left out the 0 in my command, and "-C" is for deleting highly repetitive regions.
                      Can you tell me about the 3 files in your command: in_genome.fa, in_genome.fa.ltrF and in_genome.fa.ltrF.err?
                      What are these files?


                      • sunhh
                        Member
                        • Jun 2012
                        • 18

                        #12
                        Originally posted by amitbik View Post
                        By mistake i didn't put 0 in my command and "-C" is for delete highly repeat regions.
                        Can tell me you have given 3 files in_genome.fa, in_genome.fa.ltrF and in_genome.fa.ltrF.err
                        what are these files?
                        Only in_genome.fa is an input file, and the rest are output files.
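The two output files come from shell redirection rather than from the program itself: "1>" sends standard output to one file and "2>" sends standard error to another. The sketch below illustrates this with a stand-in function; `fake_tool` and `run_split` are hypothetical names and only imitate the shape of the command in the post.

```shell
# Stand-in for a program that writes results to stdout and messages to stderr.
# (Hypothetical; it only imitates the shape of a tool's output.)
fake_tool() {
    echo "result line for $1"
    echo "progress message for $1" >&2
}

# Run the stand-in with stdout and stderr sent to separate files,
# following the naming pattern in the post: <input>.ltrF and <input>.ltrF.err
run_split() {
    input=$1
    fake_tool "$input" 1>"$input.ltrF" 2>"$input.ltrF.err"
}

# Hypothetical usage:
#   run_split in_genome.fa
```

With a plain ">" and no "2>", only standard output is captured and error messages still go to the terminal.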


                        • amitbik
                          Member
                          • May 2013
                          • 53

                          #13
                          Thanks, sunhh, for your help.

                          My Repeatmodeler is working now, and I can build the database. This time I ran Repeatmodeler from a different path and changed the paths of RECON, RepeatScout, etc., and it works.


                          • amitbik
                            Member
                            • May 2013
                            • 53

                            #14
                            Hi sunhh,

                            I have a problem with ltr_finder. I am using this command:

                            ltr_finder -w 0 -s trna.fa -a ./ps_scan/ uni.fa > uni_ltr.txt

                            It ran for around 16 hours, and the two files uni.fa.ltrf and uni.fa.ltrf.err are empty. It also showed an error: "cannot find resonable bandwith: continue anyway".

                            Can you tell me why this error occurred and why the two files are empty?

                            Thank you...


                            • amitbik
                              Member
                              • May 2013
                              • 53

                              #15
                              Can anyone help me find the cause of this error?
