  • Could anyone test the tool I've been developing and report any problems?

    Hi,

    Since 2013 I've been developing a sequencing error correction tool for Illumina datasets, named trowel. I have put statically linked binaries for 64-bit Linux in the repository.
    Could anyone let me know about any problems you run into when executing the binaries? The current code works in our cluster setting, but I would like to collect problems from different server configurations and improve the tool's compatibility. The source code is not public yet because parts of it contain unpublished ideas.
    Thanks in advance.

    Sincerely,
    Euncheon Lim
    =============================
    PhD Student

    Max Planck Institute for Developmental Biology
    Department of Molecular Biology
    Spemannstraße 37-39,
    D-72076 Tuebingen, Germany
    Last edited by abysslover; 06-21-2015, 11:04 AM. Reason: wrong wording

  • #2
    I'm working on error correction at the moment, so I'll try it out tomorrow.


    • #3
      Linux Mint 17 Qiana 3.13.0-24-generic #47-Ubuntu SMP Fri May 2 23:30:00 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

      Seems to work okay.

      Had to chmod +x to run.


      You should move to GitHub.


      • #4
        Thanks for your time.
        Following Mr. Finney's comment, I have added a note about the chmod command to the README file.
        I am looking forward to seeing more comments.
        Last edited by abysslover; 06-22-2015, 02:33 AM.


        • #5
          My results:

          Code:
          bushnell@gpint109:/global/scratch2/sd/bushnell/ecoli$ ./trowel
          Trowel: An error correction module for genomic FASTQ files(Ver. 0.2.0.0)
          =========================================================================
          Syntax:    trowel2 -f <file_list> [-k] [-m]
          Options:   -f STR  a single list file that contains a list of FASTQ files
                     -k INT  size of pre-mer [DEFAULT: 11] (11-15)
                     -m      fast less accurate mode [DEFAULT: slow accurate mode]
          
          Example of a file_list file:
                     a.thal.chr1.fq a.thal.chr2.fq a.thal.chr3.fq ...
                     h.sap.chr1.fq h.sap.chr2.fq h.sap.chr3.fq ...
                                  ...
          
          Note:      1. trowel only supports FASTQ files.
                     2. For datasets of high coverage or of large genome, you should use k of 13-15.
                     3. You have to run the following commands before running trowel
                        sudo ulimit -v unlimited
                        sudo ulimit -n 8192
          
          Contact:   Euncheon Lim <[email protected]>
          
          bushnell@gpint109:/global/scratch2/sd/bushnell/ecoli$ sudo ulimit -v unlimited
          -bash: sudo: command not found
          bushnell@gpint109:/global/scratch2/sd/bushnell/ecoli$ sudo ulimit -n 8192
          -bash: sudo: command not found
          bushnell@gpint109:/global/scratch2/sd/bushnell/ecoli$ memtime ./trowel -f 1snp.fq.gz
          Killed [4]
          0.00 user, 0.01 system, 1.40 elapsed -- Max VSize = 3724KB, Max RSS = 588KB
          Is sudo a default component of Debian?


          • #6
            Thanks, Mr. Bushnell.
            The maximum number of open files does need to be increased before running trowel. I think you specified a root password during installation, which is why your package configuration does not include the "sudo" command.
            Trowel does not support compressed files, so that it can read the input in parallel. The -f parameter expects a list file containing FASTQ file names, not a FASTQ file itself.
            e.g.)
            echo 1snp.fq > file_list && trowel -f file_list

            Thanks for testing.
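
The preparation steps mentioned above can be sketched roughly as follows. This is only an illustration: `reads.fq` and `file_list` are placeholder names, and `ulimit` is called directly because it is a shell builtin, so it cannot be run through sudo. Whether the limit can actually be raised depends on the hard limit configured on the machine.

```shell
# Sketch only: reads.fq and file_list are placeholder names.

# Try to raise the open-file limit for this shell session (no sudo needed,
# ulimit is a shell builtin). This fails if the hard limit is lower.
ulimit -n 8192 2>/dev/null || echo "could not raise open-file limit (now: $(ulimit -n))"

# trowel reads uncompressed FASTQ, so decompress first if only a .gz exists.
if [ -f reads.fq.gz ] && [ ! -f reads.fq ]; then
    gunzip -c reads.fq.gz > reads.fq
fi

# -f expects a list file that names the FASTQ files, not the reads themselves.
echo reads.fq > file_list

# Run trowel only if the binary is present in the current directory.
if [ -x ./trowel ]; then
    ./trowel -f file_list
fi
```

The limit set this way applies only to the current shell and the processes started from it.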


            • #7
              Oops, my mistake. There's still some problem, though:

              Code:
              bushnell@mc1357:/global/scratch2/sd/bushnell/ecoli$ gunzip 1snp.fq.gz
              bushnell@mc1357:/global/scratch2/sd/bushnell/ecoli$ ./trowel -f 1snp.fq
              Illegal instruction (core dumped)
              bushnell@mc1357:/global/scratch2/sd/bushnell/ecoli$ echo 1snp.fq > file_list && ./trowel -f file_list
              Illegal instruction (core dumped)
              Note that this is from the binary I downloaded; I did not rebuild it. Should I rebuild it? It works fine for displaying help.


              • #8
                Thanks, Mr. Bushnell,
                I have encountered this problem before. It was a compiler version problem.
                Could you try again with the "trowel.0.2.0.2.linux.64" binary in the bin folder?
                Last edited by abysslover; 06-22-2015, 01:59 PM.


                • #9
                  Same error still with 0.2.0.2.


                  • #10
                    I have overwritten the binary with one built on Debian.
                    Thanks for testing, Mr. Bushnell.
                    Last edited by abysslover; 06-22-2015, 07:07 PM.


                    • #11
                      OK - the new version completed without crashing, but did not produce any output.

                      Code:
                      [BlockReader.set_k] K-mer size: 11
                      [BlockReader.calculate_all_file_sizes] Total: 0 bytes
                      [BlockReader.calculate_all_file_sizes] Ends at 2015-06-23.10:11:01, Real: 0.016 sec, CPU: 0.006 sec, RSS: 0.645(GB)/447.430(GB), CPU: 32
                      [BlockReader.create_all_blocks] starts at 2015-06-23.10:11:01
                      [BlockReader.create_all_blocks] Ends at 2015-06-23.10:11:01, Real: 0.000 sec, CPU: 0.000 sec, RSS: 0.645(GB)/447.430(GB), CPU: 32
                      [BlockReader.count_all_n_reads] starts at 2015-06-23.10:11:02
                      [BlockReader.count_all_n_reads] Ends at 2015-06-23.10:11:02, Real: 0.000 sec, CPU: 0.000 sec, RSS: 0.645(GB)/447.430(GB), CPU: 32
                      [BlockReader.create_all_read_ids] starts at 2015-06-23.10:11:02
                      [BlockReader.create_all_read_ids] Ends at 2015-06-23.10:11:02, Real: 0.001 sec, CPU: 0.003 sec, RSS: 0.645(GB)/447.430(GB), CPU: 32
                      [Correction.Main] Ends at 2015-06-23.10:11:02, Real: 58.783 sec, CPU: 57.369 sec, RSS: 0.645(GB)/447.430(GB), CPU: 32
                      Exit [0]
                      6.78 user, 54.93 system, 63.26 elapsed -- Max VSize = 744972KB, Max RSS = 675956KB


                      • #12
                        I am unsure why the total file size is zero for your inputs. Did you use regular uncompressed FASTQ files?
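
One way to narrow down a "Total: 0 bytes" report is to check what the list file actually points to. The sketch below assumes, based on the usage text, that the list file holds whitespace-separated FASTQ file names. How trowel itself resolves those paths is not known here, and `check_file_list` is a hypothetical helper name.

```shell
# Hypothetical helper: print the size in bytes of every file named in a list file.
check_file_list() {
    list="$1"
    total=0
    # Word-split the list content so space- and newline-separated names both work
    # (file names containing spaces are not handled by this sketch).
    for f in $(cat "$list"); do
        if [ -f "$f" ]; then
            # Arithmetic expansion strips any padding that wc may add.
            size=$(( $(wc -c < "$f") ))
            echo "$f: $size bytes"
            total=$((total + size))
        else
            echo "$f: MISSING"
        fi
    done
    echo "Total: $total bytes"
}
```

Calling `check_file_list file_list` from the directory where trowel is started shows whether each entry exists there and is non-empty.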


                        • #13
                          Yes... I used a 418MB fastq file with 1.2Mx150bp reads. They are synthetic reads, so the headers are not normal Illumina headers, but it's a perfectly valid fastq file. Does the tool need any specific characteristics of the input data?
                          Last edited by Brian Bushnell; 06-23-2015, 11:02 AM.


                          • #14
                            I see the problem. Trowel does not officially support FASTA files.


                            • #15
                              Sorry, that was a typo on my part - it's actually fastq. Perhaps you could post a very small sample fastq that works correctly on your system?
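
For reference, a very small FASTQ test file could be generated and sanity-checked along the lines below. The read names and sequences are made up for illustration, and `check_fastq` is a hypothetical helper. Passing this check says nothing about whether trowel accepts the file; it only confirms the basic four-line record layout.

```shell
# Write two made-up 12 bp reads in four-line FASTQ layout.
cat > tiny.fq <<'EOF'
@read1
ACGTACGTACGT
+
IIIIIIIIIIII
@read2
TTGCATGCAAGT
+
IIIIIIIIIIII
EOF

# Hypothetical check: headers start with '@', separators with '+',
# sequence and quality lines have equal length, and records are complete.
check_fastq() {
    awk '
        NR % 4 == 1 && substr($0, 1, 1) != "@" { bad = 1 }
        NR % 4 == 2 { seqlen = length($0) }
        NR % 4 == 3 && substr($0, 1, 1) != "+" { bad = 1 }
        NR % 4 == 0 && length($0) != seqlen { bad = 1 }
        END { if (NR == 0 || NR % 4 != 0) bad = 1; exit (bad ? 1 : 0) }
    ' "$1"
}

check_fastq tiny.fq && echo "tiny.fq has a valid four-line FASTQ layout"
```

A file this small is mainly useful for checking that input is being read at all, since a byte count of zero would then point at file handling rather than at the data.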
