  • nilshomer
    Nils Homer
    • Nov 2008
    • 1283

    #16
    Originally posted by lcollado:
    Hello,

    Does the git command above still work? I've tried it a few times today with no luck:

    Code:
    $ git clone git://dnaa.git.sourceforge.net/gitroot/dnaa/dnaa
    Initialized empty Git repository in /[my local path]/dnaa/.git/
    dnaa.git.sourceforge.net[0: 216.34.181.91]: errno=Connection timed out
    fatal: unable to connect a socket (Connection timed out)
    Thanks,
    Leonardo

    PS I'll try later from home as I guess that it could be a local network issue.
    Just tested, it works.

    Comment

    • kbushley
      Member
      • Jan 2010
      • 22

      #17
      Simulating longer PE Illumina reads

      Hi. I finally found this after searching a bit. The MetaSim website has two configuration files for 60 and 80 bp PE Illumina reads. They contain all the parameters for the Illumina PE error models, and you can load them as a configuration file. Hope that helps.


      kathryn

      Comment

      • KevinLam
        Senior Member
        • Nov 2009
        • 204

        #18
        Originally posted by nilshomer:
        Ah, get the source code via git, as there is no release yet:
        Code:
        git clone git://dnaa.git.sourceforge.net/gitroot/dnaa/dnaa
        Nils
        Is it missing some files?
        I just did a git clone,
        but I can't run configure / make:


        $ ./configure
        bash: ./configure: No such file or directory
        http://kevin-gattaca.blogspot.com/

        Comment

        • nilshomer
          Nils Homer
          • Nov 2008
          • 1283

          #19
          Originally posted by KevinLam:
          Is it missing some files?
          I just did a git clone,
          but I can't run configure / make:


          $ ./configure
          bash: ./configure: No such file or directory
          Try this before running configure:
          Code:
          sh autogen.sh
          I have updated the INSTALL to include this step. Thanks for spotting the poor documentation.

          Comment

          • KevinLam
            Senior Member
            • Nov 2009
            • 204

            #20
            Thanks Nils!
            Actually, there's only one shell script, so it was quite evident (my bad).
            Anyway, I ran that:

            Code:
            sh autogen.sh 
            Preparing the dnaa build system...please wait
            
            ERROR:  Unable to locate GNU Autoconf.
            
            ERROR:  To prepare the dnaa build system from scratch,
                    at least version 2.52 of GNU Autoconf must be installed.
            
            
            autogen.sh does not need to be run on the same machine that will
            run configure or make.  Either the GNU Autotools will need to be installed
            or upgraded on this system, or autogen.sh must be run on the source
            code on another system and then transferred to here. -- Cheers!

            Is it possible for you to include the autoconf files?
            I do not have Autoconf installed on my system.
            http://kevin-gattaca.blogspot.com/

            Comment

            • nilshomer
              Nils Homer
              • Nov 2008
              • 1283

              #21
              Originally posted by KevinLam:
              Thanks Nils!
              Actually, there's only one shell script, so it was quite evident (my bad).
              Anyway, I ran that:

              Code:
              sh autogen.sh 
              Preparing the dnaa build system...please wait
              
              ERROR:  Unable to locate GNU Autoconf.
              
              ERROR:  To prepare the dnaa build system from scratch,
                      at least version 2.52 of GNU Autoconf must be installed.
              
              
              autogen.sh does not need to be run on the same machine that will
              run configure or make.  Either the GNU Autotools will need to be installed
              or upgraded on this system, or autogen.sh must be run on the source
              code on another system and then transferred to here. -- Cheers!

              Is it possible for you to include the autoconf files?
              I do not have Autoconf installed on my system.
              It is probably not the best idea to include autoconf with the source code. If you mean the ./configure script, it will be included in the releases (there is no release yet). For now, you will have to either install the appropriate autoconf version or PM/email me, and I would be happy to send you a tarball.

              Comment

              • plichel
                Junior Member
                • Mar 2010
                • 9

                #22
                So as not to start a new thread: does anybody know of a read simulator that can also sample known SNPs from, say, a dbSNP file or similar?
                Thanks!

                Comment

                • srividya
                  Junior Member
                  • Sep 2010
                  • 6

                  #23
                  wgsim

                  Hello,

                  I am using wgsim to generate simulated reads of length 76 bp (Solexa).

                  Is the FASTQ that it generates Solexa FASTQ or Sanger FASTQ? Since there are no options to specify the FASTQ type, I assumed it is Sanger. Is that correct?

                  Thanks,
                  Srividya

                  Comment

                  • lcollado
                    Member
                    • Jun 2009
                    • 65

                    #24
                    Hello srividya,

                    I don't know the answer, but you can find out using the ASCII table: http://es.wikipedia.org/wiki/ASCII

                    Solexa FASTQ (>= 1.3) won't have any quality characters with ASCII values below 64. This means that digits (48 to 57 in decimal ASCII) shouldn't appear in the quality lines of your FASTQ file.
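
The ASCII check described above can be sketched in Python. This is only a minimal sketch and is not part of wgsim or any other tool in this thread: the function name `guess_quality_encoding` and its return labels are hypothetical, and the extra threshold of 59 assumes the older Solexa scale, whose scores go down to -5 at an offset of 64.

```python
def guess_quality_encoding(quality_lines):
    """Guess a FASTQ quality encoding from the lowest ASCII code seen.

    quality_lines: iterable of quality strings (every 4th line of a FASTQ file).
    The labels returned here are hypothetical names chosen for illustration.
    """
    codes = [ord(ch) for line in quality_lines for ch in line.strip()]
    if not codes:
        raise ValueError("no quality characters given")
    lowest = min(codes)
    if lowest < 59:
        # Characters such as digits (48 to 57) cannot occur with an offset of 64.
        return "phred+33"
    if lowest < 64:
        # Assumption: only the older Solexa scale reaches codes 59 to 63.
        return "solexa+64"
    # High-quality offset-33 data can also look like this, so it stays ambiguous.
    return "phred+64 or ambiguous"


def quality_lines_from_fastq(lines):
    """Return the quality line of each 4-line FASTQ record."""
    return [line.rstrip("\n") for line in lines][3::4]
```

A file whose lowest quality character is 64 or above cannot be classified by this test alone, which is why the last branch reports the result as ambiguous.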

                    Greetings,
                    Leonardo
                    L. Collado Torres, Ph.D. student in Biostatistics.

                    Comment

                    • srividya
                      Junior Member
                      • Sep 2010
                      • 6

                      #25
                      Hello,

                      Thanks for the reply. All the quality characters in the reads were 2, a digit, so I can now safely consider them to be Sanger.
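
Assuming the value 2 above refers to the quality character `2` (ASCII 50) in the FASTQ quality lines, the arithmetic behind that conclusion can be checked with a short sketch. The helper name `phred_from_char` is hypothetical.

```python
def phred_from_char(ch, offset):
    """Convert one FASTQ quality character to a score for a given ASCII offset."""
    return ord(ch) - offset


# ord("2") is 50.
sanger_score = phred_from_char("2", 33)    # 50 - 33 = 17, a valid Phred score
illumina_score = phred_from_char("2", 64)  # 50 - 64 = -14, not a valid Phred score
```

Only the offset of 33 gives a usable score for this character, which is consistent with reading the file as Sanger FASTQ.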

                      Thanks,
                      Srividya

                      Comment

                      • lcollado
                        Member
                        • Jun 2009
                        • 65

                        #26
                        No problem, and I'm glad you were able to answer your question.

                        Leo
                        L. Collado Torres, Ph.D. student in Biostatistics.

                        Comment

                        • tldgID
                          Member
                          • May 2011
                          • 18

                          #27
                          Questions regarding synthetic data generation

                          Hi all,

                          I found this thread about generating synthetic reads for the Illumina platform, and since I need to generate such synthetic data, I am posting my questions here (as opposed to creating a new thread!).

                          1) Is it possible to generate SE reads rather than PE reads?

                          2) Does anyone know the advantages/disadvantages of “wgsim” from SAMtools vs. “dwgsim” from the DNAA package? What has been modified in dwgsim? It is not very clear to me, since the README file of the DNAA package says:
                          “This is a fork of the SAMtools wgsim, since certain assumptions are made that we do not agree with.”
                          What are these assumptions? What has been modified? Is there any publication that elaborates on these issues?

                          3) Is there any statistical consideration involved in the generation of the reads? For example, do larger genes on the genome get more reads? Is there any distribution-related consideration when shearing the reference genome? Are the errors distributed uniformly in both programs?

                          4) Any other recommendations for synthetic data generation?

                          Thank you for any help in advance

                          Comment

                          • nilshomer
                            Nils Homer
                            • Nov 2008
                            • 1283

                            #28
                            Originally posted by tldgID:
                            Hi all,

                            I found this thread about generating synthetic reads for the Illumina platform, and since I need to generate such synthetic data, I am posting my questions here (as opposed to creating a new thread!).

                            1) Is it possible to generate SE reads rather than PE reads?

                            2) Does anyone know the advantages/disadvantages of “wgsim” from SAMtools vs. “dwgsim” from the DNAA package? What has been modified in dwgsim? It is not very clear to me, since the README file of the DNAA package says:
                            “This is a fork of the SAMtools wgsim, since certain assumptions are made that we do not agree with.”
                            What are these assumptions? What has been modified? Is there any publication that elaborates on these issues?

                            3) Is there any statistical consideration involved in the generation of the reads? For example, do larger genes on the genome get more reads? Is there any distribution-related consideration when shearing the reference genome? Are the errors distributed uniformly in both programs?

                            4) Any other recommendations for synthetic data generation?

                            Thank you for any help in advance
                            1) Yes, specify "-2 0".
                            2) The fork was done to provide better color space (SOLiD) support, in particular to include the first color and adapter.
                            3) Random read placement, errors distributed according to the error rate.

                            Comment

                            • tldgID
                              Member
                              • May 2011
                              • 18

                              #29
                              Originally posted by nilshomer:
                              1) Yes, specify "-2 0".
                              2) The fork was done to provide better color space (SOLiD) support, in particular to include the first color and adapter.
                              3) Random read placement, errors distributed according to the error rate.

                              Thank you Nils!

                              About Q2: so, if I need Illumina-like synthetic data, it won't make a difference whether I use “wgsim” or “dwgsim”?

                              About Q3: can you elaborate on “Random read placement”? My understanding is that the error rate is pre-specified, and that when the reads are generated, the nucleotide at each position can be changed according to the error rate. Is this related to “Random read placement”, or did you mean something else?

                              Thanks again

                              Comment

                              • nilshomer
                                Nils Homer
                                • Nov 2008
                                • 1283

                                #30
                                Q2: there are a number of differences, including left-justification of indels and small bug fixes. You will notice differences, and I encourage you to test both, as I cannot predict all the differences.

                                Q3: a read's start position is randomly drawn from all possible start positions. Random errors are then introduced according to the per-base error rate.
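
The two steps described in Q3 can be illustrated with a small Python sketch. It is not the wgsim or dwgsim implementation: the function name `simulate_reads` and its parameters are hypothetical, and the sketch assumes single-end reads from the forward strand with substitution errors only (no indels, no quality values).

```python
import random


def simulate_reads(reference, read_length, n_reads, error_rate, seed=None):
    """Draw reads uniformly from a reference and add per-base substitution errors.

    Returns a list of (start, read) tuples. Illustrative sketch only.
    """
    if read_length > len(reference):
        raise ValueError("read_length exceeds reference length")
    rng = random.Random(seed)
    n_starts = len(reference) - read_length + 1  # all possible start positions
    reads = []
    for _ in range(n_reads):
        # Step 1: the start position is drawn uniformly from all valid starts.
        start = rng.randrange(n_starts)
        bases = list(reference[start:start + read_length])
        # Step 2: each base is independently replaced with probability error_rate.
        for i, base in enumerate(bases):
            if rng.random() < error_rate:
                bases[i] = rng.choice([b for b in "ACGT" if b != base])
        reads.append((start, "".join(bases)))
    return reads


example_reads = simulate_reads("ACGTACGTACGTACGTACGT", read_length=8,
                               n_reads=5, error_rate=0.02, seed=1)
```

Because each start position is equally likely, longer regions of the reference receive proportionally more reads simply because they contain more start positions.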

                                Comment
