  • #16
    Originally posted by lcollado View Post
    Hello,

    Does the git command above still work? I've tried it a few times today with no luck:

    Code:
    $ git clone git://dnaa.git.sourceforge.net/gitroot/dnaa/dnaa
    Initialized empty Git repository in /[my local path]/dnaa/.git/
    dnaa.git.sourceforge.net[0: 216.34.181.91]: errno=Connection timed out
    fatal: unable to connect a socket (Connection timed out)
    Thanks,
    Leonardo

    PS I'll try later from home as I guess that it could be a local network issue.
    Just tested, it works.

    Comment


    • #17
      Simulating longer PE Illumina reads,

      Hi. I finally found this after some searching: the MetaSim website has two configuration files, for 60 and 80 bp PE Illumina reads. They contain all the parameters for the Illumina PE error models, and you can load either one as a configuration file. Hope that helps.


      kathryn

      Comment


      • #18
        Originally posted by nilshomer View Post
        Ah, get the source code via git as there is no release yet:
        Code:
        git clone git://dnaa.git.sourceforge.net/gitroot/dnaa/dnaa
        Nils
        is it missing some files?
        Just did a git clone
        but I can't configure / make


        $ ./configure
        bash: ./configure: No such file or directory
        http://kevin-gattaca.blogspot.com/

        Comment


        • #19
          Originally posted by KevinLam View Post
          is it missing some files?
          Just did a git clone
          but I can't configure / make


          $ ./configure
          bash: ./configure: No such file or directory
          try this before configure:
          Code:
          sh autogen.sh
          I have updated the INSTALL to include this step. Thanks for spotting the poor documentation.

          Comment


          • #20
            Thanks Nils!
            Actually there's only one shell script so it's quite evident (my bad)
            anyway i ran that

            Code:
            sh autogen.sh 
            Preparing the dnaa build system...please wait
            
            ERROR:  Unable to locate GNU Autoconf.
            
            ERROR:  To prepare the dnaa build system from scratch,
                    at least version 2.52 of GNU Autoconf must be installed.
            
            
            autogen.sh does not need to be run on the same machine that will
            run configure or make.  Either the GNU Autotools will need to be installed
            or upgraded on this system, or autogen.sh must be run on the source
            code on another system and then transferred to here. -- Cheers!

            is it possible for you to include the autoconf files?
            I do not have that installed on my system
            http://kevin-gattaca.blogspot.com/

            Comment


            • #21
              Originally posted by KevinLam View Post
              Thanks Nils!
              Actually there's only one shell script so it's quite evident (my bad)
              anyway i ran that

              Code:
              sh autogen.sh 
              Preparing the dnaa build system...please wait
              
              ERROR:  Unable to locate GNU Autoconf.
              
              ERROR:  To prepare the dnaa build system from scratch,
                      at least version 2.52 of GNU Autoconf must be installed.
              
              
              autogen.sh does not need to be run on the same machine that will
              run configure or make.  Either the GNU Autotools will need to be installed
              or upgraded on this system, or autogen.sh must be run on the source
              code on another system and then transferred to here. -- Cheers!

              is it possible for you to include the autoconf files?
              I do not have that installed on my system
              Probably not the best idea to include autoconf with source code. If you mean the ./configure script, then it will be included in the releases (no release yet). You will have to either install the appropriate autoconf version or you can PM/email me and I would be happy to send you a tar-ball.

              Comment


              • #22
                So as not to start a new thread: does anybody know of a read simulator that can also sample known SNPs from, say, a dbSNP file or similar?
                Thanks !

                Comment


                • #23
                  wgsim

                  Hello,

                  I am using wgsim to generate simulated reads of 76 bp length (Solexa).

                  Is the FASTQ that is generated Solexa FASTQ or Sanger FASTQ? Since there is no option to specify the FASTQ type, I assumed it is Sanger. Is that correct?

                  Thanks,
                  Srividya

                  Comment


                  • #24
                    Hello srividya,

                    I don't know the answer, but you can find out using the ASCII table: http://es.wikipedia.org/wiki/ASCII

                    Solexa/Illumina FASTQ (pipeline version >= 1.3) won't have any quality characters with ASCII values below 64. That means digits (48 to 57 in decimal ASCII) shouldn't appear in the quality lines of your FASTQ file.
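The ASCII check described above can be sketched in a few lines of Python. This is only an illustration, not part of wgsim or any other tool: the function names `guess_fastq_offset` and `phred_scores` are made up for this example. The cutoff of 59 is an assumption that also covers older Solexa files, whose quality characters can go slightly below 64; any character below that cutoff can only come from the Sanger (+33) encoding.

```python
def guess_fastq_offset(quality_lines):
    """Guess the ASCII offset (33 or 64) from FASTQ quality strings.

    Hypothetical helper for illustration. Raises ValueError if no
    quality characters are supplied.
    """
    lowest = min(ord(ch) for line in quality_lines for ch in line.strip())
    # Characters below ';' (ASCII 59) cannot occur in the +64 encodings,
    # so seeing one implies the Sanger (+33) encoding.
    if lowest < 59:
        return 33
    # Not conclusive: Sanger data with only high qualities also lands here.
    return 64


def phred_scores(quality_line, offset):
    """Convert one quality string to integer scores for a given offset."""
    return [ord(ch) - offset for ch in quality_line.strip()]
```

For example, a quality line made up entirely of the character `2` (ASCII 50) falls below the cutoff, so the guess is offset 33, and each base then decodes to a score of 50 - 33 = 17.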

                    Greetings,
                    Leonardo
                    L. Collado Torres, Ph.D. student in Biostatistics.

                    Comment


                    • #25
                      Hello,

                      Thanks for the reply. All the reads had the quality character "2" (ASCII 50) at every base, so I can safely consider them to be Sanger.

                      Thanks,
                      Srividya

                      Comment


                      • #26
                        No problem, and I'm glad you were able to answer your question.

                        Leo
                        L. Collado Torres, Ph.D. student in Biostatistics.

                        Comment


                        • #27
                          Questions regarding synthetic data generation

                          Hi all,

                          I found this thread about generating synthetic reads for the Illumina platform, and since I need to generate such synthetic data, I am posting my questions here (as opposed to creating a new thread!).

                          1) Is it possible to generate SE reads rather than PE?

                          2) Does anyone know the advantages/disadvantages of “wgsim” from SAMtools vs. “dwgsim” from the DNAA package? What has been modified in dwgsim? It is not very clear to me, since the README file of the DNAA package says:
                          “This is a fork of the SAMtools wgsim, since certain assumptions are made that we do not agree with.”
                          What are these assumptions? What has been modified? Is there any publication that elaborates on these issues?

                          3) Is there any statistical consideration involved in the generation of the reads? E.g., do larger genes on the genome get more reads? Is there any distribution-related consideration when shearing the reference genome? Are the errors distributed uniformly in both programs?

                          4) any other recommendations for synthetic data generation?

                          Thank you for any help in advance

                          Comment


                          • #28
                            Originally posted by tldgID View Post
                            Hi all,

                            I found this thread about generating synthetic reads for the Illumina platform, and since I need to generate such synthetic data, I am posting my questions here (as opposed to creating a new thread!).

                            1) Is it possible to generate SE reads rather than PE?

                            2) Does anyone know the advantages/disadvantages of “wgsim” from SAMtools vs. “dwgsim” from the DNAA package? What has been modified in dwgsim? It is not very clear to me, since the README file of the DNAA package says:
                            “This is a fork of the SAMtools wgsim, since certain assumptions are made that we do not agree with.”
                            What are these assumptions? What has been modified? Is there any publication that elaborates on these issues?

                            3) Is there any statistical consideration involved in the generation of the reads? E.g., do larger genes on the genome get more reads? Is there any distribution-related consideration when shearing the reference genome? Are the errors distributed uniformly in both programs?

                            4) any other recommendations for synthetic data generation?

                            Thank you for any help in advance
                            1) Yes, specify "-2 0".
                            2) The fork was done to provide better color space (SOLiD) support, in particular to include the first color and adapter.
                            3) Random read placement, errors distributed according to the error rate.

                            Comment


                            • #29
                              Originally posted by nilshomer View Post
                              1) Yes, specify "-2 0".
                              2) The fork was done to provide better color space (SOLiD) support, in particular to include the first color and adapter.
                              3) Random read placement, errors distributed according to the error rate.

                              Thank you Nils!

                              About Q2: so, if I need Illumina-like synthetic data, it won't make a difference to use “wgsim” or “dwgsim”?

                              About Q3: can you elaborate on “Random read placement”? My understanding is that the error rate is pre-specified, and when the reads are generated, the nt at each position can be changed according to that error rate. Is this related to “Random read placement”, or did you mean something else?

                              Thanks again

                              Comment


                              • #30
                                Q2: there are a number of differences, including left-justification of indels and small bug fixes. You will notice differences, and I encourage you to test both, as I cannot predict all of them.

                                Q3: a read's start position is randomly drawn from all possible start positions. Random errors are then introduced according to the per-base error rate.
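The two steps in the Q3 answer (draw a start position uniformly at random, then introduce errors at a per-base rate) can be sketched as follows. This is a simplified illustration and not the actual wgsim or dwgsim code: the function `simulate_reads` and its parameters are hypothetical, and it ignores strand, paired ends, indels, and simulated variants.

```python
import random


def simulate_reads(reference, read_length, n_reads, error_rate, seed=None):
    """Return a list of (start, read) tuples sampled from `reference`.

    Hypothetical sketch: single-end, forward strand, substitutions only.
    """
    rng = random.Random(seed)
    n_starts = len(reference) - read_length + 1
    if n_starts <= 0:
        raise ValueError("read_length exceeds reference length")
    reads = []
    for _ in range(n_reads):
        # Every possible start position is equally likely.
        start = rng.randrange(n_starts)
        bases = list(reference[start:start + read_length])
        for i, base in enumerate(bases):
            # Each base is independently replaced with probability error_rate.
            if rng.random() < error_rate:
                bases[i] = rng.choice([b for b in "ACGT" if b != base])
        reads.append((start, "".join(bases)))
    return reads
```

Because starts are uniform over the reference, a longer region collects more reads simply in proportion to its length; no gene-level weighting is applied in this sketch.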

                                Comment
