This topic is closed.

  • #31
    Originally posted by NextGenSeb
    Hi Nils,
    I just stumbled across dwgsim in my search for a read simulator, and I love the fact that I can generate Illumina, SOLiD and Ion Torrent reads with the same tool, as I have data from all platforms. Nice touch as well on including the Galaxy wrappers.
    At the risk of sounding ungrateful, is there an ETA on the dwgsim_pileup_eval.pl? You'd solve all my problems in one go.
    Thanks for sharing,
    Cheers
    Seb
    No, I haven't thought about when/if I am going to update it. It works fine on samtools pileup data, but now needs VCF support. I'd also like to rewrite it in Python to improve maintainability. Nevertheless, I am always happy to accept patches or contributions, as your free time is as free as mine.



    • #32
      Fair enough, maybe that gives me the motivation to finally improve my almost non-existent Python skills.

      Got a few more questions for you though:

      1. The "bed-like" format for candidate mutations (-b option): what is that supposed to look like? From the header files the first three columns are pretty clear, but I am stuck with regard to the mutation type and the rest of the file.

      2. Still with regard to mutations, is there a way to simulate specific mutation frequencies? I would like to test aligners and variant callers for somatic mutations and hence need a degree of frequency control over the inserted mutations.

      3. The IonTorrent homopolymer errors (elegedly) got better with the new chemistry. Is there a way to potentially adjust the error frequencies in the program for this?

      Thanks for the quick reply and your comments.

      Cheers
      Seb



      • #33
        Originally posted by NextGenSeb
        Fair enough, maybe that gives me the motivation to finally improve my almost non-existent Python skills.

        Got a few more questions for you though:
        I found those skills come in handy!

        Originally posted by NextGenSeb
        1. The "bed-like" format for candidate mutations (-b option): what is that supposed to look like? From the header files the first three columns are pretty clear, but I am stuck with regard to the mutation type and the rest of the file.
        Use the VCF support in the latest Git code. If you still want to use the bed-like format, the source code is the documentation!

        Originally posted by NextGenSeb
        2. Still with regard to mutations, is there a way to simulate specific mutation frequencies? I would like to test aligners and variant callers for somatic mutations and hence need a degree of frequency control over the inserted mutations.
        None yet, but I don't see why that couldn't be added (search for instances of 0.5 in the code).
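As a rough illustration of what frequency control could look like, here is a minimal Python sketch. It is not dwgsim code and does not reflect how dwgsim is implemented; the function names and the `alt_fraction` parameter are hypothetical. The idea is simply that the hard-coded 0.5 (an even split between the two alleles of a heterozygous site) becomes a tunable probability.

```python
import random


def sample_read_alleles(n_reads, alt_fraction, seed=None):
    """Assign each simulated read to the 'alt' or 'ref' allele.

    alt_fraction is a hypothetical knob: 0.5 gives an even heterozygous
    split, while lower values mimic a subclonal (somatic) variant.
    """
    if not 0.0 <= alt_fraction <= 1.0:
        raise ValueError("alt_fraction must be between 0 and 1")
    rng = random.Random(seed)
    return ["alt" if rng.random() < alt_fraction else "ref"
            for _ in range(n_reads)]


def observed_alt_fraction(alleles):
    """Fraction of reads that carry the alternate allele."""
    if not alleles:
        raise ValueError("no reads given")
    return alleles.count("alt") / len(alleles)


if __name__ == "__main__":
    reads = sample_read_alleles(10000, alt_fraction=0.1, seed=1)
    print(round(observed_alt_fraction(reads), 3))
```

With enough reads covering a site, the observed fraction should land close to the requested one, which is the property a somatic-caller benchmark would rely on.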

        Originally posted by NextGenSeb
        3. The IonTorrent homopolymer errors (elegedly) got better with the new chemistry. Is there a way to potentially adjust the error frequencies in the program for this?
        Seb
        Yes, see the "-e/-E" options. I like the "elegedly".



        • #34
          Great, thanks again for your help. I'll give it a go and let you know the outcome. Once I have results from the new chemistry I'll try to remember to put those up as well.

          Cheers
          Seb



          • #35
            Hi Nils,

            What criteria are used for placing a read in the 'mi' category?

            Looking at the results, not all reads that map at a different coordinate than the simulated position are categorized as 'mi'.
            Regards,
            Chintan Vora



            • #36
              Originally posted by chintanspy
              Hi Nils,

              What criteria are used for placing a read in the 'mi' category?

              Looking at the results, not all reads that map at a different coordinate than the simulated position are categorized as 'mi'.
              From the header: "# mi' | the number of reads mapped incorrectly that should be mapped be mapped at or greater than that threshold".

              It can also be affected by the "-g" parameter.
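To make the "at or greater than that threshold" wording concrete, here is a small Python sketch of cumulative counting. It is not taken from dwgsim_eval: it assumes, without confirmation from this thread, that "-g" acts as a positional tolerance around the simulated start, and the names `wiggle`, `is_correct` and `cumulative_counts` are hypothetical. Chromosome and strand checks are left out for brevity.

```python
def is_correct(true_pos, mapped_pos, wiggle=5):
    """True if the mapped start lies within `wiggle` bases of the simulated start.

    `wiggle` stands in for a tolerance such as "-g"; the default of 5 is
    an assumption made for this example.
    """
    return abs(mapped_pos - true_pos) <= wiggle


def cumulative_counts(reads, threshold, wiggle=5):
    """Return (mc', mi') over reads with mapping quality >= threshold.

    `reads` is an iterable of (true_pos, mapped_pos, mapq) tuples.
    Reads below the threshold are not counted in either column.
    """
    mc = mi = 0
    for true_pos, mapped_pos, mapq in reads:
        if mapq < threshold:
            continue
        if is_correct(true_pos, mapped_pos, wiggle):
            mc += 1
        else:
            mi += 1
    return mc, mi


if __name__ == "__main__":
    reads = [(100, 100, 60), (100, 103, 30), (100, 250, 20), (100, 400, 0)]
    print(cumulative_counts(reads, threshold=10))            # (2, 1)
    print(cumulative_counts(reads, threshold=10, wiggle=0))  # (1, 2)
```

Under these assumptions, a read mapped 3 bases away from its simulated start is still counted as correct, and a badly placed read with mapping quality below the threshold is not counted as 'mi' at all. Either effect would explain why not every displaced read shows up in that column.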



              • #37
                Originally posted by nilshomer
                It can also be affected by the "-g" parameter.
                I could not find the "-g" parameter. Am I looking at the wrong patch (nh13-DWGSIM-2fc8222)?
                Regards,
                Chintan Vora



                • #38
                  Originally posted by chintanspy
                  I could not find the "-g" parameter. Am I looking at the wrong patch (nh13-DWGSIM-2fc8222)?
                  See "dwgsim_eval".



                  • #39
                    I am confused by the "-d" option:
                    NB: the -d option was previously incorrectly stated as being the outer distance, but is in fact the inner distance.

                    For a PE read:

                    5--->3______3<---5

                    Does "-d" mean the distance between the 3' ends of the two PE reads?



                    • #40
                      Originally posted by plantae
                      I am confused by the "-d" option:
                      NB: the -d option was previously incorrectly stated as being the outer distance, but is in fact the inner distance.

                      For a PE read:

                      5--->3______3<---5

                      Does "-d" mean the distance between the 3' ends of the two PE reads?
                      See the help message for the latest version.



                      • #41
                        Are you planning to publish this? We just cited it in a paper we submitted, but since there was no paper, we cited the link to the software.



                        • #42
                          Originally posted by nilshomer
                          See the help message for the latest version.
                          I have installed version 0.1.10, but the help message does not clearly specify the "-d" option:

                          -d INT inner distance between the two ends [500]

                          Does "the two ends" mean:
                          the two paired-end reads as a whole?
                          or the 3' ends of the two paired-end reads?
                          or the 5' ends of the two paired-end reads?
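For orientation, one widely used convention (not confirmed for dwgsim anywhere in this thread) takes the inner distance to be the gap between the 3' ends of the two reads, and the outer distance to be the full fragment from 5' end to 5' end. The Python sketch below only encodes that arithmetic; the function names are hypothetical and nothing here is taken from the dwgsim source.

```python
def outer_distance(inner, read1_len, read2_len):
    """Fragment length from the 5' end of read 1 to the 5' end of read 2.

    5'--->3'______3'<---5'
    |read1| inner |read2|
    """
    return read1_len + inner + read2_len


def inner_distance(outer, read1_len, read2_len):
    """Gap between the 3' ends; a negative value means the reads overlap."""
    return outer - read1_len - read2_len


if __name__ == "__main__":
    print(outer_distance(500, 100, 100))  # 700
    print(inner_distance(500, 100, 100))  # 300
    print(inner_distance(150, 100, 100))  # -50
```

Under this convention, two 100 bp reads simulated with a 500 bp inner distance would span 700 bp of the reference, whereas the same number read as an outer distance would leave only a 300 bp gap. Checking the span of a few simulated pairs against the reference is a quick way to see which interpretation a given version applies.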



                          • #43
                            I have used dwgsim 0.1.10 to generate reads from hg19 chromosome 22, and tested the sensitivity of the variant callers SAMtools, GATK, and glfSingle.
                            The default per-base error rate of 0.02 was used.
                            Even at 100X coverage ('-C 100' in the simulation), GATK and glfSingle identify only 86% of the variants, whereas SAMtools calls 95%.
                            Should I reduce the error rate?



                            • #44
                              Please post your questions in a new thread, thanks!
