  • dwgsim -> readnames and quality scores

    Hi,

    I am using dwgsim from the dnaa package to simulate short read pairs from a reference genome.

    What I eventually want to do is assess different aligners on the basis of the reads that align to the right position.

    1.

    I was looking at the documentation on http://sourceforge.net/apps/mediawik...ome_Simulation

    and I'm trying to understand what my readname means, it is:

    Chr1_1514706_1515213_1:1:0_1:0:0_0/1

    So,

    Chr1 = contig name
    1514706 = start position 1 (from the mutated reference?)
    1515213 = start position 2

    and then I don't get the rest..?

    2.

    Are reads in _1.fq 5' reads and those in _2.fq 3' reads?

    3.

    What does this line in the output mean?

    chr1 745 A R +

    OR

    chr1 15454 - A +

    OR

    chr1 87846 T C -

    4.

    My quality scores are all '1'?

    5.

    I would like to remove all errors caused by the actual sequencer/technology completely. Would that mean I need to make -e and -E = 0?

    Cheers!~

  • #2
    Originally posted by genome
    Hi,
    1.

    I was looking at the documentation on http://sourceforge.net/apps/mediawik...ome_Simulation

    and I'm trying to understand what my readname means, it is:

    Chr1_1514706_1515213_1:1:0_1:0:0_0/1

    So,

    Chr1 = contig name
    1514706 = start position 1 (from the mutated reference?)
    1515213 = start position 2

    and then I don't get the rest..?
    Are you using the latest commit from the git repository? After you update, look at the "read names explained" section of the documentation.
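
    As a rough illustration, the read name quoted above can be split into its fields like this. The field meanings are an assumption borrowed from wgsim (on which dwgsim is based): contig, the two start positions, one colon-separated triple of counts per read, a pair identifier, and the read number after the slash. Newer dwgsim versions use a different layout, so treat the "read names explained" section for your version as authoritative. `ReadName` and `parse_read_name` are hypothetical helper names.

```python
from typing import NamedTuple, Tuple


class ReadName(NamedTuple):
    contig: str
    start1: int
    start2: int
    counts1: Tuple[int, ...]  # assumed per-read counts for read 1
    counts2: Tuple[int, ...]  # assumed per-read counts for read 2
    pair_id: str
    read_number: int


def parse_read_name(name: str) -> ReadName:
    """Split a wgsim-style read name (assumed layout) into its fields."""
    name = name.lstrip("@")
    body, sep, read_number = name.rpartition("/")
    if not sep:
        raise ValueError(f"no '/<read number>' suffix in {name!r}")
    # Split from the right so contig names containing '_' stay intact.
    contig, start1, start2, counts1, counts2, pair_id = body.rsplit("_", 5)
    return ReadName(
        contig=contig,
        start1=int(start1),
        start2=int(start2),
        counts1=tuple(int(x) for x in counts1.split(":")),
        counts2=tuple(int(x) for x in counts2.split(":")),
        pair_id=pair_id,
        read_number=int(read_number),
    )


parsed = parse_read_name("Chr1_1514706_1515213_1:1:0_1:0:0_0/1")
print(parsed)
```

    With the positions pulled out this way, an aligner's reported position can be compared against start1 or start2 to decide whether a read landed in the right place.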

    Originally posted by genome
    2.

    Are reads in _1.fq 5' reads and those in _2.fq 3' reads?
    You will be able to tell by the strand in the read name.

    Originally posted by genome
    3.

    What does this line in the output mean?

    chr1 745 A R +

    OR

    chr1 15454 - A +

    OR

    chr1 87846 T C -
    Those tell you where the mutations were placed in the reference (SNP/indel). The IUPAC codes give heterozygous positions.
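
    To make the IUPAC point concrete, here is a minimal sketch that splits one of those lines into fields and expands an ambiguity code into the bases it stands for (for example, R means A or G). The column names are an assumption for illustration; in particular the last column is kept as a raw flag because its meaning is not stated here. `parse_mutation_line` is a hypothetical helper name.

```python
# Standard IUPAC nucleotide codes and the bases each one represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def parse_mutation_line(line: str) -> dict:
    """Parse one whitespace-separated mutation line (assumed 5 columns)."""
    contig, pos, ref, alt, flag = line.split()
    return {
        "contig": contig,
        "pos": int(pos),
        "ref": ref,  # '-' here would indicate an insertion
        "alt": alt,
        # Expand a single IUPAC code; leave '-' or multi-base strings as-is.
        "alt_bases": IUPAC.get(alt.upper(), alt),
        "flag": flag,  # last column, meaning not interpreted here
    }


for line in ("chr1 745 A R +", "chr1 15454 - A +", "chr1 87846 T C -"):
    print(parse_mutation_line(line))
```

    For the first line, the reference base A together with the code R (A or G) describes a heterozygous A/G site at position 745.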

    Originally posted by genome
    4.

    My quality scores are all '1'?
    Use the latest version from the git repository.

    Originally posted by genome
    5.

    I would like to remove all errors caused by the actual sequencer/technology completely. Would that mean I need to make -e and -E = 0?

    Cheers!~
    Yes.
