SEQanswers

02-16-2011, 01:30 PM   #1
genome
Junior Member
Location: New Zealand
Join Date: Nov 2010
Posts: 3

dwgsim -> readnames and quality scores

Hi,

I am using dwgsim from the dnaa package to simulate short read pairs from a reference genome.

What I eventually want to do is assess different aligners by how many reads they align to the correct position.

1.

I was looking at the documentation on http://sourceforge.net/apps/mediawik...ome_Simulation

and I'm trying to understand what my readname means, it is:

Chr1_1514706_1515213_1:1:0_1:0:0_0/1

So,

Chr1 = contig name
1514706 = start position 1 (from the mutated reference?)
1515213 = start position 2

and then I don't get the rest..?

2.

Are reads in _1.fq 5' reads and those in _2.fq 3' reads?

3.

What does this line in the output mean?

chr1 745 A R +

OR

chr1 15454 - A +

OR

chr1 87846 T C -

4.

My quality scores are all '1'?

5.

I would like to remove all errors caused by the sequencer/technology itself. Does that mean I need to set both -e and -E to 0?

Cheers!~
02-16-2011, 06:17 PM   #2
nilshomer
Nils Homer
Location: Boston, MA, USA
Join Date: Nov 2008
Posts: 1,285

Quote:
Originally Posted by genome
1. I was looking at the documentation on http://sourceforge.net/apps/mediawik...ome_Simulation
and I'm trying to understand what my readname means, it is:
Chr1_1514706_1515213_1:1:0_1:0:0_0/1
So,
Chr1 = contig name
1514706 = start position 1 (from the mutated reference?)
1515213 = start position 2
and then I don't get the rest..?

Are you using the version from the git repository at the latest commit? After you update, look at the section "read names explained".
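
Until the "read names explained" section is at hand, the read name quoted above can at least be split into its parts mechanically. The following is a minimal Python sketch, not dwgsim's own parser. The field names (`pos1`, `counts1`, `read_id`, `mate`) are hypothetical labels chosen for this example, and the meaning of each colon-separated counter should be taken from the documentation of the dwgsim version in use, since the layout may differ between versions.

```python
from typing import NamedTuple, Tuple


class SimReadName(NamedTuple):
    contig: str
    pos1: int                      # first position embedded in the name
    pos2: int                      # second position embedded in the name
    counts1: Tuple[int, ...]       # first colon-separated triple
    counts2: Tuple[int, ...]       # second colon-separated triple
    read_id: str                   # trailing identifier before the mate suffix
    mate: str                      # "1" or "2", or "" if there is no /N suffix


def parse_read_name(name: str) -> SimReadName:
    """Split a read name shaped like the one quoted in this thread."""
    name = name.strip().lstrip("@")
    # The mate suffix ("/1" or "/2") is optional.
    core, sep, mate = name.rpartition("/")
    if not sep:
        core, mate = name, ""
    # Split from the right so contig names that contain "_" stay intact.
    contig, pos1, pos2, c1, c2, read_id = core.rsplit("_", 5)
    counts1 = tuple(int(x) for x in c1.split(":"))
    counts2 = tuple(int(x) for x in c2.split(":"))
    if len(counts1) != 3 or len(counts2) != 3:
        raise ValueError(f"unexpected counter layout in {name!r}")
    return SimReadName(contig, int(pos1), int(pos2), counts1, counts2, read_id, mate)


if __name__ == "__main__":
    print(parse_read_name("Chr1_1514706_1515213_1:1:0_1:0:0_0/1"))
```

Splitting from the right is a deliberate choice: contig names such as `scaffold_12` would otherwise be cut at their own underscore.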

Quote:
Originally Posted by genome
2. Are reads in _1.fq 5' reads and those in _2.fq 3' reads?

You will be able to tell by the strand in the read name.

Quote:
Originally Posted by genome
3. What does this line in the output mean?
chr1 745 A R +
OR
chr1 15454 - A +
OR
chr1 87846 T C -

Those lines tell you where mutations (SNPs and indels) were placed in the reference. An IUPAC ambiguity code marks a heterozygous position.
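
As an illustration of reading those lines, here is a small Python sketch that classifies each of the three examples from the question. It rests on assumptions that are not confirmed in this thread: that the columns are contig, position, reference base, mutated base and a fifth flag, and that a `-` in the reference column means an insertion while a `-` in the mutated column means a deletion. The fifth column is kept verbatim without interpretation, and the names `Mutation` and `parse_mutation_line` are made up for this example.

```python
from typing import NamedTuple

# Standard IUPAC codes that stand for exactly two bases.
IUPAC_HET = {
    "R": "AG", "Y": "CT", "S": "CG",
    "W": "AT", "K": "GT", "M": "AC",
}


class Mutation(NamedTuple):
    contig: str
    pos: int
    ref: str
    alt: str
    flag: str           # fifth column, stored as-is (not interpreted here)
    kind: str           # "snp", "insertion" or "deletion"
    heterozygous: bool  # only derived for SNP lines, from the IUPAC code


def parse_mutation_line(line: str) -> Mutation:
    """Parse one whitespace-separated mutation line."""
    contig, pos, ref, alt, flag = line.split()
    if ref == "-":
        kind = "insertion"   # assumption: nothing in the reference, base(s) added
    elif alt == "-":
        kind = "deletion"    # assumption: reference base(s) removed
    else:
        kind = "snp"
    het = kind == "snp" and alt.upper() in IUPAC_HET
    return Mutation(contig, int(pos), ref, alt, flag, kind, het)


if __name__ == "__main__":
    for text in ("chr1 745 A R +", "chr1 15454 - A +", "chr1 87846 T C -"):
        print(parse_mutation_line(text))
```

Here `R` stands for A or G, so the first line would be a heterozygous A/G SNP, while the plain `C` in the third line is an ordinary substitution.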

Quote:
Originally Posted by genome
4. My quality scores are all '1'?

Use the latest version from git.
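
To check whether an update actually changed the quality strings, the distinct quality characters in a FASTQ file can be tallied. This is a generic Python sketch and not part of dwgsim. It assumes plain four-line FASTQ records (no wrapped sequence lines), and the function name `quality_char_counts` is invented for this example.

```python
from collections import Counter
from typing import Iterable


def quality_char_counts(fastq_lines: Iterable[str]) -> Counter:
    """Count each quality character, assuming 4 lines per FASTQ record."""
    counts: Counter = Counter()
    for i, line in enumerate(fastq_lines):
        if i % 4 == 3:  # the fourth line of every record holds the qualities
            counts.update(line.rstrip("\r\n"))
    return counts


if __name__ == "__main__":
    # Two tiny in-memory records stand in for a real file here.
    example = [
        "@read1\n", "ACGT\n", "+\n", "1111\n",
        "@read2\n", "ACGT\n", "+\n", "1111\n",
    ]
    counts = quality_char_counts(example)
    print(sorted(counts.items()))
    if len(counts) == 1:
        print("all quality characters are identical")
```

For a real file, passing an open file handle (for example a hypothetical `reads_1.fq`) works the same way, since the function only iterates over lines. Under the Phred+33 encoding the character `1` corresponds to a quality of 16.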

Quote:
Originally Posted by genome
5. I would like to remove all errors caused by the actual sequencer/technology completely. Would that mean I need to make -e and -E = 0?

Yes.
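
Coming back to the goal stated in the first post (judging aligners by whether reads land at the right position), a comparison against the position embedded in the read name could be sketched in Python as below. This is only a sketch under loud assumptions: that the two numbers in the read name are directly comparable to the position the aligner reports for mate 1 and mate 2 respectively, and that both use the same coordinate convention. Whether that holds, especially for the second mate and for reverse-strand reads, should be checked against the "read names explained" section. The function names and the `tolerance` parameter are hypothetical.

```python
from typing import Tuple


def simulated_position(read_name: str, mate: int) -> Tuple[str, int]:
    """Return (contig, position) embedded in the read name for mate 1 or 2."""
    if mate not in (1, 2):
        raise ValueError("mate must be 1 or 2")
    core = read_name.strip().lstrip("@")
    # Drop an optional "/1" or "/2" suffix.
    if core.endswith(("/1", "/2")):
        core = core[:-2]
    # Split from the right so contig names containing "_" stay intact.
    contig, pos1, pos2, _counts1, _counts2, _read_id = core.rsplit("_", 5)
    return contig, int(pos1 if mate == 1 else pos2)


def placed_correctly(read_name: str, mate: int, aligned_contig: str,
                     aligned_pos: int, tolerance: int = 0) -> bool:
    """True if the alignment is on the simulated contig within `tolerance` bases."""
    contig, pos = simulated_position(read_name, mate)
    return aligned_contig == contig and abs(aligned_pos - pos) <= tolerance


if __name__ == "__main__":
    name = "Chr1_1514706_1515213_1:1:0_1:0:0_0/1"
    print(placed_correctly(name, 1, "Chr1", 1514706))
    print(placed_correctly(name, 1, "Chr1", 1514710, tolerance=5))
```

A small non-zero tolerance can be useful when indels or soft clipping near the read start shift the reported position by a few bases.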
Tags
dnaa, dwgsim, output, quality, readname

All times are GMT -8.