#1
Member
Location: US Join Date: Apr 2011
Posts: 20
Hi,
I am new to bioinformatics and would appreciate your help with this question. I have a reads file (a sample is attached) and want to count the number of bases and the number of reads in it. How do I do this? Is there a relation between #bases and #reads in a reads file? That is, if the #bases is known for a particular reads file, can I immediately calculate the number of reads? Thanks.
#2
Senior Member
Location: US Join Date: Jan 2009
Posts: 392
Assuming that all the reads are of a uniform length and have not been modified in some way (trimmed), then each read will be the same number of bases.
In your sample fastq file:
@SRR019388.1.1 SLXA-EAS1_126_FC20H6L_0_7_1_490_23.1 length=35
GTCAAATATAGTGAGTACAGGAAAATAGGTGGAGA
+SRR019388.1.1 SLXA-EAS1_126_FC20H6L_0_7_1_490_23.1 length=35
<<<<<<<<<<<<;<<<<<<<<<<<<<<9<<;;7<;
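For anyone who wants to check this on a file, here is a minimal Python sketch. It assumes strict four-line FASTQ records (header, sequence, `+` line, quality) with no line wrapping; the function name `count_reads_and_bases` is made up for illustration and is not from any library.

```python
import io

def count_reads_and_bases(handle):
    """Count reads and bases in a FASTQ stream.

    Assumes every record is exactly four lines, which holds for the
    sample above but not for wrapped (multi-line) FASTQ files.
    """
    reads = 0
    bases = 0
    for i, line in enumerate(handle):
        # The sequence is the second line of each four-line record.
        if i % 4 == 1:
            reads += 1
            bases += len(line.strip())
    return reads, bases

# The single record from the sample file.
sample = io.StringIO(
    "@SRR019388.1.1 SLXA-EAS1_126_FC20H6L_0_7_1_490_23.1 length=35\n"
    "GTCAAATATAGTGAGTACAGGAAAATAGGTGGAGA\n"
    "+SRR019388.1.1 SLXA-EAS1_126_FC20H6L_0_7_1_490_23.1 length=35\n"
    "<<<<<<<<<<<<;<<<<<<<<<<<<<<9<<;;7<;\n"
)
reads, bases = count_reads_and_bases(sample)
print(reads, bases)  # prints: 1 35
```

If every read really is 35 bases long, the total number of bases divided by 35 gives the number of reads. For a real file, pass `open("reads.fastq")` instead of the `StringIO` object (the file name is a placeholder).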
#3
Rick Westerman
Location: Purdue University, Indiana, USA Join Date: Jun 2008
Posts: 1,104
There is no relation between #bases and #reads.
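A small Python sketch makes this concrete. The two read sets below are made up for illustration: they contain the same total number of bases but different numbers of reads, so the base count alone cannot tell you the read count unless you also know the read lengths.

```python
# Two hypothetical read sets, invented for this example.
set_a = ["ACGTA", "CGTAC", "GTACG", "TACGT"]  # four reads of 5 bases each
set_b = ["ACGTACGTAC", "GTACGTACGT"]          # two reads of 10 bases each

def total_bases(reads):
    """Sum the lengths of all reads in the list."""
    return sum(len(read) for read in reads)

print(len(set_a), total_bases(set_a))  # prints: 4 20
print(len(set_b), total_bases(set_b))  # prints: 2 20
```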
#4
Rick Westerman
Location: Purdue University, Indiana, USA Join Date: Jun 2008
Posts: 1,104
Actually, from the sample file, 'CS student' should be able to figure out the number of reads and the length of each read (since this is given in the header) via simple Unix tools, which any CS student should know. I'd use 'grep', 'cut', 'sort', and 'uniq'.
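That kind of pipeline amounts to pulling the `length=` field out of each header line and tallying the values. Here is a rough Python analogue of the same idea, not the exact commands. It assumes four-line records and that every header carries a `length=N` field as in the sample; the names `read_length_histogram` and `summarize` are hypothetical.

```python
import re
from collections import Counter

# Matches the 'length=35' style field seen in the sample headers.
LENGTH_FIELD = re.compile(r"length=(\d+)")

def read_length_histogram(lines):
    """Tally read lengths from the header line of each four-line record."""
    lengths = Counter()
    for i, line in enumerate(lines):
        # Headers are lines 0, 4, 8, ...; checking the position avoids
        # mistaking a quality line that starts with '@' for a header.
        if i % 4 != 0:
            continue
        match = LENGTH_FIELD.search(line)
        if match:
            lengths[int(match.group(1))] += 1
    return lengths

def summarize(lengths):
    """Return (number of reads, number of bases) from a length histogram."""
    reads = sum(lengths.values())
    bases = sum(length * count for length, count in lengths.items())
    return reads, bases

record = [
    "@SRR019388.1.1 SLXA-EAS1_126_FC20H6L_0_7_1_490_23.1 length=35",
    "GTCAAATATAGTGAGTACAGGAAAATAGGTGGAGA",
    "+SRR019388.1.1 SLXA-EAS1_126_FC20H6L_0_7_1_490_23.1 length=35",
    "<<<<<<<<<<<<;<<<<<<<<<<<<<<9<<;;7<;",
]
histogram = read_length_histogram(record)
print(dict(histogram), summarize(histogram))  # prints: {35: 1} (1, 35)
```

The histogram is the useful part: if it has a single key, all reads have the same length and #bases is simply #reads times that length.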
#5
Senior Member
Location: East Coast USA Join Date: Feb 2008
Posts: 7,082
http://en.wikipedia.org/wiki/FASTQ_format
#6
Member
Location: US Join Date: Apr 2011
Posts: 20
Thank you very much for your explanations. This is all I wanted to know about #reads and #bases.