**kmcarr** · 04-02-2009, 04:47 AM

The *_seq.txt files in the Bustard directory contain the sequence of every cluster (read). The *_sequence.txt files in the GERALD directory contain only the reads that passed the quality filter.
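
To put a number on that difference, here is a rough sketch that compares the two sets of files. It assumes each *_seq.txt line describes exactly one cluster and that the GERALD file is plain 4-line FASTQ; the function name `pf_fraction` is made up for illustration and is not part of the pipeline.

```shell
# pf_fraction FASTQ SEQ_FILE...
# Hypothetical helper: reports how many clusters survived filtering.
# Assumptions: one line per cluster in each *_seq.txt file,
# four lines per read in the FASTQ file.
pf_fraction() {
    fastq=$1
    shift
    # $(( ... )) also strips any padding that wc puts around the number.
    total=$(( $(cat "$@" | wc -l) ))
    passed=$(( $(wc -l < "$fastq") / 4 ))
    awk -v p="$passed" -v t="$total" 'BEGIN {
        if (t == 0) { print "no clusters found"; exit 1 }
        printf "%d of %d reads passed filter (%.1f%%)\n", p, t, 100 * p / t
    }'
}
```

It would be called with the FASTQ file first and the matching Bustard files after it, with the paths adjusted to your own run folder.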

**lajoieb** · 04-02-2009, 05:00 AM

Ah ok!
That makes perfect sense then.

Appreciate the explanation.

bryan

**Torst** · 04-08-2009, 05:52 PM

Originally posted by lajoieb:

wc -l s_*_sequence.txt (fastq)

To count the number of reads in a FASTQ file, you can use grep in -c (counting) mode:

Code:

grep -c '^\+' *_sequence.txt

s_6_sequence.txt:4525658
s_7_sequence.txt:4485601
s_8_sequence.txt:4099309

Note that you can't match on '@', because it is a valid quality character (Q=0) and can therefore start a quality line, whereas '+' is not used in Illumina FASTQ quality strings. This is easier than dividing by four in your head, although the shell can help us there too:

Code:

echo $(( $(wc -l < s_6_sequence.txt) / 4 ))

4525658
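
If you want the divide-by-four count for several lanes at once, something along these lines should work. It is only a sketch: it assumes every record is exactly four lines (no wrapped sequence or quality lines), and the function name `count_fastq_reads` is invented for this example.

```shell
# count_fastq_reads FILE...
# Hypothetical helper: prints "FILE<TAB>READS" for each file,
# assuming strict 4-line FASTQ records.
count_fastq_reads() {
    for f in "$@"; do
        lines=$(wc -l < "$f") || return 1
        # Warn if the file cannot be whole 4-line records.
        if [ $((lines % 4)) -ne 0 ]; then
            echo "warning: $f has $lines lines, not a multiple of 4" >&2
        fi
        printf '%s\t%d\n' "$f" $((lines / 4))
    done
}
```

Run it as `count_fastq_reads s_*_sequence.txt`. Unlike the grep approach it does not care which characters appear in the quality line, but it will miscount a truncated file, hence the warning.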

Enjoy

solexa output files | s_*_seq.txt vs. s_*_sequence.txt