**kopi-o** · 11-11-2011, 11:04 AM

Demultiplexing usually allows for one mismatch and sometimes indels, so it's not surprising that this read would have been considered as matching the barcode. Why don't you go through the whole fastq file and check for the barcodes that appear in the headers with something like

grep '@' file.fastq | cut -d '#' -f2 | sort | uniq -c

Maybe your summary was referring to the cluster density for the whole lane, and they had 12 multiplexed samples in it? That would fit approximately with 12*0.5M. 7 million reads for a lane is pretty low, though ...
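One caveat with the grep approach above: FASTQ quality strings can also contain '@', so quality lines may slip into the count. A minimal sketch that takes only the header lines (every fourth line, starting at line 1) is shown here. It assumes old-style Illumina headers of the form `@...#BARCODE/1`; the function name `count_barcodes` is made up for illustration.

```shell
# Tally index sequences found in FASTQ headers.
# Assumption: 4-line FASTQ records and headers shaped like @...#BARCODE/1.
count_barcodes() {
    awk 'NR % 4 == 1' "$1" |   # keep only header lines
        cut -d '#' -f2 |       # text after the '#', e.g. GNCAAT/1
        cut -d '/' -f1 |       # drop the read-number suffix
        sort | uniq -c |       # count each distinct barcode
        sort -k1,1nr           # most frequent first
}
```

It would be called as `count_barcodes file.fastq`.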

**eilosei** · 11-11-2011, 12:46 PM

Thank you for your quick reply. I tried the command:

grep '@' 003_s_6_sequence.txt | cut -d '#' -f2 | sort |*uniq -c

But I get this error:

-bash: *uniq: command not found

Besides, the entire fastq file only contains reads with the barcode "GNCAAT". As for the summary file, I have three samples in one lane, and each sample has its own summary file after demultiplexing. The PF cluster numbers are similar, so there should be around 10 million PF reads for each, which is still low for HiSeq.

**kopi-o** · 11-11-2011, 01:30 PM

I'm sorry, there was a typo in my reply. The star '*' is not supposed to be there; the command is called uniq. I've removed it.

Anyway, there is no need for you to run the command if you already know that the entire fastq file contains the same barcode. Then it would appear that the facility has failed to extract the exact-match barcode, or that the second cycle in the index read step went so seriously awry that no base calls could be made.
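To illustrate why a one-mismatch rule would still assign such reads to a sample, here is a small sketch of a Hamming-distance check between an observed and an expected barcode. The function name `hamming` is hypothetical, and the expected barcode in the usage note is an assumption for illustration only, since the thread does not state which indices were actually used.

```shell
# Hamming distance between two equal-length barcodes.
# An uncalled base (N) simply counts as a mismatch.
# Prints -1 if the lengths differ.
hamming() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        if (length(a) != length(b)) { print -1; exit }
        d = 0
        for (i = 1; i <= length(a); i++)
            if (substr(a, i, 1) != substr(b, i, 1)) d++
        print d
    }'
}
```

With an assumed expected index of GCCAAT, `hamming GNCAAT GCCAAT` prints 1, which a demultiplexer allowing one mismatch would accept.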

**kopi-o** · 11-11-2011, 02:00 PM

Also, just a crazy suggestion - why not just ask the facility? :-)

**eilosei** · 11-11-2011, 02:03 PM

Now the command works, and all three of my samples from the one lane have the same problem. Every read's barcode carries an "N" at the second position, e.g. CNATGT, ANAGTG, GNCAAT.

I will ask the facility to rerun the demultiplexing step. I hope it's just a computer problem and that I can get reads with perfectly matching barcodes.

Thanks!
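The pattern described above (an N at the same barcode position in every read) can be checked directly. This is a rough sketch, again assuming 4-line FASTQ records and headers of the form `@...#BARCODE/1`; `n_by_position` is an illustrative name, not an existing tool.

```shell
# For each barcode position, print how many reads have an N there.
# Output: one line per position, "position<TAB>count".
n_by_position() {
    awk 'NR % 4 == 1 {
        split($0, parts, "#")        # parts[2] is e.g. GNCAAT/1
        split(parts[2], q, "/")      # q[1] is the bare barcode
        bc = q[1]
        for (i = 1; i <= length(bc); i++)
            if (substr(bc, i, 1) == "N") n[i]++
        if (length(bc) > max) max = length(bc)
    }
    END {
        for (i = 1; i <= max; i++) printf "%d\t%d\n", i, n[i] + 0
    }' "$1"
}
```

If the count at position 2 equals the total number of reads while every other position is at or near zero, that points to a problem with one cycle of the index read rather than with the samples.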

**eilosei** · 11-11-2011, 02:06 PM

Originally posted by kopi-o: "Also, just a crazy suggestion - why not just ask the facility? :-)"

Because the facility believes I gave them bad samples, even though they said there was no problem based on the Bioanalyzer results!

Help! lost data in fastq file

Comment

Comment

Comment

Comment

Comment

Comment

Latest Articles

ad_right_rmr

News