
**maubp** · 07-28-2011, 12:17 PM

Have you looked at the input FASTQ file, especially at the area around line 34?

Data can get corrupted in transfer...

**Richard Finney** · 07-28-2011, 04:28 PM

What commands did you use?
Does the error occur in the "aln" program or in the subsequent "sampe" program?
Did aln finish?
Is there junk in the intermediate "sai" file? (Run "strings filename.sai | less" and look for readable error messages.) It's easy to accidentally send stderr to stdout in the aln phase, so that error text ends up in the sai file that is then fed to "sampe". bwa is merciless in expecting correct input.
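
For anyone without the "strings" utility at hand, the same check can be roughly sketched in Python. This is only an illustration: `printable_runs` is a made-up helper name, and the minimum run length of 4 is an assumption chosen to mimic the usual "strings" default.

```python
import re


def printable_runs(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII of at least min_len characters.

    Roughly what `strings` does: a healthy binary .sai file should show
    little readable text, while stray stderr output shows up as sentences.
    """
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]
```

To use it, read the file in binary mode, for example `printable_runs(open("align1.sai", "rb").read())`, and skim the result for anything that looks like an error message.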

**Robby** · 07-28-2011, 11:17 PM

Thanks a lot for your answers.

I checked the FASTQ file but couldn't see any strange data. However, I am not really sure what exactly I have to check. I just looked at whether the sequence length matches the number of quality scores and whether the line breaks are correct, and everything seems to be OK. The only difference from other FASTQ files is that the third line of each record is just a "+" instead of "+" followed by the sequence name. I thought this might be due to the new Illumina format. Does anyone have more information about that line?

The error message occurs in the "sampe" program. The "aln" program finished without any error message. I had a look at the sai files, but I don't understand the output. At least I couldn't find an error message in these files, and they have the expected size.

This was the first run with the new Illumina v3 chemistry and the new FASTQ format (i.e. Sanger quality scores). Do I have to change anything in that case?

I used the following commands:
bwa aln -I -l 35 ref.fasta reads1.fastq.gz > align1.sai
bwa aln -I -l 35 ref.fasta reads2.fastq.gz > align2.sai
bwa sampe ref.fasta align1.sai align2.sai reads1.fastq.gz reads2.fastq.gz > mapping.sam

I also tried the aln commands without the -I option (for Illumina 1.3+), but received the same error message.
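
Whether -I applies depends on how the quality line is encoded: Sanger-style scores use an ASCII offset of 33, while the Illumina 1.3+ format uses an offset of 64. A minimal sketch of a common heuristic for telling them apart follows. The function name and return labels are hypothetical, and the check can only prove offset 33; data with uniformly high qualities remains ambiguous.

```python
from typing import Iterable


def guess_quality_offset(quality_lines: Iterable[str]) -> str:
    """Guess the quality encoding from the lowest character seen.

    Characters below ';' (ASCII 59) cannot occur in the +64 encodings,
    so seeing one of them implies a Sanger-style offset of 33.
    """
    lowest = min(
        (ord(ch) for line in quality_lines for ch in line.strip()),
        default=None,
    )
    if lowest is None:
        return "unknown"          # no quality characters were supplied
    if lowest < 59:
        return "phred33"          # Sanger-style scores
    return "possibly-phred64"     # cannot be decided from the minimum alone
```

Feed it the fourth line of each record from a few thousand reads. If it reports "phred33", the -I option would most likely be the wrong choice for that file.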

**gaffa** · 07-29-2011, 12:30 AM

Originally posted by Robby:

I checked the FASTQ file but couldn't see any strange data. However, I am not really sure what exactly I have to check. I just looked at whether the sequence length matches the number of quality scores and whether the line breaks are correct, and everything seems to be OK. The only difference from other FASTQ files is that the third line of each record is just a "+" instead of "+" followed by the sequence name. I thought this might be due to the new Illumina format. Does anyone have more information about that line?

That's fine. The read name is not needed on the third line; a bare "+" is valid FASTQ and is recognized by BWA.
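
The manual checks described above (record layout, "+" line, matching sequence and quality lengths) can be automated. Here is a minimal sketch, assuming unwrapped four-line records; `first_fastq_problem` is a hypothetical helper, not part of BWA, and a trailing blank line would be reported as a truncated record.

```python
from typing import Iterable, Optional


def first_fastq_problem(lines: Iterable[str]) -> Optional[str]:
    """Return a description of the first malformed record, or None."""
    record: list[str] = []
    line_no = 0
    for line_no, line in enumerate(lines, start=1):
        record.append(line.rstrip("\r\n"))
        if len(record) < 4:
            continue
        header, seq, plus, qual = record
        start = line_no - 3  # line number of this record's header
        if not header.startswith("@"):
            return f"line {start}: header does not start with '@'"
        if not plus.startswith("+"):
            return f"line {start + 2}: separator does not start with '+'"
        # A bare "+" is accepted; a repeated name must match the header.
        if len(plus) > 1 and plus[1:] != header[1:]:
            return f"line {start + 2}: name after '+' differs from header"
        if len(seq) != len(qual):
            return (f"line {start + 3}: {len(qual)} quality characters "
                    f"for {len(seq)} bases")
        record = []
    if record:
        return f"line {line_no}: file ends inside a record"
    return None
```

For a gzipped file, the lines can come from `gzip.open("reads1.fastq.gz", "rt")`. Because the function stops at the first problem and reports its line number, it points directly at the region worth inspecting by eye.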

BWA error

Comment

Comment

Comment

Comment

Latest Articles

ad_right_rmr

News