SEQanswers


cliff 11-07-2009 09:11 AM

Maq reads split problem
 
Hi, All

I ran into a problem when splitting the 4M reads of read 1 from one lane (Illumina GA) into 1-million-read chunks. I tried:

maq fastq2bfq -n 1000000 s_1_1_sequence.fastq s_1_1_sequence

and I got 5 output files:

s_1_1_sequence@1.bfq
s_1_1_sequence@1000001.bfq
s_1_1_sequence@2000001.bfq
s_1_1_sequence@3000001.bfq
s_1_1_sequence@4000001.bfq

I understand these four files:

s_1_1_sequence@1000001.bfq
s_1_1_sequence@2000001.bfq
s_1_1_sequence@3000001.bfq
s_1_1_sequence@4000001.bfq

are the output split files. But I don't know what "s_1_1_sequence@1.bfq" is.
I tried this command on read 2 from the same lane and got the same problem: it output "s_1_2_sequence@1.bfq". It also happens with all the other flow cells.

Also, when I changed it to 2-million-read chunks, I still got this @1.bfq file.

I checked the size of this @1.bfq file and it is similar to that of @1000001.bfq and the other .bfq output files.

Can anybody tell me what this @1.bfq file is? Should I remove it for further mapping?

Thanks in advance

-Cliff

ECO 11-07-2009 03:29 PM

@1... is just the first batch of split reads, which is why it's the same size as all the rest. The only exception is the last file, whose size depends on how many reads are left over after dividing the total number of reads by the split size.

So, you want all 5 files. :)
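
To make the naming pattern concrete, here is a minimal Python sketch that reproduces the file names listed above. It assumes, based only on the output shown in this thread and not on the maq source, that each file is labelled with the 1-based index of its first read. The function name `chunk_names` and the read count of 4,000,001 are hypothetical, chosen so that five files come out as in the original post.

```python
def chunk_names(prefix, total_reads, chunk_size):
    """Return the expected split-file names, one per chunk.

    Assumption (inferred from the thread, not from maq itself):
    the number after '@' is the 1-based index of the chunk's first read.
    """
    return [
        f"{prefix}@{start}.bfq"
        for start in range(1, total_reads + 1, chunk_size)
    ]


# Hypothetical read count slightly above 4 million, so a fifth file appears.
names = chunk_names("s_1_1_sequence", 4_000_001, 1_000_000)
for name in names:
    print(name)
```

Under this reading, @1.bfq holds reads 1 to 1,000,000, @1000001.bfq holds the next million, and so on, so every file is needed for mapping.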

