Thread | Thread Starter | Forum | Replies | Last Post |
Splitting fastq file by barcodes without producing unmatched.fq file? | a.cardilini | Bioinformatics | 5 | 06-26-2014 03:03 PM |
splitting big paired fastq files | JahnDavik | Bioinformatics | 2 | 04-15-2013 12:16 AM |
splitting fastq? | yaximik | Bioinformatics | 5 | 02-05-2013 08:12 PM |
Splitting concatenated PE fastq to two files for respect reads | JayM | Illumina/Solexa | 5 | 11-05-2010 03:58 AM |
Splitting 454 paired reads in a FASTQ file | sjackman | Bioinformatics | 5 | 09-10-2010 12:09 PM |
#1 |
Member
Location: Münster, Germany Join Date: Mar 2013
Posts: 44
Hello folks,
I have a small problem that seemed trivial at first but has kept me busy for a while now. Maybe you can help...

Problem: I need to split 615M paired reads, currently in two FastQ files, into two file pairs with 308M reads each.

Solution attempt A: I tried line-count-based tools like split or awk without success. Since newline characters occur in the quality scores, these tools (or rather I) screwed up badly.

Solution attempt B:
Code:
    bbmap/reformat.sh in=... in2=... out=... out2=... reads=308000000
    bbmap/reformat.sh in=... in2=... out=... out2=... skipreads=308000000

The first command reported:
Code:
    Input is being processed as paired
    Input:     615307122 reads              91326404105 bases
    Output:    615307122 reads (100.00%)    91326404105 bases (100.00%)

and the second command reported:
Code:
    Input is being processed as paired
    Input:     615307122 reads              91326404105 bases
    Output:    0 reads (0.00%)              0 bases (0.00%)

Effectively, the reads= and skipreads= parameters seem to be ignored (BBMap version 38.76).

Solution attempt C:
Code:
    famas --in=... --in2=... --out=.XXXXXX.fq.gz --out2=.XXXXXX.fq.gz -x 308000000
    ERROR(famas.c|open_output_one:1056): Couldn't open =...compressed.065534.fq.gz
    ERROR(famas.c|main:1163): Couldn't open output files. Exiting...

Since it took me a while to clean up that mess on the cluster, I am somewhat reluctant to try out more tools now. Any ideas what I did wrong, or suggestions for tools that work better?

Thanks a lot for reading and for your help!
Thias
#2 |
Senior Member
Location: East Coast USA Join Date: Feb 2008
Posts: 7,082
Cross-posted and answered at: https://www.biostars.org/p/451453/
#3 |
Member
Location: Münster, Germany Join Date: Mar 2013
Posts: 44
Indeed, the issue has since been solved using seqkit split2.
My apologies for not indicating this here. I had posted here first, but the thread lingered in moderation for ~24h, so I decided to ask for help on Biostars. It had not yet shown up here when I got the answer on Biostars, and I subsequently forgot to check back. Thanks a lot to everyone nonetheless!
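
For anyone landing here without seqkit at hand: the line-count approach from attempt A breaks as soon as a record is wrapped over more than four lines, so the split has to be record-aware. Below is a minimal Python sketch of that idea, not a replacement for a dedicated tool. It assumes both input files list mates in the same order, and it decides where a record ends by reading quality lines until their length matches the sequence length. The function names (`open_text`, `read_fastq`, `split_paired`) and the `{}` output naming pattern are made up for illustration and do not come from any of the tools mentioned in this thread.

```python
import gzip
import itertools


def open_text(path, mode="rt"):
    """Open a plain or gzip-compressed text file, chosen by file extension."""
    return gzip.open(path, mode) if str(path).endswith(".gz") else open(path, mode)


def read_fastq(handle):
    """Yield one FASTQ record at a time as a list of raw lines.

    Wrapped (multi-line) records are supported: quality lines are consumed
    until their total length equals the sequence length, so a quality line
    that starts with '@' is not mistaken for the next header.
    """
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        if not header.strip():
            continue  # tolerate blank lines between records
        if not header.startswith("@"):
            raise ValueError(f"expected '@' header, got: {header!r}")
        record, seq_len = [header], 0
        line = handle.readline()
        while line and not line.startswith("+"):
            seq_len += len(line.strip())
            record.append(line)
            line = handle.readline()
        if not line:
            raise ValueError("truncated record: missing '+' separator")
        record.append(line)  # the '+' separator line
        qual_len = 0
        while qual_len < seq_len:
            line = handle.readline()
            if not line:
                raise ValueError("truncated record: quality shorter than sequence")
            qual_len += len(line.strip())
            record.append(line)
        if qual_len != seq_len:
            raise ValueError("quality and sequence lengths differ")
        yield record


def split_paired(in1, in2, out1_pattern, out2_pattern, reads_per_chunk):
    """Write consecutive chunks of read pairs; return the pair count per chunk.

    The output patterns must contain '{}', which is replaced by the chunk
    number (1, 2, ...). This naming scheme is an assumption of this sketch.
    """
    counts = []
    with open_text(in1) as f1, open_text(in2) as f2:
        pairs = itertools.zip_longest(read_fastq(f1), read_fastq(f2))
        while True:
            chunk = itertools.islice(pairs, reads_per_chunk)
            first = next(chunk, None)
            if first is None:
                break  # no pairs left, so do not create an empty chunk
            part = len(counts) + 1
            n = 0
            with open_text(out1_pattern.format(part), "wt") as o1, \
                 open_text(out2_pattern.format(part), "wt") as o2:
                for r1, r2 in itertools.chain([first], chunk):
                    if r1 is None or r2 is None:
                        raise ValueError("input files have different read counts")
                    o1.writelines(r1)
                    o2.writelines(r2)
                    n += 1
            counts.append(n)
    return counts
```

With hypothetical file names, a call would look like `split_paired("in_1.fq.gz", "in_2.fq.gz", "part{}_1.fq.gz", "part{}_2.fq.gz", 308000000)`. Only one record pair is held in memory at a time, but pure Python with gzip will likely be much slower than a dedicated tool on files of this size, so treat it as a fallback or as a way to sanity-check a small subset.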
Tags |
awk, bbmap, famas, fastq, split |