Hello,
I've just started working with Illumina 1.8+ format FASTQ files. The read identifiers no longer carry the /1 or /2 suffix for paired-end (PE) reads; that information is now contained in column 2.
I'm not scripting-savvy, and I'm just using the very clumsy Perl script below to add this information. I'm wondering if anyone can share a good sed or awk command or script to do this more efficiently, as I'm working with files ranging from a few Gb to tens of Gb+ in size. I've been trying but can't get the syntax right.
Code:
perl -e ' while(<>) { @cols=split("\t", $_); $cols[0] = "$cols[0]/1"; print "$cols[0]\t$cols[1]\t$cols[2]\t$cols[3]\t$cols[4]\t$cols[5]\t$cols[6]\t$cols[7]\t$cols[8]\t$cols[9]\t$cols[10]\t$cols[11]"; } exit '
Thanks,
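One possible approach, offered as a rough sketch rather than a tested pipeline: in Illumina 1.8+ FASTQ headers the read number is the first colon-separated value after the space, so awk can pick it out of the second whitespace-separated field on every fourth line. This assumes plain 4-line FASTQ records (no wrapped sequence lines) and it drops the rest of the header comment (filter flag, control number, index). The function name add_pair_suffix is only an illustrative name, not an existing tool.

```shell
# Hypothetical helper: add_pair_suffix is an illustrative name, not an existing tool.
# Assumes 4-line FASTQ records and Illumina 1.8+ headers of the form
#   @<read id> <read number>:<is filtered>:<control number>:<index>
add_pair_suffix() {
    awk 'NR % 4 == 1 && NF >= 2 {
             split($2, meta, ":")    # meta[1] holds the read number (1 or 2)
             print $1 "/" meta[1]    # @id 1:N:0:ATCACG becomes @id/1
             next
         }
         { print }'                  # sequence, "+", and quality lines pass through
}

# Example: only the header changes, to @EAS139:136:FC706VJ:2:2104:15343:197393/1
printf '%s\n' '@EAS139:136:FC706VJ:2:2104:15343:197393 1:N:0:ATCACG' \
    'GATTACA' '+' 'IIIIIII' | add_pair_suffix
```

For a real file this could be run as `add_pair_suffix < in.fastq > out.fastq` (both file names are placeholders). Because awk streams the input line by line, memory use should stay flat even for files of tens of Gb.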