Dear all,
I have paired-end FASTQ data that was sequenced on a NovaSeq and generated with Illumina bcl2fastq v2.19. The i5 index is 7 bp long and the i7 index is 8 bp long.
R1.fastq.gz contains the 101 bp R1 reads.
R2.fastq.gz contains the 6 bp UMI sequences.
R3.fastq.gz contains the 101 bp R2 reads.
In a downstream analysis I want to use UMI-tools for deduplication. For that, the UMI needs to be part of the read name, in this format:
@Instrument:RunID:FlowCellID:Lane:Tile:X:Y:UMI ReadNum:FilterFlag:0:IndexSequence or SampleNumber
There are tools that add a UMI to the read name when the UMI is present in the read itself. In my case, however, the UMI is in a separate FASTQ file. How could this be achieved?
R1.fastq.gz (101 bp R1 reads):
Code:
@A00154:125:HGKTMDMXX:1:1101:10420:1000 1:N:0:AACTGAGG+ATGCGTC
CTGGCCGTCTCAGCCGAGAAGCCGAGGATTGAATGGGCATGGAGACTGAACTACCCCTCTCACCTTTAGAGGTGGCTCCTCCAAGTCGGGGTTGACGCCCG
+
FFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
R2.fastq.gz (6 bp UMI):
Code:
@A00154:125:HGKTMDMXX:1:1101:10420:1000 2:N:0:AACTGAGG+ATGCGTC
GCGCGT
+
FFFFFF
R3.fastq.gz (101 bp R2 reads):
Code:
@A00154:125:HGKTMDMXX:1:1101:10420:1000 3:N:0:AACTGAGG+ATGCGTC
CTTCATAGGCCACAAAAAGCCCATATATCAGTGTCATCCACTAAGCCTCAGACACTGCAGCACGGGCAGCGGCAGTGCCAGCTTCGCCCACACTGCCCCTC
+
FFFFFFFFFFFFFFFFFFFFFF:FF:FFF:FFFFFF:FFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
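One possible approach, given here as a minimal sketch and not a tested pipeline, is to read the three files in lockstep and append the UMI from R2.fastq.gz to the read ID of the matching R1 and R3 records. The sketch assumes that all three files list the same reads in the same order, as in the example records. The function names (`read_fastq`, `add_umi`, `tag_pairs`) and the output file names are hypothetical. The UMI is placed before the first space, since aligners typically keep only that part of the header. The separator defaults to `_`, which to my knowledge is what `umi_tools dedup` expects by default; if you prefer `:` as in the format above, pass `sep=":"` and set `--umi-separator=":"` when running UMI-tools.

```python
import gzip
from itertools import zip_longest


def read_fastq(handle):
    """Yield (header, sequence, quality) from an open text-mode FASTQ handle."""
    while True:
        header = handle.readline()
        if not header:  # end of file
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header.rstrip("\n"), seq, qual


def add_umi(header, umi, sep="_"):
    """Append the UMI to the read ID, i.e. the part before the first space."""
    read_id, _, comment = header.partition(" ")
    tagged = f"{read_id}{sep}{umi}"
    return f"{tagged} {comment}" if comment else tagged


def tag_pairs(r1_path, umi_path, r3_path, out1_path, out2_path, sep="_"):
    """Write copies of the R1 and R3 files with the UMI added to each read name."""
    with gzip.open(r1_path, "rt") as r1, gzip.open(umi_path, "rt") as um, \
         gzip.open(r3_path, "rt") as r3, gzip.open(out1_path, "wt") as o1, \
         gzip.open(out2_path, "wt") as o2:
        records = zip_longest(read_fastq(r1), read_fastq(um), read_fastq(r3))
        for rec1, rec_umi, rec3 in records:
            if rec1 is None or rec_umi is None or rec3 is None:
                raise ValueError("input files have different numbers of records")
            # Assumption: the same read sits at the same position in all files.
            ids = {rec[0].split(" ")[0] for rec in (rec1, rec_umi, rec3)}
            if len(ids) != 1:
                raise ValueError(f"read IDs out of sync: {sorted(ids)}")
            umi = rec_umi[1]  # the sequence line of the UMI record
            for out, (header, seq, qual) in ((o1, rec1), (o2, rec3)):
                out.write(f"{add_umi(header, umi, sep)}\n{seq}\n+\n{qual}\n")
```

It could be called as `tag_pairs("R1.fastq.gz", "R2.fastq.gz", "R3.fastq.gz", "R1.umi.fastq.gz", "R3.umi.fastq.gz")`. For the example record above, the R1 header would become `@A00154:125:HGKTMDMXX:1:1101:10420:1000_GCGCGT 1:N:0:AACTGAGG+ATGCGTC`.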