Hi colleagues, I have searched the forum and Google, and it seems my issue is a little obscure.
I am performing in-silico analysis of raw level 1 TCGA sequence reads and was hoping to generate "pseudo"-simulated FASTQ files. That is, from the BAM file and the unaligned FASTQ file from the same individual, I would like to recreate the full FASTQ file so that I can replicate the TCGA pipeline and then test some bespoke pipelines I am trying to get a handle on.
What I have done so far is:
1. Use samtools bamshuf to randomise the BAM file.
2. Convert the randomised BAM file to FASTQ.
This is where I am stuck: if I concatenate this FASTQ file with the unaligned FASTQ files, all I get is the unaligned reads stacked right after all the reads from the BAM file.
Is there a tool that can randomise the order of sequence reads in a FASTQ file? It would need to work for paired-end data as well, keeping the two reads of each pair together.
Apologies if I have not refined my searches enough and there is something obvious available. I am hoping there is something short of learning a programming language and writing the script from scratch!
Cheers
Nick
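In case no ready-made tool turns up, the shuffle described above can be sketched in a few lines of Python. This is only a minimal sketch under some assumptions: each FASTQ record is exactly four lines (no wrapped sequences), the two mate files list their reads in the same order, and everything fits in memory, which may not hold for full TCGA samples. The function names and file names are hypothetical and not part of any existing tool.

```python
import random
from itertools import islice


def read_fastq(handle):
    """Yield FASTQ records as 4-tuples of lines from an open text handle.

    Assumes strict 4-line records (header, sequence, '+', qualities).
    """
    while True:
        lines = tuple(islice(handle, 4))
        if not lines:
            return
        if len(lines) != 4:
            raise ValueError("truncated FASTQ record")
        # Make sure every line ends with a newline, so a final record
        # without one does not run into the next record after shuffling.
        yield tuple(line.rstrip("\n") + "\n" for line in lines)


def shuffle_paired(records_1, records_2, seed=None):
    """Apply one random permutation to both mate lists so pairs stay in step."""
    if len(records_1) != len(records_2):
        raise ValueError("mate files contain different numbers of reads")
    order = list(range(len(records_1)))
    random.Random(seed).shuffle(order)
    return [records_1[i] for i in order], [records_2[i] for i in order]


def shuffle_paired_files(in_1, in_2, out_1, out_2, seed=None):
    """Read two mate FASTQ files, shuffle them together, write the results."""
    with open(in_1) as f1, open(in_2) as f2:
        records_1 = list(read_fastq(f1))
        records_2 = list(read_fastq(f2))
    shuffled_1, shuffled_2 = shuffle_paired(records_1, records_2, seed)
    with open(out_1, "w") as o1, open(out_2, "w") as o2:
        for record in shuffled_1:
            o1.writelines(record)
        for record in shuffled_2:
            o2.writelines(record)
```

To mix the unaligned reads in with the BAM-derived reads, the idea would be to concatenate the two sources first (separately for each mate file) and then shuffle the combined files, for example `shuffle_paired_files("combined_1.fastq", "combined_2.fastq", "shuffled_1.fastq", "shuffled_2.fastq", seed=42)`, where the file names are placeholders. Because the same permutation is applied to both files, read N in one output is still the mate of read N in the other.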