Hello everyone,
I am new to bioinformatics, so apologies if the question is a bit odd. I have looked everywhere and did not find a solution.
I am mapping my paired-end reads to a viral genome with NextGenMap. From the resulting BAM file, I extracted the pairs in which both mates mapped (I discarded reads that mapped but whose mate was unmapped, so I only have complete pairs).
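To make the filter concrete, here is a minimal Python sketch of the flag test I mean. It assumes the standard SAM flag bits; the function name `both_mates_mapped` is only for illustration and this is not the exact command I ran.

```python
# SAM flag bits, as defined in the SAM specification
PAIRED = 0x1          # read is part of a pair
UNMAPPED = 0x4        # this read is unmapped
MATE_UNMAPPED = 0x8   # the mate of this read is unmapped


def both_mates_mapped(flag: int) -> bool:
    """True if the read is paired and neither it nor its mate is unmapped."""
    return bool(flag & PAIRED) and not (flag & (UNMAPPED | MATE_UNMAPPED))
```

As far as I understand, this corresponds to something like `samtools view -f 1 -F 12`, which requires the paired bit and excludes any read with the 0x4 or 0x8 bit set.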
With this viral_paired_end_mapped.bam, I want to filter out possible human reads.
To do that, I converted viral_paired_end_mapped.bam back into FASTQ files (R1 and R2) and mapped them to hg19. From this final BAM file, I kept only the pairs in which both mates are unmapped (Viralmapped_Humanunmapped paired end). Those are the sequences I want: paired-end reads that map to the virus and not to human.
I now need to continue with coverage analysis, etc., so I need to merge several BAM files by sample ID. My question is:
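In case the conversion step matters, this is roughly what happens when an alignment record is turned back into a FASTQ record. It is only a sketch, assuming SAM text lines as input (for example from `samtools view`); the function name `sam_line_to_fastq` is hypothetical. Reads aligned to the reverse strand are stored reverse-complemented in the BAM, so they have to be flipped back.

```python
REVERSE = 0x10        # read is aligned to the reverse strand
FIRST_IN_PAIR = 0x40  # read is the first mate (R1)

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def sam_line_to_fastq(line: str) -> tuple[str, str]:
    """Return ("R1" or "R2", FASTQ record) for one SAM alignment line."""
    fields = line.rstrip("\n").split("\t")
    name, flag, seq, qual = fields[0], int(fields[1]), fields[9], fields[10]
    if flag & REVERSE:
        # Restore the original read orientation
        seq = seq.translate(_COMPLEMENT)[::-1]
        qual = qual[::-1]
    mate = "R1" if flag & FIRST_IN_PAIR else "R2"
    return mate, f"@{name}\n{seq}\n+\n{qual}\n"
```

For the human step, the flag test is the opposite of the viral one: both the 0x4 and 0x8 bits must be set, which as far as I understand is what `samtools view -f 12` selects.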
Is it possible to merge the "Viralmapped_Humanunmapped paired end.bam" files and then do a coverage analysis against the viral reference?
Or should I go back to my first BAM file (viral_paired_end_mapped.bam) and keep only the reads whose sequence IDs appear in "Viralmapped_Humanunmapped paired end.bam"?
If so, does anyone know how to do it?
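To show what I have in mind for the second option, here is a minimal Python sketch. It assumes both BAM files have been converted to SAM text (for example with `samtools view -h`) and that read names are identical in both files; the function names are hypothetical and I have not run this on my real data.

```python
def load_read_names(sam_lines):
    """Collect read names (QNAME, column 1) from SAM alignment lines."""
    return {
        line.split("\t", 1)[0]
        for line in sam_lines
        if not line.startswith("@")  # skip header lines
    }


def keep_reads_by_name(sam_lines, wanted_names):
    """Yield header lines plus alignments whose read name is in wanted_names."""
    for line in sam_lines:
        if line.startswith("@") or line.split("\t", 1)[0] in wanted_names:
            yield line
```

The idea would be to call `load_read_names` on the human-unmapped file and then `keep_reads_by_name` on the viral file, so the output keeps the viral header and coordinates. I believe newer samtools versions also have a `-N` option on `samtools view` that takes a file of read names, but I am not sure it is available in my version.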