I ran an RNA-Seq pipeline on human skin samples using TopHat2 with hg19 as the reference genome. Because I wanted to find contaminants from other organisms, I took the reads in unmapped.bam, converted them to FASTQ files, and used them as input for de novo RNA-Seq assembly with Trinity.
I don't understand why some of the contigs generated by Trinity align to Ensembl transcripts with 100% identity over alignments of about 3000 bp.
Why were the reads from which those contigs were assembled in the unmapped.bam file produced by TopHat?
Any idea?
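
In case it helps with diagnosing this, here is a minimal sketch of one check I could run: counting the reads in unmapped.bam by what their SAM flags say about the mate. It assumes the BAM has been converted to SAM text (for example with `samtools view`) and that the flags TopHat writes to unmapped.bam are reliable, which I have not verified. The function name `classify_unmapped` and the category labels are made up for illustration; the flag bit values are the ones from the SAM specification.

```python
from collections import Counter

# SAM flag bits, as defined in the SAM specification
PAIRED = 0x1         # read is part of a pair
UNMAPPED = 0x4       # read itself is unmapped
MATE_UNMAPPED = 0x8  # mate is unmapped
QC_FAIL = 0x200      # read failed platform/vendor quality checks


def classify_unmapped(sam_lines):
    """Count unmapped reads by mate status, given SAM-formatted text lines.

    Hypothetical helper for illustration; not part of TopHat or Trinity.
    """
    counts = Counter()
    for line in sam_lines:
        # Skip blank lines and header lines (headers start with '@')
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # second SAM column is the bitwise FLAG
        if not flag & UNMAPPED:
            continue  # only unmapped reads are of interest here
        if flag & QC_FAIL:
            counts["qc_fail"] += 1
        elif not flag & PAIRED:
            counts["unpaired"] += 1
        elif flag & MATE_UNMAPPED:
            counts["both_mates_unmapped"] += 1
        else:
            counts["mate_mapped"] += 1
    return counts


# Tiny made-up example records (not real data)
example = [
    "@HD\tVN:1.0\tSO:unsorted",
    "r1\t69\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",   # 69 = 1+4+64: unmapped, mate mapped
    "r2\t77\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",   # 77 = 1+4+8+64: both mates unmapped
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",    # 4: unmapped, not paired
    "r4\t516\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",  # 516 = 4+512: unmapped, QC fail
]
print(classify_unmapped(example))
```

On real data the lines would come from the SAM text of unmapped.bam rather than from the `example` list. A large "mate_mapped" count would suggest that many of these reads belong to human fragments whose other end did align to hg19.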