Hi everyone,
We have RNA-seq data (no reference genome). We ran several assemblers and then merged all of their results with CAP3, which gave ~60,000 contigs.
The problem is that these contigs are redundant, so the percentage of uniquely aligned reads is very low.
We tried TGICL, as suggested by one of the members of this forum, to reduce the redundancy.
The TGICL output contains an ACE file and a singlets file. The problem is that ~40,000 of the contigs in the TGICL input file do not appear in either the ACE file or the singlets file.
Any help with interpreting the TGICL output, or other suggestions on how to map reads to the "redundant" contigs, would be appreciated.
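In case it helps to pin down exactly which sequences go missing, here is a rough sketch of how the input IDs could be compared against the TGICL output. It assumes that the input contigs show up as `AF` (member) lines in the ACE file, and that the singlets file is either FASTA or a plain list with one ID per line; I have not confirmed which of these TGICL writes. The function names and the three command-line arguments are made up for illustration.

```python
import sys


def fasta_ids(lines):
    """Collect sequence IDs (first token after '>') from FASTA lines."""
    ids = set()
    for line in lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                ids.add(tokens[0])
    return ids


def ace_member_ids(lines):
    """Collect member sequence names from the 'AF <name> <U|C> <pos>' lines of an ACE file."""
    ids = set()
    for line in lines:
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "AF":
            ids.add(parts[1])
    return ids


def singlet_ids(lines):
    """Collect IDs from a singlets file (assumed: FASTA, or one ID per line)."""
    cleaned = [line.strip() for line in lines if line.strip()]
    if any(line.startswith(">") for line in cleaned):
        return fasta_ids(cleaned)
    return {line.split()[0] for line in cleaned}


def missing_ids(input_fasta_lines, ace_lines, singlet_lines):
    """Return the sorted input IDs found in neither the ACE file nor the singlets file."""
    accounted_for = ace_member_ids(ace_lines) | singlet_ids(singlet_lines)
    return sorted(fasta_ids(input_fasta_lines) - accounted_for)


if __name__ == "__main__" and len(sys.argv) == 4:
    # Hypothetical usage: python check_missing.py input.fasta output.ace singlets_file
    with open(sys.argv[1]) as fa, open(sys.argv[2]) as ace, open(sys.argv[3]) as sing:
        for seq_id in missing_ids(fa, ace, sing):
            print(seq_id)
```

If the list of missing IDs comes out much shorter than ~40,000, the sequences are probably present under a different file or naming scheme rather than actually dropped.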