Hi all.
I'm trying to identify genes and annotate their corresponding regions. My starting points are the original genes (from a library) from one species and the WGS assembly from another species.
So first, I retrieved all contigs with any similarity (tblastx) to my gene, and then tried to reconstruct the region surrounding each contig in the family (blastn against the original assembly).
My question is: I know which contigs belong to the same cluster (from blastn similarities), but when I build a multiple sequence alignment of these related contigs... it's a mess. Maybe the algorithms in Clustal or T-Coffee are not suitable for this purpose. What do you recommend for constructing a super-contig?
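To make the clustering step concrete, here is a minimal sketch of how I group the contigs. It assumes the blastn results are in the standard 12-column tabular format (`-outfmt 6`), and treats two contigs as related when a hit passes an identity and length cutoff. The function name `cluster_contigs` and the cutoff values are just placeholders I made up for illustration, not recommended settings.

```python
from collections import defaultdict


def cluster_contigs(blast_lines, min_identity=90.0, min_length=200):
    """Group contigs into clusters linked by blastn hits (single linkage).

    blast_lines: iterable of tab-separated lines in BLAST outfmt 6
    (qseqid, sseqid, pident, length, ...). The thresholds are hypothetical.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        identity, length = float(fields[2]), int(fields[3])
        # Register both contigs even if the hit is too weak to link them.
        root_q, root_s = find(query), find(subject)
        if root_q != root_s and identity >= min_identity and length >= min_length:
            parent[root_q] = root_s

    clusters = defaultdict(set)
    for contig in list(parent):
        clusters[find(contig)].add(contig)
    # Largest clusters first, members sorted by name.
    return sorted((sorted(c) for c in clusters.values()),
                  key=lambda c: (-len(c), c))
```

Each cluster that comes out of this is what I then tried to align with Clustal / T-Coffee. Note that this is single linkage, so one spurious hit is enough to merge two otherwise unrelated groups.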
Would you use a different workflow / pipeline for this kind of gene hunting in database assemblies?
Thanks (and sorry, I'm a complete newbie at this)