I have a specific problem. We want to discover isoforms of a gene based on long reads. We have a population of overlapping sequences, all under 1000 bp. I aligned them with Clustal and would now like to trim the edges so that only the region common to all sequences remains.
So, I need to convert an alignment like this:
>>>>>>>>>>atgcatgcatgcqtgatgctgatgctagtgctagtgctagtcgtagctga
gcatgcatgcatgcatgcatgcqtgatgctgatgctagtgctagtgcta
>>>>>>catgcatgcatgcatgcqtgatgctgatgctagtg
>>>>>>>>>>>>gcatgcatgcatgcqtgatgctgatgctagtgctagt
to
atgcatgcatgcqtgatgctgatgctagtg
atgcatgcatgcqtgatgctgatgctagtg
atgcatgcatgcqtgatgctgatgctagtg
atgcatgcatgcqtgatgctgatgctagtg
Is there a way to do that in Linux? Right now the alignment is in CLC Bio, but I can export it to SAM/BAM or possibly to text as well.
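My understanding is that if I produce a text file with the trimmed alignment, I can then sort it and run the uniq -c command in Linux to get both the unique sequences in my list and the count (-c) for each of them. (The sorting is needed because uniq only collapses adjacent identical lines.)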
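For concreteness, the trimming and counting described above could be sketched in shell roughly as follows. This is only a sketch under assumptions that are not confirmed by anything above: the alignment is exported as plain text with one read per line, all rows are column-aligned, and positions outside a read are padded with ">" or "-". The function name trim_common and the toy reads are made up for illustration.

```shell
# Hypothetical helper: read a padded alignment on stdin (one read per line)
# and print only the columns that are covered by every read.
# Assumption: ">" or "-" marks columns outside a read at either edge.
trim_common() {
  awk '
    NF == 0 { next }                        # skip blank lines
    {
      rows[++n] = $0
      left = $0;  sub(/^[>-]+/, "", left)   # row without leading padding
      right = $0; sub(/[>-]+$/, "", right)  # row without trailing padding
      lead = length($0) - length(left)      # number of leading pad columns
      last = length(right)                  # last column covered by this read
      if (n == 1 || lead > start) start = lead   # latest read start wins
      if (n == 1 || last < stop)  stop = last    # earliest read end wins
    }
    END {
      # print the shared window of each row (empty if there is no overlap)
      for (i = 1; i <= n; i++) print substr(rows[i], start + 1, stop - start)
    }
  '
}

# Toy input: four made-up, column-aligned reads; the last differs by one base.
printf '%s\n' \
  '>>>>acgtacgtaa' \
  'ttgaacgtacgt' \
  '>>gaacgtacgtaacc' \
  '>>>>acgaacgta' |
  trim_common | sort | uniq -c
```

With a real export, the printf part would be replaced by something like `trim_common < alignment.txt | sort | uniq -c`, where alignment.txt is a placeholder file name. Internal gap characters inside the shared window are left untouched, so reads that differ only by gaps would still be counted as distinct sequences.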
Any advice on how best to do that is appreciated. I am not too handy with Perl or Python though, so shell scripts or some Unix tricks would be preferable.
Thanks for any input!