Hello all
I just registered on SEQanswers on a friend's recommendation. I have worked in molecular genomics since 1996, starting in livestock genomics in Barcelona and currently working on the genetics of human diseases. I am currently analysing three overlapping human BACs (each between 150 and 180 kb, together spanning a genomic region of about 340 kb) that have been sequenced with 454 technology (done abroad). The files we have correspond to i) the reads, ii) the reads assembled into contigs, and iii) the contigs ordered into scaffolds (by paired-end tag sequencing). The file types are .fna (FASTA), .qual (Phred-equivalent quality scores), .tsv (tab-delimited file with position-by-position consensus base and flow signal information), .sff (input file used in the assembly) and .ace (to be opened in other viewer programs).
I would now like to align these consensus sequences (all the scaffolds) to the corresponding region of the human reference genome and to other sequences, with the aim of finding the existing polymorphisms. From the alignment, I want to extract a variation table as output, containing for each polymorphism its position, the flanking sequence, and the allele present in each sequence.
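To make the intended output concrete, here is a minimal Python sketch of the kind of variation table I have in mind. It assumes the sequences have already been aligned by some other tool and are available as gapped strings of equal length. The function name `variation_table`, the row fields, and the toy sequences are purely illustrative, and it ignores quality scores and 454 homopolymer errors.

```python
def variation_table(aligned, ref_name, flank=5):
    """Return one row per alignment column where the sequences disagree.

    aligned  : dict mapping sequence name -> gapped sequence ('-' for gaps),
               all of the same length (assumed to come from an aligner).
    ref_name : key of the sequence used for coordinates and context.
    flank    : number of alignment columns of context on each side.
    """
    ref = aligned[ref_name]
    length = len(ref)
    if any(len(seq) != length for seq in aligned.values()):
        raise ValueError("all aligned sequences must have the same length")

    rows = []
    ref_pos = 0  # 1-based position in the ungapped reference
    for col in range(length):
        if ref[col] != "-":
            ref_pos += 1
        # Allele carried by each sequence at this column ('-' means a gap).
        alleles = {name: seq[col].upper() for name, seq in aligned.items()}
        if len(set(alleles.values())) > 1:
            # Reference bases around the column, with gap characters removed.
            context = ref[max(0, col - flank):col + flank + 1].replace("-", "")
            rows.append({
                "column": col + 1,    # 1-based alignment column
                "ref_pos": ref_pos,   # last reference base at or before this column
                "context": context,
                "alleles": alleles,
            })
    return rows


if __name__ == "__main__":
    # Toy alignment, invented for illustration only.
    toy = {
        "reference": "ACGTACGTAC",
        "scaffold1": "ACGTTCGTAC",
        "scaffold2": "ACG-ACGTAC",
    }
    for row in variation_table(toy, "reference", flank=2):
        print(row["ref_pos"], row["context"], row["alleles"])
```

Each row would then correspond to one line of the table: position, flanking sequence, and the allele in every compared sequence.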
However, I haven't seen any software that fits our requirements (exporting a list of polymorphisms between large compared sequences). I have tried several alignment tools such as Geneious, BioEdit and ClustalW but, as expected, neither the computer nor the alignment tool is powerful enough for such an analysis.
I see three different options to proceed:
1) We can write our own program and run it on our computers. Is that the best / only way to proceed?
2) We can download an existing program and run it on our computers. Is there a program that I can freely download and run on our servers?
3) We can use online software where we can upload our sequences and run the analysis. Is there any specific website for that?
If we go for option 2) or 3), does anyone know of a program that would align such large sequences (pairwise and / or multiple alignment) and also produce a list of the polymorphisms?
Thanks in advance
Kindest regards
Alex Clop
Molecular and Medical Genetics
9th Floor Guy's Tower
KCL
St Thomas' St
London SE1 9RT
UK
Tel.: +44(0)20 7188 9505
Fax: +44(0)20 7188 8050
e-mail: [email protected]