Nowadays, most NGS assembly pipelines end in a scaffolds FASTA file. Typically, we have no assembly files (ace, contig, afg, among others). Thus, all we have to work with is a scaffolds file (FASTA) and the original reads (typically FASTQ). SAM files can easily be generated by aligning the reads to the reference. The question is: is it possible to use this information to load an AMOS bank and run some of the very useful pipelines, such as amosvalidate? Has anyone scripted something (sam2ace, sam2afg, sam2amos, bam2ace, bam2amos or anything similar) and is willing to share it?
I think such a piece of software would be both interesting to develop and very useful, since it would let us take advantage of the powerful AMOS tools. I understand that there are technical problems in going from SAM to an assembly format. For example, what should be done with reads that have more than one mapping? Proper read-pairing information is also crucial for producing an assembly format that is useful to amosvalidate. There might be many other problems in this conversion that I am not aware of yet.
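To make the multi-mapping and pairing questions concrete, here is a rough sketch of how a converter might pick one placement per read from a SAM file. It relies only on the FLAG bits defined in the SAM specification. The function name `primary_placements` and the dictionary layout are my own invention for illustration, and keeping only the primary alignment is just one possible policy, not necessarily what amosvalidate needs. It does not attempt to write any AMOS format.

```python
# SAM FLAG bits, as defined in the SAM specification
PAIRED = 0x1
PROPER_PAIR = 0x2
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def primary_placements(sam_lines):
    """Yield one placement per mapped read end (hypothetical helper).

    Assumption: a read with several mappings is represented by its
    primary alignment only; secondary and supplementary records are
    dropped. Each mate of a pair is reported as its own record.
    """
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        if flag & (UNMAPPED | SECONDARY | SUPPLEMENTARY):
            continue
        yield {
            "read": fields[0],
            "scaffold": fields[2],
            "start": int(fields[3]),  # 1-based leftmost position
            "mapq": int(fields[4]),
            "proper_pair": bool(flag & PAIRED) and bool(flag & PROPER_PAIR),
        }
```

It could be called as `for p in primary_placements(open("reads.sam")): ...`, where `reads.sam` is a placeholder file name. The `mapq` value is kept so that ambiguous placements can still be filtered later, since how aligners score multi-mapped reads differs from tool to tool.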
Are there any other ideas you would like to share? If this tool does not exist, I would be very willing to develop and share it.