A bit of an open question looking for debate.
The SAM format has a fair amount of support for mate-pair / paired-end sequences, both in its flags and in specific fields.
A number of sequencing approaches can generate more than two tags per fragment. For example, Polonator reads are really mate quads. Complete Genomics takes this even further. Conceptually, SOLiD and perhaps Illumina could generate 4 tags from "jumping" libraries. Helicos and PacBio (and presumably the VisGen/Life technology) can use pulses of unlabeled nucleotides to potentially generate a very large number of linked reads.
Any thoughts on how to accommodate these? Or will a new format be required?
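
To make the question concrete, here is a minimal sketch of what the flag bits alone can say about a record's place in its fragment. It assumes the FLAG bit meanings given in the current SAM specification (0x1 for a multi-segment template, 0x40 for the first segment, 0x80 for the last); the function name `segment_position` and the returned labels are made up for illustration and are not part of any library.

```python
# SAM FLAG bits, as defined in the current SAM specification.
FLAG_MULTI_SEGMENT = 0x1   # template has multiple segments
FLAG_FIRST = 0x40          # first segment in the template
FLAG_LAST = 0x80           # last segment in the template


def segment_position(flag: int) -> str:
    """Classify a record's position in its template from FLAG alone.

    Hypothetical helper for illustration; the labels are not SAM terms.
    """
    if not flag & FLAG_MULTI_SEGMENT:
        return "single"
    first = bool(flag & FLAG_FIRST)
    last = bool(flag & FLAG_LAST)
    if first and last:
        # Both bits set: part of a linear template, neither first nor last.
        return "middle"
    if first:
        return "first"
    if last:
        return "last"
    # Neither bit set: the index within the template is unknown.
    return "unknown"
```

For a mate quad, tags 2 and 3 would both come back as "middle", so the flag bits by themselves cannot say which of the two comes first. That ordering would have to be carried somewhere else.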