Dear all,
We have sequenced human BAC clones using 454 sequencing technology. During assembly into a consensus sequence (CONS), using two reference sequences in parallel, many reads were not incorporated into the resulting consensus.
Afterwards, I did a de novo assembly (using no reference sequence) of these unmapped reads, and I am currently analysing the resulting contigs.
I found two different scenarios: (A) the resulting contigs correspond to the cloning vector or to E. coli DNA (traces of bacterial DNA not eliminated during the maxiprep); (B) some other contigs do map to our human target region.
(A) This applies to most of the contigs, including the longest ones and those with the deepest read coverage.
(B) When mapping these contigs to the reference sequences, some of them behave similarly to paired-end tags (PET), but with orientations or distances between the aligned segments that differ from those expected for PET (3 kb in our case). I do not believe they correspond to structural variation between my template and these references.
It is worth mentioning that i) most of the contigs in scenario (B) are around 200 – 500 bp and none exceed 1300 bp; ii) whilst the coverage in CONS is around 80-fold, the coverage of most of these contigs is between 2- and 3-fold, and few of them exceed 10-fold.
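To make the comparison in (B) concrete, here is a rough sketch of how a contig that aligns as two separate segments could be checked against the PET expectation. It is only an illustration: the `Segment` class, the `classify_split_contig` function, the 25% distance tolerance and the assumption that PET-like segments lie on the same strand are all hypothetical choices of mine and would need adjusting to the actual library and aligner output.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One aligned piece of a contig on the reference (hypothetical layout)."""
    ref_start: int  # 0-based start on the reference
    ref_end: int    # end on the reference (exclusive)
    strand: str     # "+" or "-"


def classify_split_contig(a, b, expected_distance=3000, tolerance=0.25):
    """Compare two aligned segments of one contig with the PET expectation.

    Assumptions (not taken from any pipeline): PET-like segments share a
    strand, and the gap between them is within `tolerance` of
    `expected_distance` (3 kb in our case).
    """
    # Order the segments by position on the reference.
    first, second = sorted((a, b), key=lambda s: s.ref_start)
    gap = second.ref_start - first.ref_end
    same_strand = first.strand == second.strand
    within = abs(gap - expected_distance) <= tolerance * expected_distance

    if same_strand and within:
        return "PET-like"
    if same_strand:
        return "unexpected distance"
    if within:
        return "unexpected orientation"
    return "unexpected orientation and distance"


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    print(classify_split_contig(Segment(1000, 1200, "+"), Segment(1500, 1700, "-")))
```

Contigs falling into any of the "unexpected" classes are the ones I am asking about.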
I would appreciate it if anyone who has observed these kinds of reads / contigs in their 454 analysis could let me know.
I am also wondering how common this type of read / contig is and why it occurs. Does anyone know?
Thank you in advance for your help.
With kindest regards,
Alex