Hi All,
I’d like to do a hybrid assembly of 454 genomic sequence and Illumina RNAseq data using MIRA. Unfortunately, I’m not sure how to run an assembly that combines two different technologies AND two different types of data (genomic DNA from 454 and RNA from Illumina). I’m concerned that highly expressed genes will be flagged as repetitive sequence, because reads from those genes will be over-represented in the sample.
I have a draft assembly of a bacterial genome generated from 454 reads using Newbler. However, when I mapped my RNAseq data to this assembly, only a very low percentage of reads mapped back to the genome. I did this previously for a different bacterial isolate, and everything worked well (>97% of reads mapped to the genome assembled from the 454 data). For this latest strain, I think there is a problem with either the coverage or the assembly, so I’d like to use as much data as possible to get a better assembly.
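The mapping percentage mentioned above can be computed from a SAM file along these lines. This is a minimal sketch under assumptions, not the exact procedure used here: the function name `mapping_rate` and the example records are hypothetical, and it assumes the aligner writes unmapped reads to the SAM output with flag bit 0x4 set.

```python
def mapping_rate(sam_lines):
    """Return the fraction of reads flagged as mapped in an iterable of SAM lines.

    Header lines (starting with '@') are skipped. Secondary (0x100) and
    supplementary (0x800) alignments are ignored so that each read is
    counted once.
    """
    total = 0
    mapped = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # blank line or SAM header
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # column 2 of a SAM record is the bitwise FLAG
        if flag & 0x100 or flag & 0x800:
            continue  # not a primary alignment record
        total += 1
        if not flag & 0x4:  # bit 0x4 set means "read unmapped"
            mapped += 1
    return mapped / total if total else 0.0


# Hypothetical records for illustration only.
example = [
    "@SQ\tSN:contig1\tLN:1000",
    "r1\t0\tcontig1\t1\t60\t5M\t*\t0\t0\tACGTA\tIIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGTA\tIIIII",
    "r3\t16\tcontig1\t10\t60\t5M\t*\t0\t0\tACGTA\tIIIII",
    "r3\t256\tcontig1\t50\t0\t5M\t*\t0\t0\tACGTA\tIIIII",
]
rate = mapping_rate(example)
```

With a real file, the same function can be called as `mapping_rate(open("aln.sam"))`, where `aln.sam` is a placeholder file name.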
Does anyone have any advice on the best way to incorporate both data types into a de novo genome assembly using MIRA?
Thanks for all your help.