This is my first post, so I would like to start with a hello to all SEQanswers users.
What I would like to do is to map the coverage/abundance of reads over the whole mt genome. I would be very thankful for any tips and advice on the best way of doing this (tools, scripts, programs).
I want to make a graph with the mt genome on the x-axis and the coverage/abundance of reads on the y-axis. Yes, the mitochondrial genome has only one starting point for transcription, so I guess I could expect a homogeneous distribution over the whole length. But when I tried to assemble the mt genome from the 454 reads, the assembly was not complete, and some regions got a lot of hits from reads starting at the same position (technical artefact?).
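To make the graph idea concrete, here is a minimal Python sketch of the calculation behind it. It assumes the reads have already been aligned to the mt genome by some tool and that each alignment is available as a 1-based, inclusive (start, end) pair in genome coordinates. The function names `coverage_depth` and `start_pileups` are made up for illustration, and reads spanning the origin of the circular genome are not handled.

```python
from collections import Counter


def coverage_depth(intervals, genome_length):
    """Return per-base depth for 1-based inclusive (start, end) intervals."""
    # Difference array: +1 where an alignment begins, -1 just past its end.
    diff = [0] * (genome_length + 1)
    for start, end in intervals:
        if start > end:  # minus-strand hits may be reported end-first
            start, end = end, start
        start = max(start, 1)
        end = min(end, genome_length)
        if start > end:
            continue  # alignment lies entirely outside the genome
        diff[start - 1] += 1
        diff[end] -= 1
    depth = []
    running = 0
    for i in range(genome_length):
        running += diff[i]
        depth.append(running)
    return depth


def start_pileups(intervals, min_count):
    """Return {position: count} for leftmost positions shared by many alignments."""
    starts = Counter(min(start, end) for start, end in intervals)
    return {pos: n for pos, n in starts.items() if n >= min_count}


if __name__ == "__main__":
    # Toy data only: three alignments on a 6 bp "genome".
    toy = [(1, 3), (2, 5), (2, 4)]
    print(coverage_depth(toy, 6))
    print(start_pileups(toy, 2))
```

The depth list can be plotted directly against position (index 0 is base 1), and `start_pileups` gives a quick way to list positions where suspiciously many reads begin.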
First-strand synthesis was done from the polyA end towards the 5' end, and sequencing was directed from the 5' end (directionally sequenced EST library).
What I have is:
1) a non-normalized EST library from about one 454 run (5 sff files), with a lot of sequences of mitochondrial origin
2) a complete mitochondrial genome (circular, 16,150 nucleotides, Sanger sequenced with proofreading Taq).
First I have to make up my mind whether I should use the reads or the contigs. When I blast the mt genome against all contigs (46,375), I get 5,594 hits with a cut-off value of 1.0e-03. The bit scores range from 1,300 down to 50, so sometimes nearly the whole length of a contig is matched and sometimes just a tiny fraction of it (which I find strange). I am afraid that this might mess up the graph! I may be able to write a Perl script (I have very basic skills, but with enough time...) that parses out just the aligned parts into a new FASTA file, and then find a program that can map/plot those against the mt genome. Would it be possible to use that FASTA file with MIRA to make a new assembly with the mt genome file as a scaffold? The reason I ask is that MIRA produces an .ace file that can be read by Tablet (to get a stacked graph that visualises the abundance of sequences over the mt genome).
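The parsing step could also be done without extracting sequences at all, by reading the hit coordinates straight from the BLAST report. Below is a rough Python sketch under two assumptions that are not stated above: that BLAST was run with tab-separated output in the standard 12-column layout (as produced by `-outfmt 6` in BLAST+), and that the mt genome was the query, so columns 7 and 8 hold the mt genome coordinates. The function name `mt_intervals_from_blast_tab` and the example lines are hypothetical.

```python
def mt_intervals_from_blast_tab(lines, min_bitscore=0.0):
    """Collect (contig_id, start, end) on the mt genome from tabular BLAST hits.

    Assumes 12 tab-separated columns:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    with the mt genome as the query sequence.
    """
    intervals = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # not a standard 12-column hit line
        qstart, qend = int(fields[6]), int(fields[7])
        bitscore = float(fields[11])
        if bitscore < min_bitscore:
            continue  # drop weak, short matches
        intervals.append((fields[1], min(qstart, qend), max(qstart, qend)))
    return intervals


if __name__ == "__main__":
    # Made-up hit lines, for illustration only.
    example = [
        "mt\tcontig1\t99.0\t300\t3\t0\t100\t399\t1\t300\t1e-150\t540",
        "mt\tcontig2\t95.0\t40\t2\t0\t500\t539\t80\t41\t1e-05\t50",
    ]
    for hit in mt_intervals_from_blast_tab(example, min_bitscore=100):
        print(hit)
```

Raising `min_bitscore` is one way to leave out the tiny partial matches before plotting. Whether that is appropriate depends on why those contigs match only partially.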
Do you think this is a good approach or do you have any other suggestions?
All guidance is very much appreciated.
Mikael