Hi all,
I am trying to assemble a plant mitochondrial genome. My approach is to extract the mitochondrial reads from whole-genome shotgun (WGS) reads (Illumina HiSeq 2000, paired-end):
1. I downloaded all available plant mitochondrial genomes and indexed them as a reference using BWA.
2. The raw paired-end reads were filtered (adapters and low-quality reads removed) and passed the FastQC checks. The filtered reads were then interleaved using a Perl script and treated as single-end sequences. These single-end sequences were mapped to the mitochondrial reference using BWA.
3. The mapped reads were extracted using the samtools -F 4 option, with output in BAM format.
4. The BAM file was converted to FASTQ format using Picard.
5. Before doing the de novo assembly, I ran FastQC on the extracted reads, and they failed the following modules:
(i) FAIL - Per sequence GC content
(ii) FAIL - Sequence Duplication Levels
(iii) FAIL - Overrepresented sequences
(iv) FAIL - Kmer Content
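To make step 3 concrete, here is a minimal Python sketch of what I understand the -F 4 filter to do: it drops every alignment whose SAM FLAG has the 0x4 (read unmapped) bit set. The function names and the toy SAM records are made up for illustration and are not my real data. Because the interleaved reads were mapped as single-end, each mate is judged on its own, so one mate of a pair can be kept while the other is discarded.

```python
# Minimal sketch of the filter applied by `samtools view -F 4`:
# keep alignments whose FLAG does NOT have the 0x4 (read unmapped) bit set.
UNMAPPED = 0x4  # SAM FLAG bit meaning "segment unmapped"


def is_mapped(flag: int) -> bool:
    """True if the unmapped bit is not set in the SAM FLAG."""
    return (flag & UNMAPPED) == 0


def mapped_read_names(sam_lines):
    """Return the names of mapped reads from SAM text lines (headers skipped)."""
    names = []
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        if is_mapped(flag):
            names.append(name)
    return names


# Hypothetical toy records (assumption: mates named with /1 and /2 suffixes)
example = [
    "@SQ\tSN:mito_ref\tLN:1000",
    "read1/1\t0\tmito_ref\t10\t60\t5M\t*\t0\t0\tACGTA\tIIIII",
    "read1/2\t4\t*\t0\t0\t*\t*\t0\t0\tTTTTT\tIIIII",
    "read2/1\t16\tmito_ref\t50\t60\t5M\t*\t0\t0\tGGGCC\tIIIII",
]

print(mapped_read_names(example))  # ['read1/1', 'read2/1']
```

In this toy case read1/2 is lost even though its mate read1/1 mapped, so the FASTQ produced in step 4 may no longer contain complete pairs.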
My questions:
(i) How can I improve the reads before de novo assembly of the mitochondrial reads?
(ii) Which tool is better for assembling a mitochondrial genome, Velvet or SOAPdenovo? What k-mer size should I use?