Old 10-06-2015, 05:24 AM   #1
Location: UK

Join Date: Dec 2014
Posts: 33
Taxonomic Analysis of MiSeq Data

Hello everyone,

I recently completed a MiSeq run with 63 multiplexed samples.

I am currently trying to analyse this data for small viral genomes obtained from chicken gut homogenates; the expected viruses are around 7-10 kb in length.

With smaller datasets, my usual route is to run my contig files against a local BLAST nr/nt database and load the resulting .xml file into MEGAN for taxonomic analysis.

Unfortunately, the MiSeq dataset is so large that going this route sample by sample is very time-consuming, especially since the computer we are using is a tad underpowered.

I was hopeful when I saw all the apps available on BaseSpace, but there does not seem to be an appropriate viral categorisation or taxonomic tool. I tried Kraken, but it gave very little viral output compared with a BLAST and MEGAN analysis of the same sample.

I was wondering if anyone had any tips for this type of analysis.

Thanks for the help.
GSviral
Old 01-11-2016, 04:15 PM   #2
Junior Member
Location: Japan

Join Date: Jul 2015
Posts: 4

Hi GSviral,

Have you solved the problem of BLASTing large datasets?
I am also trying to detect viruses in clinical samples such as frozen plasma (sequenced on a HiSeq 2000).
I also tried Kraken, but it missed a lot of reads that BLAST maps to viruses.
BLAST is more accurate and tolerates long gaps, but it is very slow.

To handle a large amount of data (FASTA files with several million reads),
I first removed the bulk of the host genome reads with Bowtie2 (it is ultrafast and tolerates long gaps such as splice junctions).
Then I split the FASTA files into chunks of 1,000,000 reads per file and ran BLAST+ in parallel.
It still takes 1-2 days on a supercomputer (4 CPU cores and 32 GB of memory per job), and about a week on a workstation ...
But I think BLAST is the right tool for you to correctly identify multiple viruses in your samples.
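
The split-and-BLAST step can be sketched roughly as below. This is only a minimal illustration, not the exact script used: the function names (`read_fasta`, `split_fasta`, `blastn_command`), the chunk file naming, and the choice of the `nt` database with XML output (`-outfmt 5`, for loading into MEGAN) are all assumptions. The host-removal step with Bowtie2 is not covered here, and the BLAST commands are only built, not executed.

```python
from itertools import islice
from pathlib import Path


def read_fasta(path):
    """Yield (header, sequence) tuples from a FASTA file."""
    header, seq = None, []
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(seq)
                header, seq = line, []
            elif line:
                seq.append(line)
    if header is not None:
        yield header, "".join(seq)


def split_fasta(path, out_dir, reads_per_file=1_000_000):
    """Write chunks of at most reads_per_file records and return their paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    records = read_fasta(path)
    chunks = []
    while True:
        batch = list(islice(records, reads_per_file))
        if not batch:
            break
        # Hypothetical naming scheme: chunk_0000.fasta, chunk_0001.fasta, ...
        chunk = out_dir / f"chunk_{len(chunks):04d}.fasta"
        with open(chunk, "w") as out:
            for header, seq in batch:
                out.write(f"{header}\n{seq}\n")
        chunks.append(chunk)
    return chunks


def blastn_command(chunk, db="nt", threads=4):
    """Build (but do not run) a blastn command line for one chunk."""
    return [
        "blastn",
        "-query", str(chunk),
        "-db", db,                      # assumed: a local nt database
        "-outfmt", "5",                 # XML output, which MEGAN can import
        "-num_threads", str(threads),   # 4 cores per job, as in the post
        "-out", str(Path(chunk).with_suffix(".xml")),
    ]
```

Each command list can then be passed to `subprocess.run` or written into a cluster job script, one job per chunk, so the chunks are searched in parallel.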
javauma

All times are GMT -8.