  • Taxonomic Analysis of MiSeq Data

    Hello everyone,

    I recently completed a multiplexed MiSeq run of 63 samples.

    I am currently attempting to analyse this data for small viral genomes obtained from chicken gut homogenates, with the expected viruses being around 7kb - 10kb in length.

    Normally, my go-to route with smaller datasets is to run my contig files against a local BLAST nr/nt database and load the resulting .xml file into MEGAN for taxonomic analysis.

    Unfortunately, because the MiSeq dataset is so large, going this route sample by sample is very time-consuming, especially since the computer we are using is a tad underpowered.

    I was hopeful when I saw all the apps available on BaseSpace; however, there does not seem to be an appropriate viral categorisation or taxonomic tool. I tried Kraken, but it returned very little viral output compared with a BLAST and MEGAN analysis of the same sample.

    I was wondering if anyone had any tips for this type of analysis.

    Thanks for the help.

  • #2
    Hi GSviral,

    Have you solved the problem of BLASTing such a large dataset?
    I am also trying to detect viruses in clinical samples such as frozen plasma (sequenced on a HiSeq 2000).
    I also tried Kraken, but it missed a lot of reads that BLAST does map to viruses.
    BLAST is more accurate and tolerates long gaps, but it is very slow.

    To handle a large amount of data (FASTA files with millions of reads),
    I first removed the large fraction of host genome reads with Bowtie2 (it is ultrafast and supports gapped alignment).
    Then I split the FASTA files into chunks of 1,000,000 reads each and ran BLAST+ on the chunks in parallel.
    It still takes 1-2 days on a supercomputer (4 CPU cores/32 GB memory per job), and would take a week on a workstation ...
    But I think BLAST is the right tool for you to correctly find multiple viruses in your samples.
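
For anyone who wants to try the split-and-BLAST step described above, here is a minimal Python sketch. It is not the exact script used in this thread. The file names, the `chunk_NNNN.fasta` naming, and the helper names `split_fasta` and `blast_command` are hypothetical, and the choice of `blastn` against a database called `nt` is an assumption (swap in whichever BLAST+ program and local database you actually use). The sketch only builds the BLAST+ command lines; how you run them in parallel depends on your scheduler.

```python
from pathlib import Path
from typing import Iterator, List, TextIO


def iter_fasta_records(handle: TextIO) -> Iterator[str]:
    """Yield each FASTA record (header line plus sequence lines) as one string."""
    record: List[str] = []
    for line in handle:
        # A new header closes the record collected so far.
        if line.startswith(">") and record:
            yield "".join(record)
            record = []
        if line.strip():  # skip blank lines
            record.append(line if line.endswith("\n") else line + "\n")
    if record:
        yield "".join(record)


def split_fasta(fasta_path: str, out_dir: str,
                reads_per_file: int = 1_000_000) -> List[Path]:
    """Write the input FASTA into chunk files of at most reads_per_file records."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    chunk_paths: List[Path] = []
    chunk_handle = None
    with open(fasta_path) as handle:
        for index, record in enumerate(iter_fasta_records(handle)):
            if index % reads_per_file == 0:
                # Start a new chunk file every reads_per_file records.
                if chunk_handle is not None:
                    chunk_handle.close()
                chunk_path = out / f"chunk_{len(chunk_paths):04d}.fasta"
                chunk_paths.append(chunk_path)
                chunk_handle = open(chunk_path, "w")
            chunk_handle.write(record)
    if chunk_handle is not None:
        chunk_handle.close()
    return chunk_paths


def blast_command(chunk: Path, db: str = "nt", threads: int = 4) -> List[str]:
    """Build (but do not run) a BLAST+ command for one chunk.

    Assumption: blastn against a local database named "nt".
    -outfmt 5 requests BLAST XML output.
    """
    return [
        "blastn",
        "-query", str(chunk),
        "-db", db,
        "-outfmt", "5",
        "-out", str(chunk.with_suffix(".xml")),
        "-num_threads", str(threads),
    ]
```

Each list returned by `blast_command` can be passed to `subprocess.run` or written into a job script, one job per chunk. The resulting XML files are in the same format as the .xml output mentioned in the first post for MEGAN.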
