  • how to use sam files in MEGAN metagenomics

    hello everybody..

    I have two environmental bacterial samples sequenced on Illumina for metagenomics (approx. 14 million paired-end reads for one dataset and 16 million for the other, ~70 bp read length). Since I knew that the sequences consist only of bacteria, I downloaded all the bacterial sequences from NCBI (14 GB file size) instead of the nr/nt database and started standalone BLAST as suggested in the MEGAN manual. It ran continuously for 9 days and then I had to stop the process, since the BLAST result file was larger than 45 GB. I know this is not a memory issue. Then I did the alignment with bowtie (bowtie-0.12.7), which gave me a SAM alignment file (7 GB; 12 percent of the reads aligned to the reference). I also downloaded the GI-to-NCBI-taxon-ID mapping file from the MEGAN website (the bin file). I then loaded both files (SAM and bin) exactly as described in the manual, but I get no result.

    Can you please help me figure out what I did wrong?

    I appreciate your help

    Christopher

  • #2
    Hi Chris,

    BLAST using Illumina reads is not recommended due to extreme computational challenges. Before getting into your experiment design, can you share what you had intended to achieve for your sequencing project?

    Best regards,
    Douglas

    • #3
      Perhaps run something like QIIME first. It will do 16S identification and will reduce the size of your dataset (as a FASTA file) so you can run it in MEGAN. I assume you're using MEGAN for functional analysis?

      • #4
        MetaPhlAn may be the right tool for this.

        Best regards,
        Douglas

        • #5
          Thanks for the replies.

          Well, I want a complete metagenomic analysis: how many and which species are in the sample, plus their phylogeny. Is that something this program lets me do?

          chris

          • #6
            MetaPhlAn can do that.

            Best regards,
            Douglas

            • #7
              I suspect MEGAN might not be able to parse the taxon IDs from your alignment results because the format is slightly different in the database you're using. You might be able to tweak it to get it working.
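
One quick way to test this suspicion is to look at the reference names in the SAM file. The sketch below is only an illustration, assuming the reference FASTA used NCBI-style headers such as `gi|12345|ref|...` and that the GI-to-taxon mapping is keyed by those GI numbers. The function names are made up for this example and are not part of MEGAN or bowtie.

```python
import re

# Assumption: reference names look like "gi|<number>|ref|<accession>|".
GI_PATTERN = re.compile(r"gi\|(\d+)\|")


def gi_from_sam_line(line):
    """Return the GI number found in the RNAME column of a SAM line, or None."""
    if line.startswith("@"):
        return None  # header line, no alignment
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3 or fields[2] == "*":
        return None  # malformed line or unmapped read
    match = GI_PATTERN.search(fields[2])
    return int(match.group(1)) if match else None


def summarize(lines):
    """Count mapped alignments whose reference name has a GI and those without."""
    with_gi = without_gi = 0
    for line in lines:
        if line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3 or fields[2] == "*":
            continue
        if GI_PATTERN.search(fields[2]):
            with_gi += 1
        else:
            without_gi += 1
    return with_gi, without_gi


if __name__ == "__main__":
    import sys

    # Usage: python check_gi.py alignments.sam
    if len(sys.argv) > 1:
        with open(sys.argv[1]) as handle:
            print(summarize(handle))
```

If most mapped reads fall in the second count, the reference names carry no GI number, which would be consistent with the taxon lookup coming back empty.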

              BLASTX against nr might be doable if you have access to a cluster. I blasted an Illumina dataset about the size of yours by chopping it into little pieces and farming them out to separate nodes. I had to buy more memory to run MEGAN on it, though.
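
The "chopping it into little pieces" step could look roughly like the sketch below. It assumes the reads are already in FASTA format, and the chunk size, file names, and function name are hypothetical choices for illustration. Submitting each chunk to a node depends on the cluster's scheduler and is left out.

```python
def split_fasta(lines, records_per_chunk):
    """Yield lists of FASTA lines, each holding at most records_per_chunk records."""
    chunk, count = [], 0
    for line in lines:
        if line.startswith(">"):
            # A new record starts here; flush the chunk first if it is full.
            if count == records_per_chunk:
                yield chunk
                chunk, count = [], 0
            count += 1
        chunk.append(line)
    if chunk:
        yield chunk


if __name__ == "__main__":
    import sys

    # Usage: python split_fasta.py reads.fasta 100000
    if len(sys.argv) > 2:
        with open(sys.argv[1]) as handle:
            for index, chunk in enumerate(split_fasta(handle, int(sys.argv[2]))):
                # Hypothetical output naming: chunk_0000.fasta, chunk_0001.fasta, ...
                with open(f"chunk_{index:04d}.fasta", "w") as out:
                    out.writelines(chunk)
```

Each chunk file can then be run as an independent BLAST job and the outputs concatenated afterwards.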

              • #8
                Thank you all for your replies.

                I used MetaPhlAn with the marker database they provide and am very happy with the results. But if I want to map against the database that I've downloaded from NCBI, is that possible? As far as I understand, that database comprises ~2800 genome markers, so we might be losing information on genomes that are not currently in that list. I'm sorry if I am completely wrong; I'm a novice and trying to understand it.

                christopher

                • #9
                  Hi Chris,

                  Please read the paper on MetaPhlAn. The authors screened for representative genes in each family/class. If you use a general database, I am not sure whether the results would be useful. I recommend you contact the author(s) to discuss.

                  Best regards,
                  Douglas
