  • ssharma
    Member
    • Oct 2010
    • 19

    Illumina Data Analysis

    Hi All,
    I am a new member of this forum.
    I used to work with 454 data and am now switching to Illumina.
    I am getting around 300 million reads (100 bp) from a metagenomic sample, so I am really confused about how to start my analysis.
    Earlier I used approaches like blastx, but I no longer think that is a good option.
    So I was wondering if anyone has done something like this or has some ideas on it.

    I would really appreciate your help.

    Thanks
    SS
  • adamdeluca
    Member
    • Jul 2010
    • 95

    #2
    Check out MG-RAST.


    • MadsAlbertsen
      Member
      • Aug 2010
      • 26

      #3
      You should check out "A human gut microbial gene catalogue established by metagenomic sequencing" (doi:10.1038/nature08821).

      To my knowledge, this is the only published study that has done large-scale metagenomics using Illumina/short reads.

      rgds
      MA


      • ssharma
        Member
        • Oct 2010
        • 19

        #4
        @adamdeluca, thanks for the suggestion. I have used MG-RAST before for 454 data, but I am not sure how it will handle short reads, or whether it can cope with 300 million of them.

        @MadsAlbertsen, I will definitely read the paper. As far as I know, they used SOAP assembly for their analysis.


        • cliffbeall
          Senior Member
          • Jan 2010
          • 144

          #5
          I have been doing some similar work, though I don't have as much data. One thing we have been doing is finding the 16S and 23S sequences using blat and various rDNA databases. That works pretty well for identifying what is there at the genus level, and there are a large number of such reads, since rDNA is ~0.1% of the genome.
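
To put a rough number on that: if rDNA really is about 0.1% of the genome and reads are sampled more or less uniformly, the 300 million reads mentioned at the top of the thread would include on the order of 300,000 rRNA reads. A minimal sketch of the arithmetic (the function name is hypothetical, and uniform sampling is an assumption that real communities will only approximate):

```python
def expected_rrna_reads(total_reads: int, rrna_fraction: float = 0.001) -> int:
    """Estimate how many reads derive from rDNA, assuming uniform sampling.

    rrna_fraction defaults to 0.001, i.e. the ~0.1% figure quoted in the post.
    """
    return round(total_reads * rrna_fraction)


print(expected_rrna_reads(300_000_000))  # 300000
```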

          I tried assembly with SOAPdenovo. In my case it worked well on a mock community of 10 species but less well on the real sample (which I would expect, given the lack of data). I think the question is how much you can trust the assembly, and how much of it you can confirm.

          I am curious why you think blastx is not a viable approach. Is it a lack of computing resources? I have seen claims of increased speed with different software and hardware, but I am not sure whether anyone here has direct experience.


          • ssharma
            Member
            • Oct 2010
            • 19

            #6
            @cliffbeall,
            Thanks for your input. Actually, finding rRNA is not a problem. I've made a small representative rRNA database and it is doing a pretty good job of removing rRNA (via blast).
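
For anyone wanting to try the same kind of rRNA removal, here is a minimal sketch of the filtering step. It is not the poster's actual script. It assumes the BLAST run against the rRNA database was saved in the standard 12-column tabular format (`-outfmt 6` in BLAST+, `-m 8` in legacy blastall), where column 1 is the query ID and column 11 is the e-value. The function names, the 1e-5 cutoff, and the toy records are all made up for illustration.

```python
def rrna_hit_ids(blast_tab_lines, max_evalue=1e-5):
    """Collect query IDs with at least one hit at or below max_evalue.

    Expects BLAST tabular output: 12 tab-separated columns, with the
    query ID in column 1 and the e-value in column 11.
    The default cutoff is an arbitrary illustrative choice.
    """
    hits = set()
    for line in blast_tab_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        if float(fields[10]) <= max_evalue:
            hits.add(fields[0])
    return hits


def drop_rrna(reads, hit_ids):
    """Keep only (read_id, sequence) pairs whose ID had no rRNA hit."""
    return [(rid, seq) for rid, seq in reads if rid not in hit_ids]


# Toy data, invented for this example only.
blast_out = [
    "read1\t16S_ref\t98.0\t100\t2\t0\t1\t100\t500\t599\t1e-45\t180",
    "read3\t23S_ref\t85.0\t40\t6\t0\t1\t40\t10\t49\t0.5\t30",
]
reads = [("read1", "ACGT"), ("read2", "GGCC"), ("read3", "TTAA")]

kept = drop_rrna(reads, rrna_hit_ids(blast_out))
print([rid for rid, _ in kept])  # ['read2', 'read3']
```

Here read3 is kept because its only hit has an e-value above the cutoff. With hundreds of millions of reads you would stream the FASTQ file rather than hold the reads in a list, but the set lookup stays the same.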
            I even tried assembly (Velvet), but I don't trust it that much with such diverse environmental data. I will certainly give SOAPdenovo a shot, though (I've heard a lot about it).
            Yes, you are right: blastx is not viable because of the large amount of data. I have computing resources, but blasting around 300 million reads will still take quite a long time.
            I am still working on finding the best procedure (most people voted for assembly).
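
The runtime concern about blastx is easy to put in back-of-envelope form. The sketch below is only that: the function name is hypothetical, and the throughput of 1 read per core-second and the 100 cores are placeholder assumptions, not benchmarks. Real blastx speed depends heavily on the database size, read length, and settings, so substitute a rate measured on a small subset of your own reads.

```python
def blast_wall_clock_days(n_reads, reads_per_core_second, n_cores):
    """Rough wall-clock estimate, assuming perfect scaling across cores."""
    seconds = n_reads / (reads_per_core_second * n_cores)
    return seconds / 86_400  # seconds per day


# Placeholder inputs: 300 million reads, 1 read/core-second, 100 cores.
days = blast_wall_clock_days(300_000_000, reads_per_core_second=1.0, n_cores=100)
print(f"{days:.1f} days")  # 34.7 days
```

Under those assumed numbers the search takes about five weeks, which shows why a faster search tool or a reduction in the number of queries (for example by assembling first) looks attractive at this scale.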


            • MadsAlbertsen
              Member
              • Aug 2010
              • 26

              #7
              Have you considered using the USEARCH package instead of BLAST? It might make a large-scale database search feasible.

              rgds
              MA


              • ssharma
                Member
                • Oct 2010
                • 19

                #8
                @MA,
                Yes, I considered using Ublastx, but the 64-bit version requires a paid license, so it would be expensive to install on the clusters.
