  • RDP: confirming V3-V4 region

    Hi there,
    I'm working on an RDP training set for the genera we are interested in. For that I downloaded sequences from link.
    Next, I want to confirm that these sequences really contain V3-V4, since the data I receive are for the V3-V4 region. I checked this paper and took the E. coli sequence used in it: J01695.2. Next, I extracted the 16S region from it using an online rRNA-region tool. I then took the V3-V4 region as defined in this paper.

    So far I have all the data and sequences in hand. When I BLAST them against this V3-V4 region, I get no hits.
    When I BLAST this V3-V4 region against the NCBI database, I get hits from E. coli only.

    How can I verify whether the sequences I have contain V3-V4? My understanding is (or was) that the E. coli region is present in other genera as well, or at least that is how the paper uses it.

    I'm unable to move forward with the BLAST approach. BLAST would have given me quantitative evidence as to which sequences are good enough to keep and which are not.

    Has anyone come across a similar situation? How did you solve it?
    Looking for some pointers.
    Last edited by bio_informatics; 09-07-2016, 06:13 AM.
    Bioinformaticscally calm

  • #2
    What are you trying to do? Create a custom database? For what purpose? It's hard to answer your question without knowing why you are trying to do something.
    Microbial ecologist, running a sequencing core. I have lots of strong opinions on how to survey communities, pretty sure some are even correct.

    • #3
      Originally posted by thermophile
      What are you trying to do? Create a custom database? For what purpose? It's hard to answer your question without knowing why you are trying to do something.
      Hi,

      Thanks for your reply.
      I'm trying to confirm whether the sequences I downloaded contain the V3-V4 region of the 16S rRNA gene.
      I'm creating a custom database to verify that my V3-V4 region is present in the sequences. Sequences that lack it will be removed from the custom RDP training set.

      I got past the no-hit problem: I used dc-megablast and set perc_identity to 20.

      I have a follow-up question on the BLAST results:

      1) I want to limit the number of hits per query. For example, Seq1 in the query should have at most 4 hits in the database and no more after that.

      Based on that need, I'm going with:

      blastn -query barcode.fasta -db complete_barcodes.txt -task dc-megablast -out full_blast_p20_max_target_seqs_4.txt -outfmt "6" -perc_identity 20 -max_target_seqs 4
      I've been through:





      Thank you again for your reply.
      Last edited by bio_informatics; 09-08-2016, 06:41 AM.
      Bioinformaticscally calm
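
The per-query cap asked about in #3 can also be applied after the search, on the tabular file written by `-outfmt "6"`. The sketch below is only an illustration and not part of the thread: it assumes the default 12-column layout of that format, and the function name `top_hits_per_query` is made up for this example. One subject can appear on several lines (one per HSP), so the sketch first keeps the best-scoring line per subject and then the best few subjects per query.

```python
from collections import defaultdict

# Default column order of BLAST+ tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def top_hits_per_query(lines, max_hits=4):
    """Keep the max_hits best-scoring distinct subjects for every query."""
    best = defaultdict(dict)  # query id -> {subject id: best-scoring row}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        row = dict(zip(COLUMNS, line.split("\t")))
        row["bitscore"] = float(row["bitscore"])
        per_subject = best[row["qseqid"]]
        previous = per_subject.get(row["sseqid"])
        if previous is None or row["bitscore"] > previous["bitscore"]:
            per_subject[row["sseqid"]] = row
    result = {}
    for query, subjects in best.items():
        ranked = sorted(subjects.values(),
                        key=lambda r: r["bitscore"], reverse=True)
        result[query] = ranked[:max_hits]
    return result
```

For the output file named in the command above, this could be called as `top_hits_per_query(open("full_blast_p20_max_target_seqs_4.txt"))`.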

      • #4
        I wouldn't use blast for that. Align the sequences to a reference alignment (I like SILVA), then trim based on whatever primers you use to define v3v4

        Or if you really want to use blast, convert your downloaded RDP sequences into a blast db and query your v3v4 sequences against that database.
        Microbial ecologist, running a sequencing core. I have lots of strong opinions on how to survey communities, pretty sure some are even correct.
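
One way to picture the primer-based check suggested in #4, without a full reference alignment, is to search each sequence for the forward primer and for the reverse complement of the reverse primer, allowing a few mismatches. This is a rough sketch under assumptions: the thread never names its primers, so the pair mentioned in the usage note is only a commonly used V3-V4 pair, and `best_primer_site`, `find_region` and the mismatch limit are hypothetical. It is an ungapped scan, not the alignment-and-trim workflow itself.

```python
# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse-complement a sequence that may contain IUPAC codes."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def best_primer_site(seq, primer):
    """Return (start, mismatches) of the best ungapped placement of primer
    in seq, or None if seq is shorter than the primer."""
    seq = seq.upper().replace("U", "T")
    primer = primer.upper()
    best = None
    for start in range(len(seq) - len(primer) + 1):
        window = seq[start:start + len(primer)]
        mismatches = sum(base not in IUPAC[code]
                         for base, code in zip(window, primer))
        if best is None or mismatches < best[1]:
            best = (start, mismatches)
    return best


def find_region(seq, forward, reverse, max_mismatches=2):
    """Return (start, end, fwd_mismatches, rev_mismatches) of the amplicon
    bounded by the two primers, or None if either primer site is missing."""
    fwd = best_primer_site(seq, forward)
    rev = best_primer_site(seq, reverse_complement(reverse))
    if fwd is None or rev is None:
        return None
    if fwd[1] > max_mismatches or rev[1] > max_mismatches:
        return None
    if rev[0] <= fwd[0]:
        return None  # reverse site must lie downstream of the forward site
    return (fwd[0], rev[0] + len(reverse), fwd[1], rev[1])
```

With an assumed primer pair such as 341F `CCTACGGGNGGCWGCAG` and 805R `GACTACHVGGGTATCTAATCC`, `find_region(seq, fwd, rev)` returns the coordinates of the region plus the mismatch count at each primer site, which gives a per-sequence number to filter on.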

        • #5
          Thank you for your reply. I am unable to follow the workflow below:
          Originally posted by thermophile
          I wouldn't use blast for that. Align the sequences to a reference alignment (I like SILVA), then trim based on whatever primers you use to define v3v4
          1- Would I get any quantitative evidence?

          Originally posted by thermophile
          Or if you really want to use blast, convert your downloaded RDP sequences into a blast db and query your v3v4 sequences against that database.
          Yes, this is exactly what I'm doing: BLASTing the V3-V4 region against my sequences.
          Last edited by bio_informatics; 09-08-2016, 07:43 AM.
          Bioinformaticscally calm
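
On the question of quantitative evidence from the BLAST route in #5: percent identity alone does not show that a subject contains the whole region, because a short local match can score a high identity. A possible complement is query coverage, the fraction of the V3-V4 query spanned by a hit. The sketch below assumes default `-outfmt 6` columns and a dictionary of query lengths supplied by the caller; the function name and the 0.9 cutoff are hypothetical choices for illustration.

```python
def covering_subjects(lines, query_lengths, min_coverage=0.9):
    """Map (query, subject) -> (coverage, percent identity) for the widest
    HSP that spans at least min_coverage of the query."""
    kept = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # not a default 12-column tabular line
        qseqid, sseqid = fields[0], fields[1]
        pident = float(fields[2])
        qstart, qend = int(fields[6]), int(fields[7])
        # Fraction of the query covered by this single HSP.
        coverage = (abs(qend - qstart) + 1) / query_lengths[qseqid]
        if coverage < min_coverage:
            continue
        key = (qseqid, sseqid)
        if key not in kept or coverage > kept[key][0]:
            kept[key] = (coverage, pident)
    return kept
```

Subjects that never appear in the returned dictionary are candidates for removal from the training set, and the coverage and identity values can be reported for the ones that stay.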

          • #6
            Originally posted by bio_informatics
            I got over this: I used dc-megablast and set perc_identity to 20
            Um, what? I've never used dc-megablast, but at that setting you would expect everything to match everything, no?
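
The point in #6 can be illustrated with a back-of-the-envelope check: two unrelated DNA sequences with uniform base composition agree at roughly one position in four by chance, so an identity threshold of 20 sits below what random sequence already reaches and filters out essentially nothing. The sketch below is only that illustration. It compares ungapped random strings and does not model how BLAST seeds, extends, or scores alignments; hits in a real search are still limited by seeding and the E-value cutoff.

```python
import random


def ungapped_identity(a, b):
    """Percent of positions at which two sequences carry the same base."""
    length = min(len(a), len(b))
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / length


def random_dna(length, rng):
    """Random DNA string with equal base frequencies."""
    return "".join(rng.choice("ACGT") for _ in range(length))


rng = random.Random(0)  # fixed seed so the run is repeatable
identity = ungapped_identity(random_dna(10000, rng), random_dna(10000, rng))
print(f"{identity:.1f}% identity between two unrelated random sequences")
```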

            • #7
              Originally posted by Brian Bushnell
              Um, what? I've never used dc-megablast, but at that setting you would expect everything to match everything, no?
              Hi Brian,
              Yes, I am looking for whatever matches.
              Bioinformaticscally calm
