  • RDP: confirming V3-V4 region

    Hi there,
    I'm building an RDP training set for the genera we are interested in. For that, I downloaded sequences from link .
    Next, I want to confirm that these sequences really contain the V3-V4 region, since the data I receive cover V3-V4. I checked this paper and took the E. coli sequence it uses: J01695.2. I then extracted the 16S region from it using an online rRNA tool, and took the V3-V4 region as defined in this paper.

    So far, I have all the data and sequences in hand. When I BLAST them against this V3-V4 region, I get no hits.
    When I BLAST this V3-V4 region against the NCBI database, I get hits from E. coli only.

    How can I verify that the sequences I have contain V3-V4? My understanding is (or was) that the E. coli region is present in other genera as well, or at least that is how the paper uses it.

    I'm unable to proceed with the BLAST approach, which would have given me quantitative evidence of which sequences are good enough to keep.

    Has anyone come across similar situation? How did you solve it?
    Looking for some pointers.
    Last edited by bio_informatics; 09-07-2016, 06:13 AM.
    Bioinformaticscally calm

  • #2
    What are you trying to do? Create a custom database? For what purpose? It's hard to answer your question without knowing why you are trying to do something.
    Microbial ecologist, running a sequencing core. I have lots of strong opinions on how to survey communities, pretty sure some are even correct.

    • #3
      Originally posted by thermophile View Post
      What are you trying to do? Create a custom database? For what purpose? It's hard to answer your question without knowing why you are trying to do something.
      Hi,

      Thanks for your reply.
      I'm trying to confirm whether the sequences I downloaded contain the V3-V4 region of the 16S rRNA gene.
      The custom database is for checking whether my V3-V4 region is present in those sequences. If it is not, I will remove those sequences from the custom RDP training set.

      I got over this: I used dc-megablast and set perc_identity to 20

      I have a follow-up question on the BLAST results:

      1) I want to limit the number of hits per query. For example, Seq1 in the query should get at most 4 hits in the database and no more.

      Based on my needs, I'm going with:

      blastn -query barcode.fasta -db complete_barcodes.txt -task dc-megablast -out full_blast_p20_max_target_seqs_4.txt -outfmt "6" -perc_identity 20 -max_target_seqs 4
      I've been through:





      Thank you again for your reply.
      Last edited by bio_informatics; 09-08-2016, 06:41 AM.
      Bioinformaticscally calm
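A note on the hit limit asked about above: `-max_target_seqs` caps the number of subject sequences kept per query, but a single subject can still contribute several HSP lines in tabular output, and the subjects kept are not guaranteed to be the best-scoring ones. A minimal sketch of post-filtering the tabular file in Python follows. It assumes the default 12-column `-outfmt 6` layout (bitscore in the last column); the function name `top_subjects_per_query` is hypothetical.

```python
from collections import defaultdict


def top_subjects_per_query(lines, max_subjects=4):
    """Keep the best HSP per (query, subject) pair, then the top subjects per query.

    Assumes default BLAST tabular columns:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    best = defaultdict(dict)  # query -> {subject: (bitscore, fields)}
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        previous = best[query].get(subject)
        if previous is None or bitscore > previous[0]:
            best[query][subject] = (bitscore, fields)

    result = {}
    for query, subjects in best.items():
        # Rank subjects by their best bitscore, highest first
        ranked = sorted(subjects.values(), key=lambda item: item[0], reverse=True)
        result[query] = [fields for _, fields in ranked[:max_subjects]]
    return result


if __name__ == "__main__":
    # File name taken from the command in the post above
    with open("full_blast_p20_max_target_seqs_4.txt") as handle:
        for query, hits in top_subjects_per_query(handle, max_subjects=4).items():
            print(query, [h[1] for h in hits])
```

If one line per subject is all that is needed, adding `-max_hsps 1` to the blastn command is another option.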

      • #4
        I wouldn't use blast for that. Align the sequences to a reference alignment (I like SILVA), then trim based on whatever primers you use to define v3v4

        Or if you really want to use blast, convert your downloaded RDP sequences into a blast db and query your v3v4 sequences against that database.
        Microbial ecologist, running a sequencing core. I have lots of strong opinions on how to survey communities, pretty sure some are even correct.
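The primer-based idea above can be approximated without an alignment. The sketch below is a simpler substitute for aligning to SILVA and trimming: it looks for exact matches to the forward primer and to the reverse complement of the reverse primer, treating IUPAC degenerate bases as wildcards. The thread does not say which primers are in use, so the 341F/805R pair shown here is an assumption; substitute your own. The function names are hypothetical, and no mismatches are tolerated.

```python
import re

# IUPAC nucleotide codes expressed as regex character classes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")

# Assumed primer pair (commonly used for V3-V4); replace with the primers you actually use
FWD_341F = "CCTACGGGNGGCWGCAG"
REV_805R = "GACTACHVGGGTATCTAATCC"


def revcomp(seq):
    """Reverse complement, preserving IUPAC degenerate codes."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def to_regex(primer):
    """Turn a degenerate primer into a regular expression."""
    return "".join(IUPAC[base] for base in primer.upper())


def find_amplicon(seq, fwd=FWD_341F, rev=REV_805R):
    """Return (start, end) of the primer-to-primer region, or None if a primer site is missing."""
    seq = seq.upper().replace("U", "T")
    fwd_hit = re.search(to_regex(fwd), seq)
    if fwd_hit is None:
        return None
    # The reverse primer binds the opposite strand, so search for its reverse complement
    rev_site = re.compile(to_regex(revcomp(rev)))
    rev_hit = rev_site.search(seq, fwd_hit.end())
    if rev_hit is None:
        return None
    return fwd_hit.start(), rev_hit.end()
```

Sequences for which `find_amplicon` returns `None` either lack the region or differ from the primers at one or more positions, so they deserve a second look rather than automatic removal.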

        • #5
          Thank you for your reply. I am unable to follow the workflow below:
          Originally posted by thermophile View Post
          I wouldn't use blast for that. Align the sequences to a reference alignment (I like SILVA), then trim based on whatever primers you use to define v3v4
          1- Would I get any quantitative evidence?

          Originally posted by thermophile View Post
          Or if you really want to use blast, convert your downloaded RDP sequences into a blast db and query your v3v4 sequences against that database.
          Yes, this is exactly what I'm doing: BLASTing the V3-V4 region against my sequences.
          Last edited by bio_informatics; 09-08-2016, 07:43 AM.
          Bioinformaticscally calm
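On the question of quantitative evidence from the BLAST route: one possible measure is the fraction of the V3-V4 query that each database sequence covers. The sketch below assumes a single query, default 12-column `-outfmt 6` output, and a known query length; `query_coverage` is a hypothetical helper, not part of BLAST.

```python
from collections import defaultdict


def query_coverage(lines, query_len):
    """Return {subject: fraction of the query covered by that subject's HSPs}.

    Assumes all lines belong to one query and use the default tabular columns,
    where columns 7 and 8 are qstart and qend (1-based, inclusive).
    """
    intervals = defaultdict(list)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip blank or malformed lines
        qstart, qend = sorted((int(fields[6]), int(fields[7])))
        intervals[fields[1]].append((qstart, qend))

    coverage = {}
    for subject, spans in intervals.items():
        spans.sort()
        covered = 0
        cur_start, cur_end = spans[0]
        for start, end in spans[1:]:
            if start <= cur_end + 1:
                # Overlapping or adjacent HSPs: extend the current block
                cur_end = max(cur_end, end)
            else:
                covered += cur_end - cur_start + 1
                cur_start, cur_end = start, end
        covered += cur_end - cur_start + 1
        coverage[subject] = covered / query_len
    return coverage
```

A subject with coverage close to 1.0 spans essentially the whole V3-V4 query, while a low value suggests only a partial match. Where to draw the cutoff is a judgment call.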

          • #6
            Originally posted by bio_informatics View Post
            I got over this: I used dc-megablast and set perc_identity to 20
            Um, what? I've never used dc-megablast, but at that setting you would expect everything to match everything, no?

            • #7
              Originally posted by Brian Bushnell View Post
              Um, what? I've never used dc-megablast, but at that setting you would expect everything to match everything, no?
              Hi Brian,
              Yes, I am looking for whatever matches.
              Bioinformaticscally calm
