  • How to find primers using biopieces?

    Hi, I'm not much of a bioinformatician, and I am trying to learn how to use biopieces, which our real bioinformatician left behind on our server. Mostly it seems straightforward, but I am stumped by how to generate a sequence file containing only the reads (forward or reverse) that contain a given primer sequence. This matters because multiple amplicons were mixed together and co-sequenced in lanes of the 454 run, and these amplicons cannot be separated by barcodes; they have to be separated based on the primer sequences.

    So after I use read_sff, should I look for the primer pattern using patscan_seq? If so, how do I generate a FASTQ file containing only the sequences that contain the primers of interest? Do I use write_fastq, and will it collect all the sequences identified by patscan_seq if I pipe them together? I'm a little unsure of how biopieces works.

    Thanks for your help,

    Liz

  • #2
    Try something like this:

    Code:
    read_sff -ci data_in.sff |
    patscan_seq -ip 'ATGATCAT[2,1,1]' |
    grab -p MATCH |
    write_fastq -o data_out.fq -x
    This will read in the SFF data from a file called data_in.sff and clip the sequences according to the clipping information in the file.

    Next, patscan_seq will find the primer sequence, allowing for 2 mismatches, 1 insertion, and 1 deletion, and report the result per sequence record (the -i switch). You will need to experiment with the primer length and the mismatch/indel settings to get the best result. I would start with 14-16 bases from the 3'-end of the primer. Remember that patscan_seq is aware of ambiguity codes.
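One way to pull out the 3'-end of a primer for this kind of trial run, as a minimal shell sketch. The primer sequence and the helper name `three_prime_end` are made up for illustration and are not part of biopieces; the primer is assumed to be written 5' to 3', so the 3'-end is the right-hand end of the string.

```shell
# Hypothetical helper (not part of biopieces): print the last N bases,
# i.e. the 3'-end, of a primer written 5'->3'.
three_prime_end() {
    # $1 = primer sequence, $2 = number of bases to keep
    printf '%s' "$1" | tail -c "$2"
}

primer="ACGTACGTACGTACGTACGT"          # made-up 20-base primer
core=$(three_prime_end "$primer" 15)   # keep the 15 bases at the 3'-end
echo "$core"
```

The resulting string can then be used as the pattern given to patscan_seq in place of the full primer.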

    Next we grab the records where the primer was found.

    Finally, you write the output to a FASTQ file.

    You need to do this for each primer.
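Repeating the pipeline by hand for every primer gets tedious, so here is a rough sketch of a loop that generates one command line per primer. The primer names and sequences, the `build_pipeline` helper, and the per-primer output file names are all assumptions for illustration; the biopieces commands and switches are copied from the example above. The pattern is wrapped in single quotes so the shell does not treat the square brackets as a filename wildcard.

```shell
# Hypothetical primer list as name:sequence pairs; replace with your own.
primers="ampA:ATGATCAT ampB:GGCTAGCA"

# Print (do not run) the pipeline for one primer.
# $1 = primer name, used for the output file; $2 = primer pattern
build_pipeline() {
    printf "read_sff -ci data_in.sff | patscan_seq -ip '%s[2,1,1]' | grab -p MATCH | write_fastq -o %s.fq -x\n" "$2" "$1"
}

for entry in $primers; do
    name=${entry%%:*}      # text before the first colon
    pattern=${entry#*:}    # text after the first colon
    build_pipeline "$name" "$pattern"
done
```

As written, this only prints the commands so they can be checked first. Once they look right, the output can be piped to `sh` to actually run them.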
    Last edited by maasha; 09-28-2012, 09:52 PM. Reason: Missed the grab part
