  • Standalone blast output from paired end reads

    Hi Everyone,

    I have a question, which is probably simple. I'm new to this area, though, so I need a little help.

    I have paired-end reads. I have taken the R1 files (not the R2, I hear I have to reverse complement those), converted them to FASTA, and made a BLAST database. I used tblastn because I have an amino acid query and the paired reads are nucleotides. The output is huge, and I have no idea how to even interpret it. I have a tabular output.
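A minimal sketch of the FASTQ-to-FASTA step described above, assuming plain (uncompressed) FASTQ with exactly four lines per record. The file names and the two toy reads are made up for illustration; real data would replace them.

```shell
# Toy two-record FASTQ file, made up purely for illustration.
printf '@read1/1\nACGTACGT\n+\nIIIIIIII\n@read2/1\nTTGCAAGC\n+\nIIIIIIII\n' > toy_R1.fastq

# Assumes strict 4-line FASTQ records:
#   line 1 = header (starts with @), line 2 = sequence, lines 3-4 = separator and qualities.
# Print the header with ">" in place of "@", then the sequence; skip the other two lines.
awk 'NR % 4 == 1 { print ">" substr($0, 2) }
     NR % 4 == 2 { print }' toy_R1.fastq > toy_R1.fasta
```

Because tblastn translates the nucleotide database in all six reading frames, the same conversion should also work for the R2 file without reverse complementing it first.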

    I'd just like to know if anyone has done a similar project and what to do with the blast outputs. Someone suggested using the CAP3 sequence assembler. The problem for me is working with the blast output and figuring out how to deal with it.

    Thank you!

  • #2
    Originally posted by sp24
    Hi Everyone,
    The output is huge, and I have no idea how to even interpret it. I have a tabular output.
    It lists all your hits. What's there to interpret? You can slim it down easily with simple Unix commands. E.g., assuming you used '-outfmt 6' with its default columns, the 4th column lists the alignment lengths (with tblastn these are counted in amino acids, not bp). To keep only hits with alignments of length ≥ 100, do:

    awk -F'\t' 'int($4)>=100' result.tsv > result_slimmed_down.tsv

    Want to see only the unique sequences that produced these hits?

    cut -f 2 result_slimmed_down.tsv | sort -u > unique_hit_contig_ids.txt

    etc.
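In the same spirit, here is a hedged sketch of filtering on e-value instead of alignment length, which may suit short reads better: a read of roughly 100 nt can only give a tblastn alignment of about 33 residues, so a length cutoff of 100 could discard everything. It assumes the default '-outfmt 6' column order (e-value in column 11, bit score in column 12). The 1e-5 cutoff, the file names, and the three toy hits are assumptions made up for illustration.

```shell
# Toy tabular output in the default -outfmt 6 column order; all values are made up.
# Columns: qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
printf 'protA\tread1\t95.0\t33\t1\t0\t10\t42\t1\t99\t1e-12\t70.1\n'    > toy_result.tsv
printf 'protA\tread2\t60.0\t30\t12\t0\t50\t79\t3\t92\t0.5\t25.0\n'    >> toy_result.tsv
printf 'protA\tread3\t88.0\t32\t4\t0\t100\t131\t2\t97\t3e-08\t55.3\n' >> toy_result.tsv

# Keep hits whose e-value (column 11) is at or below the assumed cutoff of 1e-5.
# Adding 0 forces awk to compare the field as a number rather than as a string.
awk -F'\t' '$11 + 0 <= 1e-5' toy_result.tsv > toy_filtered.tsv

# Sort the surviving hits by bit score (column 12), best first.
sort -t "$(printf '\t')" -k12,12gr toy_filtered.tsv > toy_sorted.tsv
```

The sorted file can then go through the same `cut -f 2 | sort -u` step to collect the IDs of the reads that produced the surviving hits.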


    p.s. I think you should delete your db, go read a few articles on paired-end reads (pay special attention to M&M), process your data properly, and only then build the db and run your blasts.
    Last edited by rhinoceros; 05-07-2013, 11:07 PM.
    savetherhino.org


    • #3
      Thanks for your input. By interpret I just meant that once I'm done blasting, that's not the end of the process. I need to figure out a way to actually get the sequences out, maybe assemble them using cap3.

      I'm not sure if I'll be able to use that first command since I'm only using paired end reads, which are not that long to begin with.

      I did use that second command on my blast output and got the names, which will be useful because I can use the blastdbcmd command to extract the hits.
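The extraction step mentioned above could look roughly like the sketch below. The function name `extract_hits` and its arguments are hypothetical, introduced only for illustration. It assumes the IDs file holds one sequence ID per line and that the database was built with `makeblastdb -parse_seqids`, which is generally needed for looking up entries by their original IDs.

```shell
# Hypothetical helper: the name and argument order are illustrative only.
# Assumes the database was built with makeblastdb -parse_seqids so that
# blastdbcmd can look up entries by sequence ID.
extract_hits() {
    db="$1"    # BLAST database name
    ids="$2"   # file with one sequence ID per line
    out="$3"   # FASTA file to write the matching sequences to
    blastdbcmd -db "$db" -entry_batch "$ids" -out "$out"
}

# Example call (not run here; the database and output names are placeholders):
# extract_hits reads_R1_db unique_hit_contig_ids.txt hit_reads.fasta
```

The resulting FASTA file would be the input for an assembler such as CAP3, as suggested earlier in the thread.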

      Right now I'm just experimenting; I'm sure I will have to delete my database and start over anyway. I just wanted to have something to work with and practice blasting, getting sequences out, and then actually doing something with those sequences. I'm not sure what M&M is, though.
