  • transcriptome -> predicted peptide database

    Can anyone recommend a pipeline for turning transcriptome data into a predicted tryptic peptide database?

    We want to run LC-MS/MS and iTRAQ-MS/MS on our organism, but the only sequence data available for it come from our own 454/Illumina sequencing.

    Cheers,
    Paul

  • #2
    What software are you planning to use to analyze the mass spec data, Mascot, x!Tandem (GPM)? Do you have the transcriptome data assembled into contigs?

    We have used contigs assembled from 454 cDNA data directly in Mascot; you don't need to do anything extra. Just load the FASTA file containing your contigs (as DNA sequence) and Mascot takes care of the rest: translation in all six frames and spectral prediction based on the experimental parameters provided (e.g. digestion method). Of course Mascot is a pricey commercial product, but it is worth it if you are doing lots of proteomics (no, I'm not associated with them). The Global Proteome Machine (GPM) is a free (as in speech and beer) alternative to Mascot. I haven't worked with X!Tandem/GPM for quite some time, but I imagine it could handle this type of reference file as easily as Mascot.
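
For readers who want to see what "translation in all six frames" amounts to, here is a minimal sketch in plain Python. It is not how Mascot or any other search engine does it internally. It assumes the standard genetic code (NCBI translation table 1), and the function names and frame labels (`+1` to `-3`) are made up for illustration. Codons containing ambiguous bases are translated as `X`.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons in TCAG order:
# TTT, TTC, TTA, TTG, TCT, ... This is an assumption; other organisms or
# organelles may need a different table.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translate(dna):
    """Return a dict mapping hypothetical frame labels (+1..+3, -1..-3) to protein strings."""
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    for label, protein in six_frame_translate("ATGGCCTAA").items():
        print(label, protein)
```

Stop codons come out as `*`, so each frame would normally be split at `*` into open reading frames before any digestion step.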



    • #3
      I am not sure exactly what you are looking for. It wouldn't be hard to write code to do the 6-frame translation, but you might need more than that depending on which search algorithm you are using. As kmcarr mentioned, some search algorithms take FASTA files just fine, even DNA sequence. Others require additional files. Mascot is a fine search engine, but if you need a free alternative, OMSSA works quite well and in our hands gives similar results to Mascot. It does require additional files for searching, but they can be generated from a FASTA file. You probably also need to make a concatenated target/decoy database so you can accurately determine the FDR. If you are using OMSSA you could use COMPASS, which can make all of the files required for OMSSA searching, including the target-decoy database. It also has tools for FDR filtering and iTRAQ quantitation. Full disclosure: I was involved in developing COMPASS. If you go that route and need help, send me a message.

      Since you are dealing with transcriptome data, the 6-frame translation approach seems reasonable. But it's definitely a bad idea with whole-genome data: your search space will be large and you will end up getting far fewer IDs at a fixed false discovery rate. Just something to be aware of.
      Doug
      www.sharedproteomics.com
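
The concatenated target/decoy idea from the post above can be sketched roughly as follows. This is not the COMPASS implementation. It assumes decoys are made by reversing each target sequence (one common convention among several), uses a hypothetical `DECOY_` header prefix, and estimates FDR as decoy hits divided by target hits above a score threshold.

```python
DECOY_PREFIX = "DECOY_"  # hypothetical tag; any prefix that is unique to decoys works


def make_decoy(header, sequence, prefix=DECOY_PREFIX):
    """Build a decoy entry by reversing the target sequence (assumed convention)."""
    return prefix + header, sequence[::-1]


def concatenate_target_decoy(records, prefix=DECOY_PREFIX):
    """records: list of (header, sequence). Returns all targets followed by their decoys."""
    combined = list(records)
    combined.extend(make_decoy(h, s, prefix) for h, s in records)
    return combined


def to_fasta(records, width=60):
    """Format (header, sequence) pairs as FASTA text with wrapped sequence lines."""
    lines = []
    for header, sequence in records:
        lines.append(">" + header)
        lines.extend(sequence[i:i + width] for i in range(0, len(sequence), width))
    return "\n".join(lines) + "\n"


def estimate_fdr(hits, threshold, prefix=DECOY_PREFIX):
    """hits: iterable of (protein_id, score), higher score is better.

    Returns decoy hits / target hits among hits scoring at or above threshold.
    """
    passing = [pid for pid, score in hits if score >= threshold]
    decoys = sum(pid.startswith(prefix) for pid in passing)
    targets = len(passing) - decoys
    return decoys / targets if targets else 0.0


if __name__ == "__main__":
    db = concatenate_target_decoy([("contig1_+1", "MAK"), ("contig2_+1", "PEPTIDER")])
    print(to_fasta(db))
```

Because the decoys double the database, a larger target space (such as a 6-frame translated genome) also means more decoy matches by chance, which is why fewer identifications survive at a fixed FDR.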



      • #4
        the emboss suite can do 6-frame translation:



        As well as tryptic digest predictions:



        Both of these programs accept multi-sequence FASTA input. EMBOSS is very easy to install on a Debian/Ubuntu-like system (e.g. install the 'emboss-explorer' package, then visit http://localhost/emboss-explorer/). There are also a few places that host a publicly accessible EMBOSS installation.
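
As a rough illustration of what a tryptic digest prediction does, here is a small Python sketch. It is not the EMBOSS implementation. It assumes the commonly used rule that trypsin cleaves after K or R except when the next residue is P, and the function names and the `missed_cleavages` and `min_length` parameters are hypothetical.

```python
def cleavage_fragments(protein):
    """Split a protein after every K or R that is not followed by P (assumed rule)."""
    fragments, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            fragments.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        fragments.append(protein[start:])  # C-terminal remainder with no K/R at its end
    return fragments


def tryptic_peptides(protein, missed_cleavages=0, min_length=1):
    """Return peptides made of up to missed_cleavages + 1 adjacent fragments."""
    fragments = cleavage_fragments(protein)
    peptides = []
    for i in range(len(fragments)):
        last = min(i + missed_cleavages, len(fragments) - 1)
        for j in range(i, last + 1):
            peptide = "".join(fragments[i:j + 1])
            if len(peptide) >= min_length:
                peptides.append(peptide)
    return peptides


if __name__ == "__main__":
    for peptide in tryptic_peptides("MAKRPLKGGR", missed_cleavages=1):
        print(peptide)
```

When the input comes from a 6-frame translation, each frame would first be split at stop codons so that no predicted peptide spans a stop.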



        • #5
          Hi Seqasaurus,

          What I want to add is that you seem to need to de novo assemble the reads, too. All of your needs can probably be met with publicly available tools, depending on your in-house bioinformatics capabilities and project timeline. Commercial tools can also help if you want results faster, and they usually come with technical support.

          Best regards,
          Douglas
