  • Finding exon-exon junction

    Hi,

    I have a list of around 1000 peptides, and I want to find which of them might have come from an exon-exon junction.

    I shall be thankful if somebody can help me figure this out as I am not sure of a precise way to perform this task.

    Vince

  • #2
    Try Wise2: http://www.ebi.ac.uk/Tools/Wise2/index.html



    • #3
      Another option is to compare them to a database of peptides corresponding to exon-exon junctions. For example, these are made available as part of a recent publication here:

      ALEXA-Seq downloads

      For human genes, the files are available for two versions of the genome here: hg18 and hg19

      Each known or hypothetical junction corresponds to an Ensembl exon.
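When exon annotation is available for a gene, the junction-database idea can be approximated without a prebuilt resource. The sketch below is only an illustration, not the ALEXA-Seq method: the function name `junctions_spanned` and its inputs (a protein sequence plus the coding length of each exon, in nucleotides) are assumptions made for the example. It locates the peptide in the protein and reports which exon-exon boundaries fall inside the coding region that the peptide covers.

```python
from itertools import accumulate


def junctions_spanned(protein, exon_cds_lengths, peptide):
    """Return indices of exon-exon junctions spanned by the peptide.

    protein          -- protein sequence translated from the spliced CDS
    exon_cds_lengths -- coding length (nt) contributed by each exon, in order
    peptide          -- peptide sequence to look up (first occurrence only)

    Returns None if the peptide is not found in the protein.
    """
    start = protein.find(peptide)
    if start == -1:
        return None

    # Peptide residues [start, start + len) cover CDS nucleotides [nt_start, nt_end).
    nt_start = 3 * start
    nt_end = 3 * (start + len(peptide))

    # Cumulative exon lengths give the CDS position of each internal boundary.
    boundaries = list(accumulate(exon_cds_lengths))[:-1]

    # A junction is spanned when its boundary lies strictly inside the peptide's CDS range.
    return [i for i, b in enumerate(boundaries) if nt_start < b < nt_end]


if __name__ == "__main__":
    # Hypothetical 10-residue protein whose CDS is split 10 nt + 20 nt over two exons.
    print(junctions_spanned("MKTAYIAKQR", [10, 20], "TAYI"))
    print(junctions_spanned("MKTAYIAKQR", [10, 20], "AKQR"))
```

A peptide that ends exactly at a boundary is not counted, because it is encoded entirely by one exon.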



      • #4
        Hi,

        Thanks for the information. I forgot to mention, though, that these sequences are from a fungus, Pichia pastoris.

        The genomic information is available at NCBI. I am planning to download the NCBI genome and try the algorithm on it.

        Do you think this is the correct way of doing it for this specific organism?

        Thanks,
        Vince



        • #5
          Yes, the option suggested by liux should work for you then. If you are only concerned with identifying matches to known genes, you could also compare your list of peptides to the known ORFeome of your species (say using blastp). Or perhaps a six-frame translation of the transcriptome (say using tblastn) if you do not want to figure out the actual ORFs.
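To make the six-frame idea concrete, here is a minimal pure-Python sketch of what a translated search does conceptually. It is not how tblastn is implemented (tblastn scores approximate matches, while this looks only for exact ones), and the function names are hypothetical. It assumes the standard genetic code and an uppercase or lowercase A/C/G/T sequence.

```python
BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Reverse-complement an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons; unknown codons (e.g. containing N) become X."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_hits(nucleotide, peptide):
    """Return (strand, frame, residue_offset) for every exact peptide match."""
    nucleotide = nucleotide.upper()
    hits = []
    for strand, seq in (("+", nucleotide), ("-", reverse_complement(nucleotide))):
        for frame in range(3):
            protein = translate(seq[frame:])
            pos = protein.find(peptide)
            while pos != -1:
                hits.append((strand, frame, pos))
                pos = protein.find(peptide, pos + 1)
    return hits


if __name__ == "__main__":
    # Toy transcript: ATG AAA ACC GCT TAA encodes M K T A followed by a stop.
    print(six_frame_hits("ATGAAAACCGCTTAA", "MKTA"))
```

Because the search runs against spliced transcripts, a hit here says where the peptide lies in the transcript but not whether it crosses an exon boundary; that still requires the exon coordinates.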



          • #6
            Hi,

            Thanks for being patient. I am a beginner in bioinformatics analysis.

            All I am concerned with is that I have a set of peptides and I would like to know whether they came from an exon-exon junction, which would mean that a splicing event took place, as the peptides would be covered by more than one gene.

            If I use tblastn for this, can the results distinguish between the peptides that came entirely from one known gene and the ones that came from a junction?

            Liux's reply and yours both seem quite promising, but if BLAST can solve the problem I would prefer that. My mentor did not indicate specific tools for this.

            Thanks again,
            Vince



            • #7
              Originally posted by liux
              I agree with that, Genewise can do that job.



              • #8
                When one says 'splicing event' it is usually understood to mean the joining of exons of a single gene. This is how a pre-messenger RNA becomes a mature messenger RNA. There are certainly peptides that correspond to the junctions of adjacent exons in a gene.

                Based on your last post, it sounds like you are talking about something else because you refer to peptides from "more than one gene". This is an important distinction because it influences the analysis approach you would take. Splicing can occur between different genes by a process called 'trans-splicing' although this is much less understood than constitutive splicing and alternative splicing. You may also be referring to a 'fusion gene'. These occur when the genome itself has been rearranged. For example, if a rearrangement happens and the break point is within an intron you can get a fused gene (some people call them chimeras) where exons from two different genes may get spliced together into a novel fusion transcript. Detecting these is practically an entire field of next generation sequencing analysis. If that is what you are trying to detect, then the analysis approach would have to be altered.

                Perhaps we should back up slightly to understand the goal more clearly. What is the nature of your data? How was it generated?

                I would also suggest that you quickly read up on RNA splicing, trans-splicing, and fusion genes. Which of these (if any) are you interested in?



                • #9
                  @malachig

                  Yes, I should step back a little and try to focus on the goal. I have been asked to find out, from a list of peptides obtained by mass spectrometry, whether these peptides span more than one exon, i.e., whether they come from exon-exon junctions. I may have confused this with alternative splicing.

                  Can you suggest a method just to check which peptides are likely to span more than one exon?



                  • #10
                    Sounds like the original suggestion of wise2 would be the easiest. If you don't like wise2 or would like another option, any gapped aligner that accepts protein sequence should work. For example, Exonerate "will allow introns in the alignment, but also allow frameshifts, and exon phase changes when a codon is split by an intron". Instructions for using Exonerate for this purpose are here. You can obtain the genome sequence from various places, including www.pichiagenome.org and bioinformatics.psb.ugent.be.

                    If the alignment is reported as a single block, then the peptide likely does not span a junction. If you get a nice gapped alignment, and the boundaries look like valid splice sites, then you probably have a junction peptide. There is a caveat though. Gapped aligners require a reasonable amount of sequence on both sides of the junction to create an accurate gapped alignment. If your peptides are very short it may make this task difficult.
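The interpretation rule above can be sketched as a small check on the aligner's output. This is an illustration only: `classify_alignment` is a hypothetical helper, not part of wise2 or Exonerate, and it assumes the alignment blocks have already been parsed into 0-based, half-open forward-strand genome intervals. "Valid splice sites" is narrowed here to the canonical GT...AG intron boundaries, which is a simplification.

```python
def classify_alignment(genome, blocks):
    """Classify a peptide-to-genome alignment from its aligned blocks.

    genome -- genomic sequence (forward strand) as a string
    blocks -- sorted list of (start, end) intervals, 0-based and half-open,
              one per aligned block of the peptide
    """
    if len(blocks) == 1:
        # One contiguous block: the peptide likely lies within a single exon.
        return "single block"

    for (_, left_end), (right_start, _) in zip(blocks, blocks[1:]):
        # The gap between consecutive blocks is the putative intron.
        intron = genome[left_end:right_start].upper()
        canonical = (
            len(intron) >= 4
            and intron.startswith("GT")
            and intron.endswith("AG")
        )
        if not canonical:
            return "gapped, non-canonical boundaries"

    # Every gap looks like a GT...AG intron.
    return "candidate junction"


if __name__ == "__main__":
    # Toy locus: exon "ATGAAA", intron "GTAAGTTTTCAG", exon "ACCGCT".
    toy_genome = "ATGAAAGTAAGTTTTCAGACCGCT"
    print(classify_alignment(toy_genome, [(0, 6), (18, 24)]))
    print(classify_alignment(toy_genome, [(0, 24)]))
```

For alignments on the reverse strand, the same check would need to run on the reverse-complemented sequence, which this sketch does not handle.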

                    Also, if P. pastoris is like other members of the Saccharomycetaceae family (such as baker's yeast), it may have a relatively simple transcriptome. Many genes may consist of a single exon, and only a subset may have multiple exons, so peptides corresponding to junctions might be rare for that reason as well. I'm not familiar with this species. Presumably, pertinent information is readily available in the genome paper for P. pastoris.



                    • #11
                      Just noticed an alternative tool that might serve the same function as wise2 for this problem called 'ProSplign' that has recently become available (manuscript still in preparation according to the website). From the website:

                      ProSplign is a utility for computing the alignment of proteins to genomic nucleotide sequence. This alignment can include eukaryotic splicing. At the heart of the program is a global alignment algorithm that specifically accounts for introns and splice signals. It is due to this algorithm that ProSplign is accurate in determining splice sites and tolerant to sequencing errors.

                      ProSplign uses BLAST hits to identify possible locations of genes and their duplications on genomic sequences and then to speed up the core dynamic programming.

                      ProSplign was developed with the following goals in mind:

                      * Accuracy in determining splice signals
                      * Recognition of short exons and non-consensus splices where feasible
                      * Ability to identify and separate multiple compartments typically representing gene copying events

                      ProSplign is used to compute transcript alignments as a part of the NCBI Genome Annotation Pipeline.

                      Reference: ProSplign - Protein to Genomic Alignment Tool. B. Kiryutin, A. Souvorov, T. Tatusova. Manuscript in preparation

