  • TSS on "-" strand (Ensembl notation)

    Hello,

    I have a silly question about finding a TSS position from the Ensembl annotation provided by BioMart. Please correct me if I am wrong.

    BioMart has a "Gene start" column, which I guess corresponds to the TSS if the gene is on the "+" strand, and a "Gene end" column, which I guess corresponds to the TSS if the gene is on the "-" strand.

    Is that correct?

  • #2
    Yes, it's one of them. Briefly playing with BioMart yielded the outermost TSS for me (gene start/end). I assume you can get all of the TSSs (at least those annotated in Ensembl) through BioMart, but if not, you can just parse the GTF file from Ensembl (http://www.ensembl.org/info/data/ftp/index.html).
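
A minimal sketch of the GTF route mentioned above, assuming a GTF that contains `transcript` feature rows (recent Ensembl releases include them; a file with only exon rows would need transcript bounds derived from the exons instead). The sample lines, IDs and coordinates are made up for illustration, and `tss_per_gene` is a hypothetical helper name. GTF coordinates are 1-based with start <= end, so the TSS is the start column on "+" and the end column on "-".

```python
import re
from collections import defaultdict

# Matches GTF attribute pairs such as: gene_id "ENSG..."
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')

# Made-up sample rows (hypothetical IDs and coordinates, not real Ensembl data).
SAMPLE_GTF = """\
#!genome-build example
1\tdemo\ttranscript\t1000\t5000\t.\t+\t.\tgene_id "GENE_A"; transcript_id "TX_A1";
1\tdemo\ttranscript\t1200\t5000\t.\t+\t.\tgene_id "GENE_A"; transcript_id "TX_A2";
1\tdemo\texon\t1000\t1100\t.\t+\t.\tgene_id "GENE_A"; transcript_id "TX_A1";
2\tdemo\ttranscript\t2000\t8000\t.\t-\t.\tgene_id "GENE_B"; transcript_id "TX_B1";
2\tdemo\ttranscript\t2000\t7500\t.\t-\t.\tgene_id "GENE_B"; transcript_id "TX_B2";
"""


def tss_per_gene(gtf_lines):
    """Map gene_id -> sorted list of distinct TSS positions (1-based)."""
    tss = defaultdict(set)
    for line in gtf_lines:
        # Skip blank lines and header/comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # Only transcript rows carry the full transcript span.
        if len(fields) < 9 or fields[2] != "transcript":
            continue
        start, end, strand = int(fields[3]), int(fields[4]), fields[6]
        attrs = dict(ATTR_RE.findall(fields[8]))
        # "+" strand: TSS is the smaller coordinate; "-" strand: the larger one.
        tss[attrs["gene_id"]].add(start if strand == "+" else end)
    return {gene: sorted(sites) for gene, sites in tss.items()}


if __name__ == "__main__":
    for gene, sites in tss_per_gene(SAMPLE_GTF.splitlines()).items():
        print(gene, sites)
```

For a real file you would pass an open file handle instead of `SAMPLE_GTF.splitlines()`.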


    • #3
      Thank you!
      But what do you mean by "one of them"?

      Basically, what I want to check is whether the TSS of a gene on the "-" strand is marked as "Gene start" or "Gene end".

      And is it the same for "Transcript start" and "Transcript end"?


      • #4
        To put it even more explicitly:

        Is it true that what is marked as "Gene start" is not the TSS, but just the smaller of the two values for the TSS and the 3' gene end? Looking at the values in BioMart, I see that Gene start is always smaller than Gene end (which could not be the case for a gene on the "-" strand if Gene start were the TSS).
        Last edited by rebrendi; 12-20-2011, 02:52 PM.


        • #5
          The only change is that there are often multiple TSSs for a single gene.


          • #6
            Originally posted by dpryan
            The only change is that there are often multiple TSSs for a single gene.

            Can they be found as "Transcript start" and "Transcript end" instead of "Gene start" and "Gene end"?


            • #7
              And I still do not understand what "Gene start" means in this notation. Is it not the real start of the gene? Is it not necessarily the TSS?


              • #8
                Originally posted by rebrendi
                Can they be found as "Transcript start" and "Transcript end" instead of "Gene start" and "Gene end"?

                It appears so. I checked a single gene that I happen to remember and it looks correct. You'll still have to take strand into account (though that's trivial).


                • #9
                  Yes, I just want to be sure that I understand that trivial thing correctly:

                  Is it true that the TSS of the gene on the "-" strand is marked as "Gene end"?


                  • #10
                    Originally posted by rebrendi
                    Is it true that the TSS of the gene on the "-" strand is marked as "Gene end"?

                    It is true that a TSS of a gene on the "-" strand is marked as "Gene end". Look at, for example, Ing2, which has 3 transcription start sites.
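
A minimal sketch of the strand rule confirmed above, applied to a BioMart-style table with one row per transcript. The column headers, the 1 / -1 strand encoding, the IDs and every coordinate here are assumptions for illustration (check them against your own export), and `transcript_tss` / `read_tss` are hypothetical helper names.

```python
import csv
import io

# Hypothetical BioMart-style export; header names and all coordinates
# are assumptions for illustration, not real Ensembl data.
SAMPLE_TSV = (
    "Gene ID\tTranscript ID\tStrand\tTranscript start\tTranscript end\n"
    "GENE_A\tTX_A1\t1\t1000\t5000\n"
    "GENE_B\tTX_B1\t-1\t2000\t8000\n"
    "GENE_B\tTX_B2\t-1\t2000\t7500\n"
)


def transcript_tss(row):
    """Return the TSS of one transcript row, taking strand into account."""
    strand = int(row["Strand"])
    if strand == 1:
        # Forward strand: transcription begins at the smaller coordinate.
        return int(row["Transcript start"])
    if strand == -1:
        # Reverse strand: transcription begins at the larger coordinate.
        return int(row["Transcript end"])
    raise ValueError(f"unexpected strand value: {row['Strand']!r}")


def read_tss(tsv_text):
    """Map transcript ID -> TSS for every row of a tab-separated export."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {row["Transcript ID"]: transcript_tss(row) for row in reader}


if __name__ == "__main__":
    for transcript_id, tss in read_tss(SAMPLE_TSV).items():
        print(transcript_id, tss)
```

In this made-up table the two "-" strand transcripts of the same gene share a start column but have different TSSs, which is why the end column is the one to read on that strand.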


                    • #11
                      thank you, I got the point.


                      • #12
                        Something worth noting: Ensembl coordinate conventions are always 5' to 3' regardless of strand, so for reverse-strand genes the 3'-most coordinate is the first coordinate.

                        Gene start and end represent the outer limits of the gene locus, that is, the 5'-most coordinate of one of its transcripts and the 3'-most coordinate of one of its transcripts.

                        The TSS, as you have already been told, is best defined by the start coordinate of a particular transcript, which is the 5'-most coordinate for a forward-strand gene and the 3'-most coordinate for a reverse-strand gene.
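
To illustrate the "outer limits" point, here is a small sketch that derives gene start and end from a gene's transcripts and picks the outermost TSS. It assumes, as observed earlier in the thread, that start is always the smaller coordinate and end the larger one. The `Transcript` class, the helper names and all coordinates are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transcript:
    name: str
    start: int   # smaller coordinate, as reported in the table
    end: int     # larger coordinate, as reported in the table
    strand: int  # 1 for "+", -1 for "-"


def gene_bounds(transcripts):
    """Gene start/end as the outer limits of all transcripts of the gene."""
    return (min(t.start for t in transcripts),
            max(t.end for t in transcripts))


def outermost_tss(transcripts):
    """The single TSS that coincides with gene start ("+") or gene end ("-")."""
    strands = {t.strand for t in transcripts}
    if len(strands) != 1:
        raise ValueError("all transcripts of a gene should share one strand")
    gene_start, gene_end = gene_bounds(transcripts)
    return gene_start if strands == {1} else gene_end


if __name__ == "__main__":
    # Made-up reverse-strand gene with two transcripts.
    gene_b = [
        Transcript("TX_B1", 2000, 8000, -1),
        Transcript("TX_B2", 2500, 7500, -1),
    ]
    print(gene_bounds(gene_b))    # outer limits of the locus
    print(outermost_tss(gene_b))  # equals gene end on the "-" strand
```

The gene-level columns therefore give at most one of the TSSs; the others only show up in the per-transcript columns.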


                        • #13
                          It appears that the Ensembl annotation specifies "gene start" as the ORF start, not as the TSS. For several manually checked genes, the TSS does not coincide with the ORF start or end.

                          Could someone please comment on how best to find the TSS coordinates?


                          • #14
                            Gene start in Ensembl is the 5'-most coordinate of the transcripts that make up the gene. If none of the transcripts have UTRs, this will be the 5'-most ORF start, though at least in human and mouse it is very rare for transcripts to have no UTR.

                            In Ensembl, to get specific coordinates for a single transcript you should be looking at the transcript table; again, the position in the table is the 5'-most position in the transcript.


                            • #15
                              Laura, the output file that BioMart gives me includes the following potentially interesting columns: "gene start", "transcript start", "gene end", "transcript end". It appears that none of these columns contains the TSS coordinate that I am looking for. Could you please suggest exactly where to look?
