  • Question about BED format - chromStart and End

    I am a bit confused about the chromStart and chromEnd positions in the BED format.

    According to UCSC:
    chromStart - The starting position of the feature in the chromosome or scaffold. The first base in a chromosome is numbered 0.
    chromEnd - The ending position of the feature in the chromosome or scaffold. The chromEnd base is not included in the display of the feature. For example, the first 100 bases of a chromosome are defined as chromStart=0, chromEnd=100, and span the bases numbered 0-99.

    Assuming I download a bed file for a gene from UCSC as below:

    chromStart: 300
    chromEnd: 500

    Now, I get a set of SNPs by mapping reads to hg18 and calling SNPs with some SNP caller. I want to know how many SNPs were called within the above gene. Should I compare each SNP position with the gene range as

    300<=SNP_POSITION<=500

    or

    301<=SNP_POSITION<=500

    or

    301<=SNP_POSITION<=499

    ?

    Does anyone know which is correct?

    Thanks

  • #2
    300 <= x < 500

    So the first base is no. 300, the last base is no. 499, and the range covers 200 bases, just as the doc says. You may imagine the coordinates as marks sitting between the bases.
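
A minimal sketch of this half-open check in Python, assuming the SNP position is already 0-based (a caller that reports 1-based positions would need converting first). The function name `in_bed_interval` is made up for illustration.

```python
def in_bed_interval(pos0, chrom_start, chrom_end):
    """True if a 0-based position lies in the half-open BED range [chrom_start, chrom_end)."""
    return chrom_start <= pos0 < chrom_end

chrom_start, chrom_end = 300, 500
print(in_bed_interval(300, chrom_start, chrom_end))  # first base of the feature
print(in_bed_interval(499, chrom_start, chrom_end))  # last base of the feature
print(in_bed_interval(500, chrom_start, chrom_end))  # chromEnd itself is excluded
print(chrom_end - chrom_start)                       # feature length in bases
```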

    • #3
      301st<=snp_position<=500th

      EDIT:

      BED is always 0-based. The first base in a sequence has coordinate 0 and therefore coordinate 300 denotes the 301st base. A more obvious example is

      0 1

      which denotes the first base.
      Last edited by lh3; 10-25-2010, 04:14 AM.

      • #4
        It depends on how the range is defined: 0-based or 1-based positions? If 1-based, then 301 <= SNP_POSITION <= 500 is the range you had in the bed file.
        Last edited by Hena; 10-25-2010, 03:43 AM.
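
One way to make the 0-based versus 1-based distinction explicit is to convert before comparing. The sketch below assumes the SNP caller reports 1-based positions, which should be checked against the caller's documentation; the function name and the SNP list are hypothetical.

```python
def count_snps_in_feature(snp_positions_1based, chrom_start, chrom_end):
    """Count 1-based SNP positions inside a BED feature (0-based start, exclusive end)."""
    count = 0
    for pos1 in snp_positions_1based:
        pos0 = pos1 - 1  # convert to the 0-based coordinate that BED uses
        if chrom_start <= pos0 < chrom_end:
            count += 1
    return count

snps = [300, 301, 402, 500, 501]  # hypothetical 1-based SNP positions
print(count_snps_in_feature(snps, 300, 500))
```

With 1-based input, the test `chrom_start <= pos1 - 1 < chrom_end` accepts the same positions as 301 <= SNP_POSITION <= 500.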

        • #5
          300 <= x < 500

          • #6
            Originally posted by lh3
            301st<=snp_position<=500th

            EDIT:

            BED is always 0-based. The first base in a sequence has coordinate 0 and therefore coordinate 300 denotes the 301st base. A more obvious example is

            0 1

            which denotes the first base.
            If everything is zero-based, then for example with the range 3 to 5 (instead of 300 to 500) one has:


            0123456789 <-- positions
            ---gg----- <-- gene in range [3,5)


            therefore the correct answer is 300<=x<500!
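
Python string slices happen to use the same zero-based, end-exclusive convention, so the picture above can be reproduced directly. This is only an illustrative sketch; the variable names are made up.

```python
positions = "0123456789"
track = "---gg-----"
start, end = 3, 5  # BED-style chromStart and chromEnd

print(positions[start:end])  # the positions covered by the feature
print(track[start:end])      # the feature itself
print(len(track[start:end]) == end - start)  # length equals chromEnd - chromStart
```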
