
  • create windows10Kb.bed

    Hi,

    I am trying to find a way to create a BED file so that I can compute the coverage of aligned sequences over 10 kilobase "windows" spanning the genome.

    chr1 0 10000
    chr1 10000 20000
    ..... .... ....

    This is the input that coverageBed needs: see http://code.google.com/p/bedtools/wi...ge#coverageBed and click "coveragebed".

    How can I create this file of 10 kb windows? With UCSC? The Table Browser?

    Thank you,

    Sam

  • #2
    up up up



    • #3
      I would like to know the answer to this too. I have always done this by writing a little script, but it would be great if it could be done with a bedtools command. Basically, you just need the length of each chromosome, which I get from the sequence dictionary (e.g. hg19.dict; I think that file comes from Picard CreateSequenceDictionary or samtools dict rather than from bwa index).

      Then I do something like this.

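      awk 'BEGIN { OFS = "\t" }
           /^@SQ/ {
               split($2, chrom, ":"); split($3, ln, ":")   # fields look like SN:<name> and LN:<length>
               len = ln[2] + 0
               # full 10 kb windows first, then one final window up to the chromosome end
               for (start = 0; start + 10000 < len; start += 10000)
                   print chrom[2], start, start + 10000
               print chrom[2], start, len
           }' hg19.dict > hg19.10kbwindows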

      I haven't double-checked that this works correctly, but if it doesn't, something like this should do the trick.
      Doug
      www.sharedproteomics.com
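
The chromosome lengths mentioned above can be pulled into a two-column, tab-separated table (name, length), which is the same layout bedtools genome files use. A minimal sketch, assuming a sequence dictionary whose @SQ lines carry SN: and LN: tags; the function name dict_to_sizes and the file names are made up for illustration:

```shell
# Hypothetical helper (not part of any tool): convert @SQ header lines
# into "chrom<TAB>length". Reads the named files, or stdin if none are given.
dict_to_sizes() {
    awk 'BEGIN { OFS = "\t" }
         /^@SQ/ {
             name = ""; len = ""
             # Look the tags up by prefix rather than by column position.
             for (k = 2; k <= NF; k++) {
                 if ($k ~ /^SN:/) name = substr($k, 4)
                 if ($k ~ /^LN:/) len  = substr($k, 4)
             }
             if (name != "" && len != "") print name, len
         }' "$@"
}

# Usage (placeholder file names):
#   dict_to_sizes hg19.dict > hg19.txt
# If a samtools faidx index is available instead, its first two columns
# hold the same information:
#   cut -f1,2 hg19.fa.fai > hg19.txt
```

Matching on the SN:/LN: prefixes, rather than splitting on ":", should also cope with sequence names that contain a colon.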



      • #4
        Have a look at the "bedtools makewindows" command in bedtools v2.15.0.

        Examples:
        Code:
         # Divide the human genome into windows of 1 Mb:
         $ bedtools makewindows -g hg19.txt -w 1000000
         chr1 0 1000000
         chr1 1000000 2000000
         chr1 2000000 3000000
         chr1 3000000 4000000
         chr1 4000000 5000000
         ...
        
         # Divide the human genome into sliding (=overlapping) windows of 1 Mb, with 500 kb overlap:
         $ bedtools makewindows -g hg19.txt -w 1000000 -s 500000
         chr1 0 1000000
         chr1 500000 1500000
         chr1 1000000 2000000
         chr1 1500000 2500000
         chr1 2000000 3000000
         ...
        
         # Divide each chromosome in the human genome into 1000 windows of equal size:
         $ bedtools makewindows -g hg19.txt -n 1000
         chr1 0 249251
         chr1 249251 498502
         chr1 498502 747753
         chr1 747753 997004
         chr1 997004 1246255
         ...
        
         # Divide each interval in the given BED file into 10 equal-sized windows:
         $ cat input.bed
         chr5 60000 70000
         chr5 73000 90000
         chr5 100000 101000
         $ bedtools makewindows -b input.bed -n 10
         chr5 60000 61000
         chr5 61000 62000
         chr5 62000 63000
         chr5 63000 64000
         chr5 64000 65000
         ...
        
         # Add a name column, based on the window number: 
         $ cat input.bed
         chr5  60000  70000 AAA
         chr5  73000  90000 BBB
         chr5 100000 101000 CCC
         $ bedtools makewindows -b input.bed -n 3 -i winnum
         chr5        60000   63334   1
         chr5        63334   66668   2
         chr5        66668   70000   3
         chr5        73000   78667   1
         chr5        78667   84334   2
         chr5        84334   90000   3
         chr5        100000  100334  1
         chr5        100334  100668  2
         chr5        100668  101000  3
         ...
        
         # Add a name column, based on the source ID + window number: 
         $ cat input.bed
         chr5  60000  70000 AAA
         chr5  73000  90000 BBB
         chr5 100000 101000 CCC
         $ bedtools makewindows -b input.bed -n 3 -i srcwinnum
         chr5        60000   63334   AAA_1
         chr5        63334   66668   AAA_2
         chr5        66668   70000   AAA_3
         chr5        73000   78667   BBB_1
         chr5        78667   84334   BBB_2
         chr5        84334   90000   BBB_3
         chr5        100000  100334  CCC_1
         chr5        100334  100668  CCC_2
         chr5        100668  101000  CCC_3
         ...
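
For anyone stuck on an older bedtools, the -w / -s behaviour in the first two examples can be approximated with awk. This is only a sketch: genome_windows is a made-up name, the input is assumed to be a "chrom<TAB>length" file like hg19.txt, and clipping the last windows at the chromosome end is an assumption that may not match bedtools exactly.

```shell
# Hypothetical helper, not part of bedtools.
# stdin: "chrom<TAB>length" lines; $1 = window size; $2 = step (defaults to $1).
genome_windows() {
    awk -v w="$1" -v s="${2:-$1}" '
        BEGIN {
            OFS = "\t"
            # A non-positive window or step would never terminate, so bail out early.
            if (w <= 0 || s <= 0) { print "window and step must be > 0" > "/dev/stderr"; exit 1 }
        }
        NF >= 2 {
            for (start = 0; start < $2; start += s) {
                end = start + w
                if (end > $2) end = $2      # clip at the chromosome end
                print $1, start, end
            }
        }'
}
```

For the 10 kb case asked about at the top of the thread, `genome_windows 10000 < hg19.txt > windows10Kb.bed` would be the awk route, and `bedtools makewindows -g hg19.txt -w 10000 > windows10Kb.bed` the bedtools one.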



        • #5
          Excellent. Thanks. And thanks for providing the version number (it's not part of BEDTools 2.14.3). Time to upgrade.
          Doug
          www.sharedproteomics.com



          • #6
            bedtools makewindows

            Hello there,

            Part of this example doesn't work for me.
            Basically, makewindows does not recognize either the -b or the -n option.

            BTW, I checked the version of my bedtools and it is 2.15.0.

            Thanks,
            Robert

            Originally posted by quinlana:
            Have a look at the "bedtools makewindows" command in bedtools v2.15.0.
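
If a given build really does reject -b and -n, the per-interval splitting can be sketched in awk as a stopgap. The function name bed_split_n is invented here, and the rounding rule (window size rounded up, last window clipped) is inferred from the example output in post #4 rather than taken from the bedtools source, so treat it as an assumption.

```shell
# Hypothetical stand-in for "bedtools makewindows -b input.bed -n N -i winnum".
# stdin: BED lines (chrom, start, end, ...); $1 = number of windows per interval.
bed_split_n() {
    awk -v n="$1" '
        BEGIN {
            OFS = "\t"
            if (n < 1) { print "n must be >= 1" > "/dev/stderr"; exit 1 }
        }
        NF >= 3 {
            span = $3 - $2
            w = int(span / n)
            if (w * n < span) w++            # round the window size up
            for (i = 1; i <= n; i++) {
                start = $2 + (i - 1) * w
                if (start >= $3) break       # interval shorter than n bases
                end = start + w
                if (end > $3) end = $3       # last window absorbs the remainder
                print $1, start, end, i
            }
        }'
}

# Usage (placeholder file names):
#   bed_split_n 3 < input.bed > input.windows.bed
```

With n = 3 this gives window sizes of 3334, 3334 and 3332 for the 60000-70000 interval, which is what the quoted winnum example shows.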