  • How to split a BED file by chromosome

    Does anyone know of a program that can split a BED file by chromosome? I have generated a BED file that contains data for all chromosomes, but it is not sorted. When I sorted it with BedSort, the output was not in numeric order: chr10 always comes first, followed by chr11 through chr19. It seems I have to sort each chromosome separately, so I am looking for a program that can split a BED file by chromosome. Thanks.

  • #2
    You could try a version sort on your BED file (the V ordering option requires GNU sort):

    Code:
    sort -k 1V,1 -k 2n,2 file.bed -o file.sorted.bed
    If you want to split your BED file by chromosome, you could do it with bash:

    Code:
    mkdir -p split_results
    for chr in $(cut -f 1 file.bed | sort -u); do
        # Exact match on column 1; grep -w could also hit the name in other columns
        awk -v c="$chr" '$1 == c' file.bed > "split_results/$chr.output.bed"
    done
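To see what the version sort above does, here is a minimal sketch on a made-up four-line BED file. The file names and coordinates are hypothetical, and it assumes GNU sort, since the V ordering is a GNU coreutils extension.

```shell
# Work in a scratch directory so no existing files are touched.
tmp=$(mktemp -d)
cd "$tmp" || exit 1

# Hypothetical unsorted BED records (tab-separated: chrom, start, end).
printf 'chr10\t500\t600\nchr2\t300\t400\nchr1\t100\t200\nchr2\t50\t80\n' > demo.bed

# -k1,1V: version sort on the chromosome name, so chr2 comes before chr10.
# -k2,2n: numeric sort on the start coordinate within each chromosome.
sort -k1,1V -k2,2n demo.bed -o demo.sorted.bed

cat demo.sorted.bed
```

With a plain lexical sort, chr10 would come before chr2, which is the ordering problem described in the question.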



    • #3
      An alternative:
      Code:
      awk '$1 != prev {close(f); f = $1 ".bed"; prev = $1} {print >> f}' file.bed



      • #4
        Similar to adamdeluca's suggestion, here is another simple awk solution. Note that ">>" creates (if needed) and appends to files named CHROM.bed, where CHROM is column 1 of the input BED file (in this case, example.bed).

        So, in plain English, the awk command prints each entire line ($0) from example.bed to distinct files that are each named by the chrom field ($1).

        This strategy is useful in many other cases where you want to do a context-based "grep", and route the results to distinct files.

        Code:
        $ awk '{print $0 >> ($1 ".bed")}' example.bed
        
        $ ls -1 *.bed
        chr1.bed
        chr2.bed
        ... (snip)
        chrY.bed
        example.bed
        arq
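As a quick check of the append-based split, here is a small sketch on a made-up example.bed (the records are hypothetical). It runs in a fresh scratch directory because ">>" appends, so rerunning the command next to existing CHROM.bed files would duplicate their lines.

```shell
# Fresh scratch directory: avoids appending to leftover per-chromosome files.
tmp=$(mktemp -d)
cd "$tmp" || exit 1

# Hypothetical unsorted BED records (tab-separated: chrom, start, end).
printf 'chr2\t300\t400\nchr1\t100\t200\nchr2\t50\t80\nchrY\t10\t20\n' > example.bed

# Append every line to a file named after its chromosome (column 1).
# The parentheses keep the file-name concatenation unambiguous across awk implementations.
awk '{ print $0 >> ($1 ".bed") }' example.bed

# List the per-chromosome files alongside the input.
ls -1 *.bed
```

Lines keep their original relative order within each output file, so a per-chromosome sort can follow if needed.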



        • #5
          Thank you!

          Many thanks to you guys! I have worked it out.
