  • Combine annotations

    Hi all. Could you suggest a way to combine annotations that come from different files but relate to the same feature (e.g. the same genomic coordinate)? I am thinking of Annovar (a tool for annotating nucleotide variants, indels, etc.), which produces a lot of files. I usually use Excel to combine all of these annotations into one final document containing all the information.
    This method works, but it takes me a lot of time.
    Thank you!

    Another question: how can I annotate a BED file containing a list of genomic coordinates with the corresponding gene symbols? I tried the UCSC Genome Browser (Tables section), but I could only annotate with UCSC gene names, not with standard names (like HUGO gene symbols). Can BEDTools do it? Thanks!

  • #2
    Annovar has a summarize_annovar.pl script you can take a look at...

    Personally, I combine the annotations using my own scripts. Excel is definitely not the way to do it!! You're just going to end up making mistakes, plus Excel can't handle more than a handful of data sets.


    • #3
      Yes, I know about summarize_annovar, but I could not get it working because of an error. Today I re-downloaded some of the databases, and now it works fine!
      Fortunately I only have a few variants to annotate (we look at just a few genes; it's not a big sequencing project), and I use a macro I wrote that does the work automatically. But combining all the input files for my macro to process takes time, so I was wondering whether you know of other tools that do this. In any case, thank you for the suggestions!

      What about my second question (annotating a BED file with HGNC gene symbols)? :-)



      • #4
        The other option would be to write a script yourself to do it (usually in Perl or Python).
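
As a rough illustration of what such a script could look like, here is a minimal Python sketch. It is not tied to Annovar's actual output format: it assumes each annotation file is tab-delimited with a header row and shares three key columns named chrom, start and end (hypothetical names), and the function name combine_annotations and the gene names in the demo are made up for the example.

```python
import csv
import io

# Assumed key columns shared by every annotation table (hypothetical names).
KEY_COLUMNS = ("chrom", "start", "end")


def combine_annotations(sources):
    """Merge tab-delimited tables that share KEY_COLUMNS.

    `sources` is a list of (label, file-like object) pairs.
    Returns (header, rows); missing annotations are filled with ".".
    """
    merged = {}   # key tuple -> {prefixed column name: value}
    columns = []  # annotation columns in first-seen order
    for label, handle in sources:
        for row in csv.DictReader(handle, delimiter="\t"):
            key = tuple(row[k] for k in KEY_COLUMNS)
            record = merged.setdefault(key, {})
            for name, value in row.items():
                if name in KEY_COLUMNS:
                    continue
                # Prefix with the source label so column names never clash.
                column = f"{label}.{name}"
                if column not in columns:
                    columns.append(column)
                record[column] = value
    header = list(KEY_COLUMNS) + columns
    # Order rows by chromosome name, then numeric start and end.
    ordered = sorted(merged, key=lambda k: (k[0], int(k[1]), int(k[2])))
    rows = [list(key) + [merged[key].get(c, ".") for c in columns]
            for key in ordered]
    return header, rows


if __name__ == "__main__":
    # In-memory stand-ins for two annotation files (placeholder data).
    genes = io.StringIO(
        "chrom\tstart\tend\tgene\nchr1\t100\t101\tGENE_A\nchr2\t500\t501\tGENE_B\n")
    freqs = io.StringIO("chrom\tstart\tend\tfreq\nchr1\t100\t101\t0.01\n")
    header, rows = combine_annotations([("genes", genes), ("freqs", freqs)])
    print("\t".join(header))
    for row in rows:
        print("\t".join(row))
```

With real files you would pass open file handles instead of the io.StringIO objects, and the key columns would need adjusting to whatever the annotation tool actually writes.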



        • #5
          Originally posted by Liam_Gallagher
          Another question: how can I annotate a BED file containing a list of genomic coordinates with the corresponding gene symbols? I tried the UCSC Genome Browser (Tables section), but I could only annotate with UCSC gene names, not with standard names (like HUGO gene symbols). Can BEDTools do it? Thanks!
          BEDOPS is another suite of tools for manipulating BED data.

          You can use the bedmap tool to annotate genomic regions with IDs or other data from other sets (gene names, etc.).

          As an example, if you have regions in a sorted file called Regions.bed and your genes in a sorted file called Genes.bed (where gene IDs are in the fourth column, per UCSC specification), the file AnnotatedRegions.bed will contain your answer:

          Code:
          $ bedmap --echo --echo-map-id --delim '\t' Regions.bed Genes.bed > AnnotatedRegions.bed
          The only requirement is that the inputs are sorted. Use the sort-bed utility for this purpose, e.g.:

          Code:
          $ sort-bed UnsortedRegions.bed > SortedRegions.bed
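
For anyone who wants to see the idea behind the bedmap call without installing BEDOPS, here is a small pure-Python sketch. It is only an illustration, not how bedmap is implemented: it assumes standard BED coordinates (0-based, half-open), treats any overlap of at least one base as a hit, and joins multiple gene IDs with a semicolon. The function names and gene IDs are hypothetical.

```python
from collections import defaultdict


def parse_bed(lines):
    """Parse BED lines into (chrom, start, end, remaining fields) tuples."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        records.append((fields[0], int(fields[1]), int(fields[2]), fields[3:]))
    return records


def annotate_regions(regions, genes):
    """Append the IDs of overlapping genes (4th BED column) to each region."""
    genes_by_chrom = defaultdict(list)
    for chrom, start, end, rest in genes:
        gene_id = rest[0] if rest else "."
        genes_by_chrom[chrom].append((start, end, gene_id))
    for chrom in genes_by_chrom:
        genes_by_chrom[chrom].sort()  # by start, then end, then ID

    annotated = []
    # Sort regions by chromosome name, then numeric start and end.
    for chrom, start, end, rest in sorted(regions, key=lambda r: r[:3]):
        # Half-open intervals overlap when each starts before the other ends.
        ids = [gid for g_start, g_end, gid in genes_by_chrom.get(chrom, [])
               if g_start < end and start < g_end]
        region_text = "\t".join([chrom, str(start), str(end)] + rest)
        annotated.append(region_text + "\t" + ";".join(ids))
    return annotated


if __name__ == "__main__":
    # Placeholder data; real input would be read from the BED files.
    regions = parse_bed(["chr1\t150\t250\n", "chr2\t10\t20\n"])
    genes = parse_bed(["chr1\t100\t200\tGENE_A\n", "chr1\t240\t400\tGENE_B\n",
                       "chr2\t20\t30\tGENE_C\n"])
    for line in annotate_regions(regions, genes):
        print(line)
```

This scans every gene on a chromosome for each region, which is fine for a short list of regions but slow on whole-genome inputs. Tools that require sorted input can avoid that rescanning.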


          • #6
            Thank you very much for your suggestions. They are very helpful!
