  • Variant Effect Predictor Line Count

    Hi all,
    I am generating a VCF file and then running the Variant Effect Predictor (VEP) tool on it. This gives me a new text file with a list of variants and their potential effects. One of the columns holds the effect, such as intronic change, exonic, etc.

    I wanted to get a count of the lines that have, say, exonic in a certain column of the text file. Ideally, I would like a list of all variation types and the number of lines for each, e.g.

    Exonic = 200
    Intronic = 600
    ...
    ...

    If that's too complicated, then I could simply count a single type and run it multiple times.

    Thanks in advance.
    A

  • #2
    grep -v \# VEP_Annotation_File.ann | awk '{print $14}' | awk '{count[$1]++} END {for(j in count) print count[j], j}' | sort -nr

    Works for me - but we have a modified VEP, so I'm not sure the column number ($14) is the same in your case.

    Output:

    410 INTRONIC
    277 DOWNSTREAM
    138 UPSTREAM
    119 3PRIME_UTR
    99 WITHIN_NON_CODING_GENE,INTRONIC
    51 INTERGENIC
    46 NMD_TRANSCRIPT,INTRONIC
    42 REGULATORY_REGION
    28 WITHIN_NON_CODING_GENE
    24 NON_SYNONYMOUS_CODING
    15 5PRIME_UTR
    9 SPLICE_SITE,INTRONIC
    4 SYNONYMOUS_CODING
    3 NMD_TRANSCRIPT,3PRIME_UTR
    3 ESSENTIAL_SPLICE_SITE
    2 NMD_TRANSCRIPT,SYNONYMOUS_CODING
    2 CODING_UNKNOWN
    1 STOP_GAINED
    1 SPLICE_SITE,WITHIN_NON_CODING_GENE,INTRONIC
    Last edited by Bukowski; 08-07-2012, 01:04 AM.
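
If the column position differs between VEP builds, one way around hard-coding `$14` is to look the column up by its header name. The following is only a sketch under assumptions not confirmed in this thread: the file is tab-delimited, its header line starts with a single `#` followed directly by the first column name, and the column of interest is called `Consequence`. The function name `count_by_column` is made up for illustration.

```shell
#!/bin/sh
# count_by_column FILE COLUMN_NAME   (hypothetical helper, not part of VEP)
# Tallies the values found in the named column of a tab-delimited annotation file.
# Assumption: the header line starts with one "#" and then the first column name.
count_by_column() {
    awk -F'\t' -v name="$2" '
        # Header line: one leading "#"; strip it and find the wanted column.
        /^#[^#]/ {
            sub(/^#/, "", $1)
            for (i = 1; i <= NF; i++) if ($i == name) col = i
            next
        }
        /^#/ { next }              # skip any other comment or "##" metadata lines
        col  { count[$col]++ }     # tally data lines once the column is known
        END  { for (k in count) print count[k], k }
    ' "$1" | sort -k1,1nr -k2,2    # highest count first, then by name
}
```

It would be called as `count_by_column VEP_Annotation_File.ann Consequence`. If the header does not contain the given name, nothing is counted and the output is empty, which is a hint to check the header by eye.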


    • #3
      worked

      hi there,
      thank you, that's awesome, it worked.

      In one of the columns the chromosomal location is given as 1:1000 (for example). Can this script be tweaked slightly so that I get a count for each chromosome, independent of the consequence type?

      For e.g.

      chromosome 1 number of consequences
      chromosome 2 number of consequences

      thanks again.
      a
      Last edited by ashkot; 08-07-2012, 03:48 PM.


      • #4
        You could just split your input file up into chromosomes and run it over each one couldn't you? It's all trivially achieved with a bit of shell scripting.
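
As a rough sketch of that idea without physically splitting the file, awk can cut the chromosome out of the location field and count lines per chromosome in one pass. The assumptions here are not confirmed by the thread: the file is tab-delimited, the location looks like `1:1000`, and it sits in column 2 unless another column number is passed. The function name `count_by_chromosome` is hypothetical.

```shell
#!/bin/sh
# count_by_chromosome FILE [LOCATION_COLUMN]   (hypothetical helper)
# Counts data lines per chromosome, ignoring the consequence type.
# Assumption: the location field looks like "1:1000" and is in column 2
# unless a different column number is given as the second argument.
count_by_chromosome() {
    awk -F'\t' -v col="${2:-2}" '
        /^#/ { next }                  # skip header and comment lines
        {
            split($col, loc, ":")      # "1:1000" gives loc[1] = "1"
            count[loc[1]]++
        }
        END { for (chr in count) print chr "\t" count[chr] }
    ' "$1" | sort -k2,2nr              # chromosome with the most lines first
}
```

Calling `count_by_chromosome VEP_Annotation_File.ann` prints one line per chromosome: the chromosome name, a tab, then the number of annotation lines. To get consequence counts within a single chromosome instead, the same `split` could be used as a filter in front of the counting step from post #2.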
