  • problem with auto_annovar

    I am trying to analyze some exome data using the auto_annovar program. It works well up to step 8. At step 8 it generates a file and then stops. During step 8 it says "fGrep: Writing output", and the final message is "fGrep : Write error". The step 8 file contains the variants but no gene names,
    so I am not able to map the variants to the relevant genes. I checked my humandb database and it seems to contain all the relevant files. Can anyone please help me with this? Thanks a lot.

  • #2
    Hello,

    I am having an issue with the auto_annovar script too.
    I have been using annovar for a few months. Until now, I have used the annotate_variation.pl script, running it separately with different parameters (-geneanno, -regionanno, -filter, ...).
    Now I would like to try the auto_annovar script, but I am stuck at the first step:

    perl auto_annovar.pl -model recessive $dir/$file humandb -build hg19 -step 1
    NOTICE: the --ver1000g argument is set as '1000g2010nov' by default
    Error: the required database file humandb/hg19_ALL.sites.2010_11.txt does not exist. Please download it via -downdb argument by annotate_variation.pl.
    I cannot find out what this hg19_ALL.sites.2010_11.txt database is... I checked the script and it is not mentioned in it.
    I tried to download it:
    Code:
    perl annotate_variation.pl -downdb hg19_ALL.sites.2010_11.txt humandb -build hg19
    NOTICE: Downloading annotation database ftp://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/hg19_ALL.sites.2010_11.txt.txt.gz ... Failed
    WARNING: Some files cannot be downloaded, including ftp://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/hg19_ALL.sites.2010_11.txt.txt.gz
    
    perl annotate_variation.pl -downdb ALL.sites.2010_11.txt humandb -build hg19
    NOTICE: Downloading annotation database ftp://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/ALL.sites.2010_11.txt.txt.gz ... Failed
    WARNING: Some files cannot be downloaded, including ftp://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/ALL.sites.2010_11.txt.txt.gz
    Does anyone know what this file is and where I can find it?

    Thanks,
    Jane

    PS: I am using the March 2012 version.


    • #3
      Try including "-webfrom annovar" when downloading. Also, try "1000g2010nov" instead of "hg19_ALL.sites.2010_11.txt".
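
To make the suggestion above concrete, here is a minimal sketch of how one might check which 1000 Genomes file a given --ver1000g keyword corresponds to before running auto_annovar.pl. The helper names `expected_1000g_file` and `check_1000g_db` are hypothetical. The 2010_11 file name comes from the error message in this thread; the 2012_04 name is an assumption based on ANNOVAR's usual naming. The download command is only printed, not run.

```shell
#!/bin/sh
# Sketch only: the keyword-to-file mapping is an assumption, not taken
# from the auto_annovar.pl source.

# Print the 1000 Genomes file name expected for a build and --ver1000g keyword.
expected_1000g_file() {
    build="$1"
    keyword="$2"
    case "$keyword" in
        1000g2010nov) echo "${build}_ALL.sites.2010_11.txt" ;;
        1000g2012apr) echo "${build}_ALL.sites.2012_04.txt" ;;
        *) echo "unknown keyword: $keyword" >&2; return 1 ;;
    esac
}

# Report whether the file exists and is non-empty in the database directory.
# If it is missing, print (but do not run) a download command using the
# keyword and "-webfrom annovar".
check_1000g_db() {
    dbdir="$1"
    build="$2"
    keyword="$3"
    file=$(expected_1000g_file "$build" "$keyword") || return 1
    if [ -s "$dbdir/$file" ]; then
        echo "found: $dbdir/$file"
    else
        echo "missing: $dbdir/$file"
        echo "try: perl annotate_variation.pl -downdb -webfrom annovar $keyword $dbdir -buildver $build"
        return 1
    fi
}
```

For example, `check_1000g_db humandb hg19 1000g2010nov` would report whether humandb/hg19_ALL.sites.2010_11.txt is present, which is the file the error message complains about.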


      • #4
        Thank you for your answer; your remark put me on the right track. I had downloaded 1000g2012apr and not 1000g2010nov, so I only needed to change the name inside the script.

        I managed to run the first three steps, but I don't really understand the outputs. I get files for almost all the steps (step 4, 7, 8, 9) and I don't know why; maybe they are created by default. Moreover, I cannot open the .genelist file, which seems to be the final file.

        perl auto_annovar.pl -model recessive $dir/$file humandb -build hg19 -step 1-3
        NOTICE: the --ver1000g argument is set as '1000g2012apr' by default

        NOTICE: Running step 1 with system command <perl annotate_variation.pl -geneanno -buildver hg19 -dbtype refgene -outfile GAR/mutect_file.step1 GAR/mutect_file humandb>
        NOTICE: Reading gene annotation from humandb/hg19_refGene.txt ... Done with 40514 transcripts (including 6759 without coding sequence annotation) for 23468 unique genes
        NOTICE: Reading FASTA sequences from humandb/hg19_refGeneMrna.fa ... Done with 63 sequences
        WARNING: A total of 271 sequences will be ignored due to lack of correct ORF annotation
        NOTICE: Finished gene-based annotation on 60 genetic variants in GAR/mutect_file
        NOTICE: Output files were written to GAR/mutect_file.step1.variant_function, GAR/mutect_file.step1.exonic_variant_function

        NOTICE: Running step 2 with system command <perl annotate_variation.pl -regionanno -dbtype mce46way -buildver hg19 -outfile GAR/mutect_file.step2 GAR/mutect_file.step2.varlist humandb>
        NOTICE: Reading annotation database humandb/hg19_phastConsElements46way.txt ... Done with 5163775 regions
        NOTICE: Finished region-based annotation on 26 genetic variants in GAR/mutect_file.step2.varlist
        NOTICE: Output files were written to GAR/mutect_file.step2.hg19_phastConsElements46way

        NOTICE: Running step 3 with system command <perl annotate_variation.pl -regionanno -dbtype segdup -buildver hg19 -outfile GAR/mutect_file.step3 GAR/mutect_file.step3.varlist humandb>
        NOTICE: Reading annotation database humandb/hg19_genomicSuperDups.txt ... Done with 51599 regions
        NOTICE: Finished region-based annotation on 24 genetic variants in GAR/mutect_file.step3.varlist
        NOTICE: Output files were written to GAR/mutect_file.step3.hg19_genomicSuperDups

        NOTICE: Running step 8 with system command <fgrep -f GAR/mutect_file.step8.varlist GAR/mutect_file.step1.exonic_variant_function | cut -f 2- > GAR/mutect_file.step8;cut -f 3- GAR/mutect_file.step8 > GAR/mutect_file.step8.temp;fgrep -v -f GAR/mutect_file.step8.temp GAR/mutect_file.step8.varlist > GAR/mutect_file.step8.temp1;fgrep -f GAR/mutect_file.step8.temp1 GAR/mutect_file.step1.variant_function >> GAR/mutect_file.step8;>

        NOTICE: a list of potentially important genes and the number of variants in them are written to GAR/mutect_file.genelist
        NOTICE: Consider filter out the list of dispensable genes from the GAR/mutect_file.genelist file to identify the final candidate gene list.
        What do you get and is the .genelist file the final result?
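
The step 8 command in the log above is dense, so here is a toy reconstruction of what it appears to do, using `grep -F` (equivalent to `fgrep`). All file contents, gene names and the transcript ID are invented for illustration, and the column layout is an assumption; real ANNOVAR files may carry more columns. The point of the sketch is that the gene names in the step 8 file come from the two step 1 files.

```shell
#!/bin/sh
# Toy reconstruction of the step 8 pipeline; all data below is made up.
demo_dir=$(mktemp -d)
exonic="$demo_dir/toy.step1.exonic_variant_function"
allfunc="$demo_dir/toy.step1.variant_function"
varlist="$demo_dir/toy.step8.varlist"
out="$demo_dir/toy.step8"

# Exonic annotation: line id, consequence, gene/transcript detail, variant.
printf 'line1\tnonsynonymous SNV\tGENE1:NM_000001:exon2:c.A10G:p.T4A,\tchr1\t100\t100\tA\tG\n' > "$exonic"

# Gene annotation for every variant: region, gene name, variant.
printf 'exonic\tGENE1\tchr1\t100\t100\tA\tG\n' > "$allfunc"
printf 'splicing\tGENE2\tchr2\t200\t200\tC\tT\n' >> "$allfunc"
printf 'intronic\tGENE3\tchr3\t300\t300\tG\tA\n' >> "$allfunc"

# Variants assumed to have survived the earlier filtering steps.
printf 'chr1\t100\t100\tA\tG\n' > "$varlist"
printf 'chr2\t200\t200\tC\tT\n' >> "$varlist"

# 1. Keep exonic annotations of surviving variants, dropping the line id.
grep -F -f "$varlist" "$exonic" | cut -f 2- > "$out"
# 2. Collect the variants that already have an exonic annotation.
cut -f 3- "$out" > "$out.temp"
# 3. Surviving variants without an exonic annotation (here, the splicing one).
grep -F -v -f "$out.temp" "$varlist" > "$out.temp1"
# 4. Append their annotation, including the gene name, from variant_function.
grep -F -f "$out.temp1" "$allfunc" >> "$out"

cat "$out"
```

With this toy input the result has two lines, one for GENE1 (from the exonic file) and one for GENE2 (from the variant_function file); the filtered-out GENE3 variant does not appear. If a real step 8 file lacks gene names, checking that the step 1 output files are complete would be a reasonable first step under these assumptions.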


        • #5
          I don't actually use the auto_annovar script; I just use the summarize.pl script and do my own filtering from that, so I'm not sure exactly what you should get.


          • #6
            OK, thanks. I will try the summarize.pl script, since I don't really want to follow all the steps of auto_annovar.pl.
