  • Error: tagBam call failed when running pe_utils.py --compute-insert-len

    Dear All:

    I am using MISO to test for differential exon usage between a control and a treatment group. I got an error when computing the insert length distribution using pe_utils.py --compute-insert-len. I list the steps I used below:

    1. sort the BAM file from TopHat (by coordinate):
    samtools sort control.bam control_sorted

    2. index the BAM file:
    samtools index control_sorted.bam control_sorted.bai

    3. run pe_utils.py:
    python pe_utils.py --compute-insert-len control.bam /directories/exons/Homo_sapiens.GRCh37.65.min_1000.const_exons.gff --output-dir /directories/insert-dist/

    After the command above, I got the error message:

    Preparing to call bedtools 'tagBam'
    tagBam -i control.bam -files /directories/exons/Homo_sapiens.GRCh37.65.min_1000.const_exons.gff -labels gff -intervals -f 1 | samtools view - -h | egrep '^@|:gff:' | samtools view - -Shb -o /directories/insert-dist/bam2gff_Homo_sapiens.GRCh37.65.min_1000.const_exons.gff/control.bam
    [samopen] SAM header is present: 25 sequences.
    [sam_read1] reference 'ID:TopHat CL:/informatics/tools/Linux-AS5/bin/tophat -o Lane3 -g 1 --coverage-search --microexon -r 100 --phred64-quals --library-type fr-unstranded -p 4 -G gene_models/Homo_sapiens.GRCh37.72_norm.gtf --transcriptome-index=gene_models/transcripts /directories/Genomes/NCBI_Jul-09-2012/Human/bowtie/human_ref_genome Lane3_1.fq.gz Lane3_2.fq.gz VN:1.4.1
    ' is recognized as '*'.
    [main_samview] truncated file.
    Traceback (most recent call last):
    File "/pe_utils.py", line 520, in <module>
    main()
    File "pe_utils.py", line 517, in main
    sd_max=sd_max)
    File "pe_utils.py", line 271, in compute_insert_len
    output_dir)
    File "exon_utils.py", line 185, in map_bam2gff
    raise Exception, "Error: tagBam call failed."
    Exception: Error: tagBam call failed.

    I used Homo_sapiens.GRCh37.72_norm.gtf from Ensembl as the annotation file when preparing my data, but I downloaded and unzipped the "Human genome (hg19) alternative events v2.0" annotation from the MISO website. I saw that it is based on Homo_sapiens.GRCh37.65. Is this a version problem? If so, could anyone provide the latest GFF3 file? Thank you for your suggestions!

  • #2
    I'm actually having the same problem with MISO. I also haven't been able to find anything as to why this happens, so if anyone could give us some insight, that would be great.

    For now I'm trying to do the same thing using Bowtie and Picard-tools (outlined here: http://vinaykmittal.blogspot.ca/2012...or-paired.html ) but I'm sure it would be much easier using MISO's function...

    • #3
      Originally posted by space_monkey:
      I'm actually having the same problem with MISO. I also haven't been able to find anything as to why this happens, so if anyone could give us some insight, that would be great.

      For now I'm trying to do the same thing using Bowtie and Picard-tools (outlined here: http://vinaykmittal.blogspot.ca/2012...or-paired.html ) but I'm sure it would be much easier using MISO's function...
      Hi @space_monkey:

      Here are some comments. I contacted the authors and sent them part of my output; their reply is below:

      It does look like a headers mismatch then. Your BAM file contains "chr" style chromosomes (e.g. "chr10" and not "10"). I believe your GFF, /directories/exons/Homo_sapiens.GRCh37.65.min_1000.const_exons.gff, is from Ensembl which would not contain chr-prefixes. Just look in that gff file and see what the chromosome entries are like. If they don't have chr, the operation will fail. All you need to do is generate a constitutive exons file from a UCSC gff which contains chromosome headers that match your .bam file.

      See:



      If you're using hg19, you can use this GFF:



      Use our exon_utils program to generate constitutive exons from this file and then rerun pe_utils with that, instead of the GRCh37 gff file.
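
The check described in the reply above (compare the chromosome names in the GFF with those in the BAM header) can be sketched in a few lines of Python. This is only a minimal illustration, not part of MISO: the function names `gff_seqids`, `sam_header_seqids` and `compare_seqids` are hypothetical, and it assumes the BAM header has already been dumped to text, for example with `samtools view -H`.

```python
def gff_seqids(gff_lines):
    """Collect sequence names from column 1 of GFF lines (comments skipped)."""
    names = set()
    for line in gff_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        names.add(line.split("\t", 1)[0])
    return names


def sam_header_seqids(header_lines):
    """Collect the SN: values from the @SQ lines of a SAM/BAM text header."""
    names = set()
    for line in header_lines:
        if not line.startswith("@SQ"):
            continue
        for field in line.rstrip("\n").split("\t")[1:]:
            if field.startswith("SN:"):
                names.add(field[3:])
    return names


def compare_seqids(gff_lines, header_lines):
    """Report which sequence names are shared and which appear on one side only."""
    gff = gff_seqids(gff_lines)
    bam = sam_header_seqids(header_lines)
    return {"shared": gff & bam, "gff_only": gff - bam, "bam_only": bam - gff}


if __name__ == "__main__":
    # Toy input (made up for illustration): a "chr"-style BAM header
    # against an Ensembl-style GFF line without the prefix.
    header = ["@HD\tVN:1.0\tSO:coordinate\n", "@SQ\tSN:chr10\tLN:135534747\n"]
    gff = ["##gff-version 3\n", "10\tensGene\texon\t100\t200\t.\t+\t.\tID=exon1\n"]
    print(compare_seqids(gff, header))
```

If `shared` comes back empty while `gff_only` holds names like `10` and `bam_only` holds names like `chr10`, the two files use different naming styles, which is the situation the authors describe. On real data, the header lines could come from a file written by `samtools view -H control_sorted.bam`, and the GFF lines from the constitutive exons file opened as text.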

      I used ensGene.gff3 and it works. However, I still cannot get the results (not sure why...)
