
  • CASAVA 1.8 BAM conversion to GATK BAM

    I am trying to convert BAM files generated by CASAVA into BAM files that GATK recognizes. I have asked Illumina but have no solution yet. I am using CASAVA 1.8 to align and generate BAM files (one BAM file for all chromosomes). Illumina gave me a script that was supposed to handle the conversion, but as far as I can see it produces the same file as running Picard ReorderSam followed by Picard AddOrReplaceReadGroups to get the correct contig order and add read groups. This seems to have succeeded in terms of contig order and read groups. However, when I run GATK (DepthOfCoverage) on this 'converted' BAM file, I now get the error message:

    ##### ERROR MESSAGE: Badly formed genome loc: Contig 1_gl000191_random given as location, but this contig isn't present in the Fasta sequence dictionary.

    I used the same reference FASTA file for the CASAVA alignment as the one I pointed GATK to (and Picard where relevant), so I don't understand why these additional, incompatible contigs are appearing. Perhaps the workflow depends on a particular reference file for alignment, other than the one I am using (human_g1k_v37.fasta, as recommended by GATK)?

    Any help would be greatly appreciated.
    Last edited by kingsalex; 02-14-2012, 05:28 AM. Reason: clarity
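
One way to narrow down an error like this is to compare the contig names in the BAM header against those in the reference sequence dictionary. The snippet below is only a rough sketch: it assumes the header and the .dict file have already been dumped to plain SAM-style text, and the function names (`sq_names`, `missing_contigs`) and the two example headers are made up for illustration.

```python
def sq_names(header_text):
    """Return contig names (SN: fields) from the @SQ lines of SAM-style header text."""
    names = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
    return names


def missing_contigs(bam_header_text, dict_text):
    """List contigs named in the BAM header that the sequence dictionary lacks."""
    known = set(sq_names(dict_text))
    return [name for name in sq_names(bam_header_text) if name not in known]


# Illustrative header text only, not taken from the poster's files.
bam_header = (
    "@HD\tVN:1.0\tSO:coordinate\n"
    "@SQ\tSN:1\tLN:249250621\n"
    "@SQ\tSN:1_gl000191_random\tLN:106433\n"
)
ref_dict = (
    "@HD\tVN:1.0\n"
    "@SQ\tSN:1\tLN:249250621\n"
    "@SQ\tSN:GL000191.1\tLN:106433\n"
)

print(missing_contigs(bam_header, ref_dict))  # ['1_gl000191_random']
```

Any name this reports is one GATK would reject, which would suggest the BAM was aligned against a reference whose contigs are named differently from those in the FASTA given to GATK.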

  • #2
    Have you generated the dict file from your fasta? Make sure the fasta does not have empty lines between the chromosomes.
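
The blank-line check suggested above can be done with a few lines of Python. This is a minimal sketch, not a validated tool: `scan_fasta` is a hypothetical helper name, and the small in-memory FASTA stands in for the real reference file, which you would open with `open(...)` instead.

```python
import io


def scan_fasta(handle):
    """Return (contig names, line numbers of blank lines) for a FASTA file handle."""
    names, blank_lines = [], []
    for lineno, line in enumerate(handle, start=1):
        stripped = line.strip()
        if not stripped:
            # Empty or whitespace-only line, e.g. between two chromosomes.
            blank_lines.append(lineno)
        elif stripped.startswith(">"):
            # The contig name is the first whitespace-delimited token after '>'.
            parts = stripped[1:].split()
            names.append(parts[0] if parts else "")
    return names, blank_lines


# Tiny made-up FASTA with a blank line between the two records.
example = io.StringIO(">1 dna\nACGT\n\n>2 dna\nGGCC\n")
names, blanks = scan_fasta(example)
print(names)   # ['1', '2']
print(blanks)  # [3]
```

If the blank-line list is non-empty, remove those lines and regenerate the index and the .dict file (for example with Picard CreateSequenceDictionary) before rerunning GATK. The contig names returned here can also be compared against the names in the BAM header.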
