  • GATK newbie question

    I'm trying to run the GATK Unified Genotyper (UG) on a set of BAM files that I have coordinate sorted, but when I run it, it gives me this error message:

    "Input files reads and reference have incompatible contigs: Order of contigs differences, which is unsafe."

    From what I've read, I believe I need to re-order my reference FASTA file to match the contig order in the header of my coordinate-sorted BAM files, but I'm not sure how to do that. Is there code somewhere that will let me re-order my reference file to match a given BAM file's order?
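
    The contig order that GATK compares against comes from the @SQ lines of the BAM header. A minimal sketch of pulling that order out, assuming the header has already been dumped to text (for example with `samtools view -H`); the function name `contig_order_from_sam_header` is made up for illustration:

```python
def contig_order_from_sam_header(header_text):
    """Return contig names in the order of the @SQ lines of a SAM/BAM header.

    header_text is the plain-text header, one record per line, with
    tab-separated fields (e.g. "@SQ<TAB>SN:contig1<TAB>LN:1234").
    """
    names = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue  # skip @HD, @RG, @PG, @CO lines
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])  # strip the "SN:" tag
                break
    return names
```

    The resulting list is the order the reference records would have to follow for the two files to agree.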

    I'm using a reference that comprises ~5k small genomes, some of which are in pieces (~188k sequence records in total). The file size is 7.3 GB.

    I also think that maybe (probably?) I need to pull out specific references and run the UG on one reference at a time. It's a metagenomic project, and I was hoping to get results for the whole thing at once, but that might not be realistic. Even if I pull out single, well-covered genome references, some of them will be in hundreds of pieces themselves, so I'd still need a way to order my reference. I could probably write something in Perl to do this, but I'm not too strong a coder, and I'm worried that I'd have memory issues trying to hash 188k sequences and juggle them around.
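
    On the memory worry: the sequences themselves do not have to be held in a hash. A rough sketch, assuming a plain uncompressed FASTA whose record names (the first word after ">") match the names in the BAM header; `index_fasta` and `reorder_fasta` are hypothetical helper names. Only the byte offsets of each record are kept in memory, and records are copied one at a time:

```python
def index_fasta(path):
    """Map each record name to its (start, end) byte offsets in the file."""
    index = {}
    name = None
    start = 0
    pos = 0
    with open(path, "rb") as fh:
        for line in fh:
            if line.startswith(b">"):
                if name is not None:
                    index[name] = (start, pos)  # close the previous record
                parts = line[1:].split()
                name = parts[0].decode() if parts else ""
                start = pos
            pos += len(line)
    if name is not None:
        index[name] = (start, pos)  # close the last record
    return index


def reorder_fasta(src, dst, ordered_names):
    """Write the records of src to dst in the order given by ordered_names."""
    index = index_fasta(src)
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        for name in ordered_names:
            start, end = index[name]  # KeyError if the name is not in src
            fin.seek(start)
            chunk = fin.read(end - start)
            fout.write(chunk)
            if not chunk.endswith(b"\n"):
                fout.write(b"\n")  # source's final record may lack a newline
```

    With 188k records the index is just 188k names plus two integers each, which is small compared with the 7.3 GB file. A reordered reference would also need a fresh index (.fai) and sequence dictionary (.dict) before GATK will accept it.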

    Can anyone offer me some guidance on this?

  • #2
    Is this what you are looking for? Picard's ReorderSam


    You can reorder your reads to match the order of your reference sequence.


    • #3
      I will take a look at that, thanks for the suggestion. I had thought my alignment SAM file needed to be sorted in coordinate order, so I was thinking that I needed to re-order my reference instead.

      I've been having more luck just extracting single references and working on one at a time, though. When I do that, the reference FASTA is small enough that a simple Perl script can order it. I may just give up on trying to call SNPs on my whole subject database in a single go.


      • #4
        The reordering can be done only after sorting: you will still have to sort your reads (and reference) by coordinates first. I guess that doesn't solve the reference sorting problem you are facing.
