Unconfigured Ad

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts
  • nimmi
    Member
    • Jul 2010
    • 15

    bfast localalign error

    Hi,

    I am trying to map exome data using bfast. I received *.sam files from our collaborates. Here's a summary of my method:

    1. Converted *.sam to *.fastq using PICARD
    2. Converted hg19.fa to color space - hg19.fa.cs.brg
    3. Created index for reference - hg19.fa.cs.1.1.bif
    4. Search indexes successful - *.bmf
    5. Perform local alignment - gave me an error! Thanks to Seqanswers forums, I realized that I have to convert reference to nucleotide space as well. So, now I have hg19.fa.nt.brg
    6. Tried to run local alignment again.....but got "AlignColorSpaceGappedConstrained" error!!!

    ************************************************************
    Checking input parameters supplied by the user ...
    Validating fastaFileName chr_all.fa.
    Validating matchFileName03C14605A_bfastMatches_hg19.bmf.
    **** Input arguments look good! *****
    ************************************************************
    ************************************************************
    Printing Program Parameters:
    programMode: [ExecuteProgram]
    fastaFileName: chr_all.fa
    matchFileName: 03C14605A_bfastMatches_hg19.bmf
    scoringMatrixFileName: [Not Using]
    ungapped: [Not Using]
    unconstrained: [Not Using]
    space: [Color Space]
    startReadNum: 1
    endReadNum: 2147483647
    offsetLength: 20
    maxNumMatches: 384
    avgMismatchQuality: 10
    numThreads: 1
    queueLength: 10000
    timing: [Not Using]
    ************************************************************
    ************************************************************
    Reading in reference genome from chr_all.fa.nt.brg.
    In total read 25 contigs for a total of 3095693983 bases
    ************************************************************
    ************************************************************
    Reading match file from 03C14605A_bfastMatches_hg19.bmf.
    ************************************************************
    Performing alignment...
    Reads processed: 0************************************************************
    In function "AlignColorSpaceGappedConstrained": Fatal Error[OutOfRange]. Message: read and reference did not match.
    ***** Exiting due to errors *****
    ************************************************************


    Thank you in advance for your help.

    Nimmi
  • drio
    Senior Member
    • Oct 2008
    • 323

    #2
    Can you show the files from within the reference directory as well as the exact commands you run?

    Is it possible you forgot to use -A1 to tell bfast to compute the local alignments in CS?
    -drd

    Comment

    • nimmi
      Member
      • Jul 2010
      • 15

      #3
      Here are the list of files in my directory:

      03C14605A.fastq
      chr_all.fa
      chr_all.fa.cs.brg
      chr_all.fa.cs.1.1.bif
      chr_all.fa.nt.brg
      03C14605A_bfastMatches_hg19.bmf

      Here's the command I used for local alignment:

      $/usr/local/bfast/bin/bfast localalign -f chr_all.fa -m 03C14605A_bfastMatches_hg19.bmf -A 1
      > 03C14605A_bfastMatches_hg19_localalign.baf

      Thank you very much

      Comment

      • nilshomer
        Nils Homer
        • Nov 2008
        • 1283

        #4
        Looks like the fastq is not in the proper format. You need to extract the color sequence and color qualities from the SAM file in the CS/CQ optional tags.

        Comment

        • nimmi
          Member
          • Jul 2010
          • 15

          #5
          Thanks Nils for your response. Here's a few lines from my *.sam file:

          1462_1719_1204_F3 0 chr1 10006 0 50M * 0 0 CTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCT
          BABBBB@ABB=BAAB@<BBB;@=BB?;A@B><7=7B>;>@9B@<97+B84 NH:i:508 CS:Z:T22301002301002301002301002301002301002301002301002
          1649_965_405_F3 16 chr1 10014 0 50M * 0 0 AACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAA +9==%'4587'1+:7@28957<5=7;6@2:=BAB:<?@AB<B<AA?BB@@ NH:i:511 CS:Z:T00320010320010320010320010320010320010320010300010
          1433_953_927_F3 16 chr1 10026 0 50M * 0 0 AACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAA '%:7+(6-;>%:4(<<8:=-:@268%>>644)::7>9;>B@@2?@A@B?A NH:i:511 CS:Z:T00320010320010320010320010320010320010320010320010
          1326_1211_1996_F3 0 chr1 10029 0 50M * 0 0 CCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCC
          <96BA6>7<?>/;6<?1482:41-)+1+466-+4752/6/8!;11/-)>1 NH:i:517 CS:Z:T20230100230100230100230100230100230100230100230100
          1379_1501_1111_F3 16 chr1 10047 0 50M * 0 0 CCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCC
          :%6=%!-2+7!%==<)!:1:4+)--8B6914:>1:/46:+;=:=@<</'> NH:i:517 CS:Z:T10010320010320010320010320010320010320010320010320
          1608_1356_225_F3 0 chr1 10048 0 50M * 0 0 CTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCT
          ?>B>@@:AB9>=?><@6B?8<><B;<;;<B</<?4@:5929B56/2)@@) NH:i:508 CS:Z:T22301002301002301002301002301002301002301002301002

          Comment

          • nilshomer
            Nils Homer
            • Nov 2008
            • 1283

            #6
            If you have fragment reads, then the following perl code should do it. Pipe the SAM file into it (or use "samtools view <in.bam>" for a BAM file). For paired end or multi-end reads, it does get trickier and the below will not keep the pairings. Note that your input SAM file does not have the CQ tag, and thus color qualities are lost.


            Code:
            #!/bin/perl
            use strict;
            use warnings;
            while(defined(my $line = <STDIN>)) {
                if($line =~ m/^@/) {
                    next;
                }
                chomp($line);
                my $NAME = $line;
                my $CS = $line;
                my $CQ = "";
                $NAME =~ s/^([^\t]+)\t.*/$1/;
                $CS =~ s/.*CS:Z:([^\t]+).*/$1/;
                $CS =~ tr/ /4/;
                if($line =~ m/CQ:Z:([^\t]+)/) {
                    $CQ = $1;
                }   
                else {
                    $CQ = "!" x (length($CS)-1); 
                }   
                print STDOUT "\@$NAME\n$CS\n+\n$CQ\n";
            }
            Nils

            Comment

            • nimmi
              Member
              • Jul 2010
              • 15

              #7
              Thanks Nils for the perl script.

              Just to summarize........I converted *.sam file to *.fastq using PICARD and this fastq file was in nucleotide space. When I ran localalign in bfast I got the "AlignColorSpaceGappedConstrained" error. Then I used perl script posted by Nils to convert *.sam to *.fastq in colorspace. Then re-ran the match command (*.bmf) followed by localalign and everything works fine!

              Thank you all for your suggestions and responses

              Comment

              Latest Articles

              Collapse

              • SEQadmin2
                From Collection to Sequencing: Why Sample Preparation and Preservation Define Sequencing Data
                by SEQadmin2


                Data variability is still an issue in sequencing technologies despite the advances in reproducibility and accuracy of these platforms. But the problem does not originate in the sequencing itself, but in the previous steps, before the sample reaches the sequencer.


                The first step is collection, followed by preservation and sample preparation for analysis. Most scientists overlook those steps, but not being careful might just be skewing the experiment’s results.
                ...
                06-02-2026, 10:05 AM
              • SEQadmin2
                Single-Cell Sequencing at an Inflection Point: Early Impacts of New Platforms and Emerging Trends
                by SEQadmin2


                With the launch of new single-cell sequencing platforms in 2026, the field stands at an exciting inflection point. This article surveys the most impactful advances in the field and discusses how they’re reshaping research in cancer, immunology, and beyond.


                Introduction

                Single-cell sequencing technologies have undergone remarkable advances over the past decade, transitioning from low-throughput experimental approaches to highly scalable platforms capable of...
                05-22-2026, 06:42 AM
              • SEQadmin2
                Environmental Genomics in the Age of NGS: From Microbes to Conservation Strategies
                by SEQadmin2

                Studying ecosystems means dealing with complex, multi-species communities that are hard to observe at scale. This complexity, however, hides many important questions to be answered, from how biogeochemical cycles work and how climate change can affect species distribution to how conservation strategies can work best.


                Genomics, particularly since the expansion of NGS, has transformed ecosystem ecology. By sequencing environmental DNA, we can now assess biodiversity without direct...
                05-06-2026, 09:04 AM

              ad_right_rmr

              Collapse

              News

              Collapse

              Topics Statistics Last Post
              Started by SEQadmin2, Yesterday, 08:59 AM
              0 responses
              12 views
              0 reactions
              Last Post SEQadmin2  
              Started by SEQadmin2, 06-02-2026, 12:03 PM
              0 responses
              21 views
              0 reactions
              Last Post SEQadmin2  
              Started by SEQadmin2, 06-02-2026, 11:40 AM
              0 responses
              18 views
              0 reactions
              Last Post SEQadmin2  
              Started by SEQadmin2, 05-28-2026, 11:40 AM
              0 responses
              31 views
              0 reactions
              Last Post SEQadmin2  
              Working...