Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • bfast localalign error

    Hi,

    I am trying to map exome data using bfast. I received *.sam files from our collaborates. Here's a summary of my method:

    1. Converted *.sam to *.fastq using PICARD
    2. Converted hg19.fa to color space - hg19.fa.cs.brg
    3. Created index for reference - hg19.fa.cs.1.1.bif
    4. Search indexes successful - *.bmf
    5. Perform local alignment - gave me an error! Thanks to Seqanswers forums, I realized that I have to convert reference to nucleotide space as well. So, now I have hg19.fa.nt.brg
    6. Tried to run local alignment again.....but got "AlignColorSpaceGappedConstrained" error!!!

    ************************************************************
    Checking input parameters supplied by the user ...
    Validating fastaFileName chr_all.fa.
    Validating matchFileName03C14605A_bfastMatches_hg19.bmf.
    **** Input arguments look good! *****
    ************************************************************
    ************************************************************
    Printing Program Parameters:
    programMode: [ExecuteProgram]
    fastaFileName: chr_all.fa
    matchFileName: 03C14605A_bfastMatches_hg19.bmf
    scoringMatrixFileName: [Not Using]
    ungapped: [Not Using]
    unconstrained: [Not Using]
    space: [Color Space]
    startReadNum: 1
    endReadNum: 2147483647
    offsetLength: 20
    maxNumMatches: 384
    avgMismatchQuality: 10
    numThreads: 1
    queueLength: 10000
    timing: [Not Using]
    ************************************************************
    ************************************************************
    Reading in reference genome from chr_all.fa.nt.brg.
    In total read 25 contigs for a total of 3095693983 bases
    ************************************************************
    ************************************************************
    Reading match file from 03C14605A_bfastMatches_hg19.bmf.
    ************************************************************
    Performing alignment...
    Reads processed: 0************************************************************
    In function "AlignColorSpaceGappedConstrained": Fatal Error[OutOfRange]. Message: read and reference did not match.
    ***** Exiting due to errors *****
    ************************************************************


    Thank you in advance for your help.

    Nimmi

  • #2
    Can you show the files from within the reference directory as well as the exact commands you run?

    Is it possible you forgot to use -A1 to tell bfast to compute the local alignments in CS?
    -drd

    Comment


    • #3
      Here are the list of files in my directory:

      03C14605A.fastq
      chr_all.fa
      chr_all.fa.cs.brg
      chr_all.fa.cs.1.1.bif
      chr_all.fa.nt.brg
      03C14605A_bfastMatches_hg19.bmf

      Here's the command I used for local alignment:

      $/usr/local/bfast/bin/bfast localalign -f chr_all.fa -m 03C14605A_bfastMatches_hg19.bmf -A 1
      > 03C14605A_bfastMatches_hg19_localalign.baf

      Thank you very much

      Comment


      • #4
        Looks like the fastq is not in the proper format. You need to extract the color sequence and color qualities from the SAM file in the CS/CQ optional tags.

        Comment


        • #5
          Thanks Nils for your response. Here's a few lines from my *.sam file:

          1462_1719_1204_F3 0 chr1 10006 0 50M * 0 0 CTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCT
          BABBBB@ABB=BAAB@<BBB;@=BB?;A@B><7=7B>;>@9B@<97+B84 NH:i:508 CS:Z:T22301002301002301002301002301002301002301002301002
          1649_965_405_F3 16 chr1 10014 0 50M * 0 0 AACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAA +9==%'4587'1+:7@28957<5=7;6@2:=BAB:<?@AB<B<AA?BB@@ NH:i:511 CS:Z:T00320010320010320010320010320010320010320010300010
          1433_953_927_F3 16 chr1 10026 0 50M * 0 0 AACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAA '%:7+(6-;>%:4(<<8:=-:@268%>>644)::7>9;>B@@2?@A@B?A NH:i:511 CS:Z:T00320010320010320010320010320010320010320010320010
          1326_1211_1996_F3 0 chr1 10029 0 50M * 0 0 CCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCC
          <96BA6>7<?>/;6<?1482:41-)+1+466-+4752/6/8!;11/-)>1 NH:i:517 CS:Z:T20230100230100230100230100230100230100230100230100
          1379_1501_1111_F3 16 chr1 10047 0 50M * 0 0 CCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCC
          :%6=%!-2+7!%==<)!:1:4+)--8B6914:>1:/46:+;=:=@<</'> NH:i:517 CS:Z:T10010320010320010320010320010320010320010320010320
          1608_1356_225_F3 0 chr1 10048 0 50M * 0 0 CTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCT
          ?>B>@@:AB9>=?><@6B?8<><B;<;;<B</<?4@:5929B56/2)@@) NH:i:508 CS:Z:T22301002301002301002301002301002301002301002301002

          Comment


          • #6
            If you have fragment reads, then the following perl code should do it. Pipe the SAM file into it (or use "samtools view <in.bam>" for a BAM file). For paired end or multi-end reads, it does get trickier and the below will not keep the pairings. Note that your input SAM file does not have the CQ tag, and thus color qualities are lost.


            Code:
            #!/bin/perl
            use strict;
            use warnings;
            while(defined(my $line = <STDIN>)) {
                if($line =~ m/^@/) {
                    next;
                }
                chomp($line);
                my $NAME = $line;
                my $CS = $line;
                my $CQ = "";
                $NAME =~ s/^([^\t]+)\t.*/$1/;
                $CS =~ s/.*CS:Z:([^\t]+).*/$1/;
                $CS =~ tr/ /4/;
                if($line =~ m/CQ:Z:([^\t]+)/) {
                    $CQ = $1;
                }   
                else {
                    $CQ = "!" x (length($CS)-1); 
                }   
                print STDOUT "\@$NAME\n$CS\n+\n$CQ\n";
            }
            Nils

            Comment


            • #7
              Thanks Nils for the perl script.

              Just to summarize........I converted *.sam file to *.fastq using PICARD and this fastq file was in nucleotide space. When I ran localalign in bfast I got the "AlignColorSpaceGappedConstrained" error. Then I used perl script posted by Nils to convert *.sam to *.fastq in colorspace. Then re-ran the match command (*.bmf) followed by localalign and everything works fine!

              Thank you all for your suggestions and responses

              Comment

              Latest Articles

              Collapse

              • seqadmin
                Strategies for Sequencing Challenging Samples
                by seqadmin


                Despite advancements in sequencing platforms and related sample preparation technologies, certain sample types continue to present significant challenges that can compromise sequencing results. Pedro Echave, Senior Manager of the Global Business Segment at Revvity, explained that the success of a sequencing experiment ultimately depends on the amount and integrity of the nucleic acid template (RNA or DNA) obtained from a sample. “The better the quality of the nucleic acid isolated...
                03-22-2024, 06:39 AM
              • seqadmin
                Techniques and Challenges in Conservation Genomics
                by seqadmin



                The field of conservation genomics centers on applying genomics technologies in support of conservation efforts and the preservation of biodiversity. This article features interviews with two researchers who showcase their innovative work and highlight the current state and future of conservation genomics.

                Avian Conservation
                Matthew DeSaix, a recent doctoral graduate from Kristen Ruegg’s lab at The University of Colorado, shared that most of his research...
                03-08-2024, 10:41 AM

              ad_right_rmr

              Collapse

              News

              Collapse

              Topics Statistics Last Post
              Started by seqadmin, Yesterday, 06:37 PM
              0 responses
              10 views
              0 likes
              Last Post seqadmin  
              Started by seqadmin, Yesterday, 06:07 PM
              0 responses
              9 views
              0 likes
              Last Post seqadmin  
              Started by seqadmin, 03-22-2024, 10:03 AM
              0 responses
              51 views
              0 likes
              Last Post seqadmin  
              Started by seqadmin, 03-21-2024, 07:32 AM
              0 responses
              67 views
              0 likes
              Last Post seqadmin  
              Working...
              X