  • nilshomer
    Nils Homer
    • Nov 2008
    • 1283

    BFAST - Alignment for ABI or Illumina sequencing - with qualities

    I wanted to let the community know about a new release of BFAST that now supports reads with quality scores.



    It is designed to handle Illumina and ABI SOLiD data on the human whole-genome resequencing scale (billions of reads). It is multi-threaded and can easily be parallelized on a cluster or run on your local desktop. You can tune it to handle large insertions or deletions (>10bp), SNPs, and even ABI SOLiD color errors in your alignments, all while maintaining speed and accuracy.
  • valeu
    Member
    • Sep 2008
    • 69

    #2
    Dear Nils,

    Can I run BFAST directly on GAII fastq data? They look like this:

    @HWUSI-EAS454:5:1:0:149
    ATTTCTCCACCTCCTCNCCCACCCCTTTTTTTTCCTTACTTCTTACTAAT
    +HWUSI-EAS454:5:1:0:149
    abaabaaaaab`baa]D\a]D]bbaaaZa][``aa_a]_ba_aa_aa``_
    @HWUSI-EAS454:5:1:0:314
    AGCCAGATCCTTACCCNCTCCACCTCTTTTTCTGTGTTTTATTTATGGTG
    +HWUSI-EAS454:5:1:0:314
    a_aaZS]Za__NDV_^DOYYMDZVYaY]Y___]ZWZZZ]QPNUNU^^QQR

    Thanks in advance,

    Valentina


    • valeu
      Member
      • Sep 2008
      • 69

      #3
      Dear Nils,

      While waiting for your answer, I ran BFAST on a very small subset of my mate-pair data without changing the format. I took only 42 reads (21 mate-pairs).

      These are Illumina data and they should be aligned to the genome like this:
      <-(50bp)--(3000bp)--(50bp)->

      This means that the left read should align to the Crick strand and the right one to the Watson strand, with a 3000bp spacer between them.

      So I ran "bfast match" and then "bfast localalign".

      The .bmf file was successfully created:

      In total, found matches for 20 out of 21 reads.
      ************************************************************
      Terminating successfully!
      ************************************************************

      but localalign reported an error (parameters "-l 3000 -L 3"):

      Performing alignment...
      Currently on:
      thread:1 [0]Assertion failed: 0 <= endRowStepOne && 0 <= endColStepOne, file AlignNTSpace.c, line 192
      Abort (core dumped)

      without "-L" I don't get an error:

      Outputted alignments for 20 reads.
      Outputted 1 reads for which there were no alignments.
      Outputting complete.
      ************************************************************
      Terminating successfully!
      ************************************************************

      So my question is: why do I get this error, and is the "-L 3" parameter I pass to localalign what I need for my mate-pairs?

      Also, is the "-f" option for mirroring or for the FASTA file?

      Thank you in advance,

      Valentina
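
The layout described in this post (left read on the Crick strand, right read on the Watson strand, roughly 3000bp between them) can be sketched as a simple check on a pair of alignments. This is only an illustration and is not part of BFAST: the `Hit` record, the `is_expected_mate_pair` function, and the 500bp tolerance are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One aligned read (hypothetical record, not a BFAST structure)."""
    pos: int      # leftmost reference position of the alignment
    length: int   # number of reference bases covered
    strand: str   # "+" for Watson (forward), "-" for Crick (reverse)


def is_expected_mate_pair(left: Hit, right: Hit,
                          spacer: int = 3000, tolerance: int = 500) -> bool:
    """Return True if the pair matches <-(read)--(spacer)--(read)->.

    The tolerance is an assumed value; a real library would need it
    estimated from the observed insert-size distribution.
    """
    # Left read must be on the Crick strand, right read on the Watson strand.
    if left.strand != "-" or right.strand != "+":
        return False
    # Distance between the end of the left read and the start of the right read.
    gap = right.pos - (left.pos + left.length)
    return abs(gap - spacer) <= tolerance


if __name__ == "__main__":
    left = Hit(pos=1000, length=50, strand="-")
    right = Hit(pos=4050, length=50, strand="+")
    print(is_expected_mate_pair(left, right))
```

A check like this is applied after alignment and says nothing about how localalign itself interprets "-l" and "-L".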


      • valeu
        Member
        • Sep 2008
        • 69

        #4
        Dear Nils,

        Sorry to bother you again,

        I have a question about the output format of BFAST (option "-O" in postprocess).

        -O 1 gives almost all the information I need, but it is not a standard format, so I would prefer output in GFF or SAM. But
        -O 2 does not give me the information about chromosome
        -O 3 does not give the information about strand...

        It would also be great to have the positions of mismatches and indels! At the moment I don't see how to get them...

        Thank you,

        Valentina


        • nilshomer
          Nils Homer
          • Nov 2008
          • 1283

          #5
          Originally posted by valeu
          Can I run BFAST directly on GAII fastq data?
          That should work, although the base qualities may not be scaled properly.
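
If the scaling does turn out to matter, one possible fix is to re-encode the quality line before alignment. The sketch below assumes the reads use an ASCII offset of 64 (the characters in the sample, such as `a` and `]`, are consistent with that) and that an offset of 33 is wanted downstream; both are assumptions, not something stated in this thread, and the function names are made up for the example. It also assumes strict four-line FASTQ records.

```python
def phred64_to_phred33(qual: str) -> str:
    """Re-encode a quality string from ASCII offset 64 to ASCII offset 33."""
    out = []
    for ch in qual:
        q = ord(ch) - 64  # numeric quality under the assumed offset-64 encoding
        if q < 0:
            raise ValueError(f"character {ch!r} is below the offset-64 range")
        out.append(chr(q + 33))
    return "".join(out)


def convert_fastq(lines):
    """Yield FASTQ lines with every fourth line (the qualities) re-encoded.

    Assumes each record is exactly four lines: header, sequence, '+', qualities.
    """
    for i, line in enumerate(lines):
        line = line.rstrip("\n")
        yield phred64_to_phred33(line) if i % 4 == 3 else line


if __name__ == "__main__":
    record = [
        "@HWUSI-EAS454:5:1:0:149",
        "ATTTCTCCACCTCCTCNCCCACCCCTTTTTTTTCCTTACTTCTTACTAAT",
        "+HWUSI-EAS454:5:1:0:149",
        "abaabaaaaab`baa]D\\a]D]bbaaaZa][``aa_a]_ba_aa_aa``_",
    ]
    for out_line in convert_fastq(record):
        print(out_line)
```

The sequence and header lines pass through unchanged, so the output stays a valid FASTQ file.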

          Originally posted by valeu
          localalign reported an error (parameters "-l 3000 -L 3"):
          thread:1 [0]Assertion failed: 0 <= endRowStepOne && 0 <= endColStepOne, file AlignNTSpace.c, line 192
          [...]
          And also "-f" option, is it for mirroring or for fasta file?
          What version are you using (hopefully bfast.0.6.1c)? The "-f" option is for the fasta filename, the "-F" option is for mirroring (notice the case!).

          Originally posted by valeu
          -O 2 does not give me the information about chromosome
          -O 3 does not give the information about strand...
          And also it would be great to have the information about positions of mismatches and indels!
          The SAM format has information about strand. See the SAM spec. As for positions of mismatches and indels, use a variant caller or compare the alignment to the reference. It is implicit in the SAM format.
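
To make "implicit in the SAM format" concrete, here is a minimal sketch of reading the strand from the FLAG field (bit 0x10 marks a reverse-strand alignment) and walking the CIGAR string against the reference to list mismatches and indels. The function names and the toy reference are hypothetical, and the code assumes the whole reference sequence is available as a string whose first character is position 1. It is a sketch, not a replacement for a variant caller.

```python
import re

# One CIGAR element: a length followed by an operation letter.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def strand(flag: int) -> str:
    """Return '-' if SAM FLAG bit 0x10 (reverse strand) is set, else '+'."""
    return "-" if flag & 0x10 else "+"


def differences(pos: int, cigar: str, seq: str, ref: str):
    """List (kind, 1-based reference position, detail) for one alignment.

    pos is the SAM POS field (1-based leftmost position). SEQ in SAM is
    already given on the forward strand, so no reverse complement is needed.
    """
    events = []
    r = pos - 1  # 0-based index of the next reference base
    q = 0        # 0-based index of the next read base
    for n, op in CIGAR_RE.findall(cigar):
        n = int(n)
        if op in "M=X":  # aligned bases: compare read and reference
            for k in range(n):
                if seq[q + k].upper() != ref[r + k].upper():
                    events.append(("mismatch", r + k + 1, f"{ref[r + k]}>{seq[q + k]}"))
            q += n
            r += n
        elif op == "I":  # bases in the read only; reported after position r
            events.append(("insertion", r, seq[q:q + n]))
            q += n
        elif op == "D":  # bases in the reference only
            events.append(("deletion", r + 1, ref[r:r + n]))
            r += n
        elif op == "N":  # skipped reference region
            r += n
        elif op == "S":  # soft clip: read bases present but not aligned
            q += n
        # H and P consume neither read nor reference bases
    return events


if __name__ == "__main__":
    toy_ref = "ACGTACGTAC"  # made-up reference for illustration
    print(strand(16))
    print(differences(2, "3M1I2M1D2M", "CGATACTA", toy_ref))
```

For an insertion, the reported position is the reference base immediately before the inserted bases; for a deletion, it is the first deleted reference base.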

          For more help, consider the BFAST help mailing list ([email protected]).

