  • javijevi
    Member
    • Jan 2010
    • 38

    BFAST error in FindMatchesInIndexSet function

    Hi all,

    I got through the first steps of the BFAST pipeline, including index creation, but hit the error copied below when running the 'bfast match' step with the following command on a test FASTQ file with 9 reads:

    bfast match -f reference_genome.fa -A 1 -r test.fastq -i 1 -I 2-10 1> matches.bmf 2> match.log &

    Contents of match.log:
    (...)
    Searching index file 1/1 (index #1, bin #1) complete...
    Found 4 matches.
    Found matches for 4 reads.
    Copying unmatched reads for secondary index search.
    Splitting unmatched reads into temp files.
    bfast: RunMatch.c:718: FindMatchesInIndexSet: Assertion `numReads == numWritten' failed.
    Splitting unmatched reads into temp files.
    bfast: RunMatch.c:718: FindMatchesInIndexSet: Assertion `numReads == numWritten' failed.

    Any idea?

    Thanks in advance.
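
For readers unfamiliar with this kind of failure: the assertion text suggests a bookkeeping check that the number of reads handed to the splitting step equals the number written out to temp files. The Python sketch below illustrates that kind of invariant only. It is not BFAST's code; `split_reads`, `check_split`, the round-robin scheme, and the file names are all hypothetical.

```python
import os
import tempfile


def split_reads(reads, num_files, out_dir):
    """Distribute reads round-robin over num_files temp files.

    Returns (paths, num_written). Hypothetical helper, not part of BFAST.
    """
    paths = [os.path.join(out_dir, f"unmatched_{i}.txt") for i in range(num_files)]
    handles = [open(p, "w") for p in paths]
    num_written = 0
    try:
        for n, read in enumerate(reads):
            handles[n % num_files].write(read + "\n")
            num_written += 1
    finally:
        for h in handles:
            h.close()
    return paths, num_written


def check_split(reads, num_files):
    """Split reads, then verify every read was written exactly once."""
    with tempfile.TemporaryDirectory() as out_dir:
        paths, num_written = split_reads(reads, num_files, out_dir)
        # Analogue of the `numReads == numWritten` check seen in the log.
        assert len(reads) == num_written, "read count mismatch after split"
        # Re-read the files to confirm nothing was lost on disk.
        on_disk = 0
        for p in paths:
            with open(p) as fh:
                on_disk += sum(1 for _ in fh)
    return num_written, on_disk
```

If the counters mean what their names suggest, the log above (9 reads, matches found for 4) would leave 5 unmatched reads to carry over to the secondary search.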
  • javijevi
    Member
    • Jan 2010
    • 38

    #2
    Originally posted by javijevi
    Splitting unmatched reads into temp files.
    bfast: RunMatch.c:718: FindMatchesInIndexSet: Assertion `numReads == numWritten' failed.
    Splitting unmatched reads into temp files.
    bfast: RunMatch.c:718: FindMatchesInIndexSet: Assertion `numReads == numWritten' failed.
    Just to note: I mistakenly pasted the last two lines of the output twice.

    • nilshomer
      Nils Homer
      • Nov 2008
      • 1283

      #3
      Any reason why you want to use secondary indexes? I would recommend using all the indexes in the primary search (no secondary indexes).

      This may be a bug (with the secondary search). Please submit your report to [email protected] so we can resolve the issue quickly.

      • nilshomer
        Nils Homer
        • Nov 2008
        • 1283

        #4
        I have found the bug and fixed it in the latest source code, available via Git. Let me know if you have any problems :)

        • javijevi
          Member
          • Jan 2010
          • 38

          #5
          Originally posted by nilshomer
          Any reason why you want to use secondary indexes? I would recommend using all the indexes in the primary search (no secondary indexes).
          The BFAST book says: 'If you wish to have a secondary set of indexes, which are used if no matches are found in the main set of indexes, use the -I option'. So I thought it would be more efficient not to use a mismatch-allowing index, e.g., 1110111110011111, for reads that were already mapped using an all-matches index, that is, 11111111111111.

          Obviously, I missed something important here, since the index-based search algorithm is complex for a biologist, so I will blindly follow your recommendation not to use secondary indexes.
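
To make the two masks concrete: in a spaced-seed mask, a 1 means the base at that position is part of the lookup key and a 0 means it is ignored, so a mismatch under a 0 does not prevent a hit. The Python sketch below illustrates that idea only. `masked_key` and `seeds_match` are hypothetical names, the sequences are made up, and BFAST's actual index layout and lookup are more involved.

```python
def masked_key(seq, mask):
    """Keep only the bases at positions where the mask has a '1'."""
    if len(seq) != len(mask):
        raise ValueError("sequence and mask must have the same length")
    return "".join(base for base, bit in zip(seq, mask) if bit == "1")


def seeds_match(read_window, ref_window, mask):
    """True if the two windows agree at every '1' position of the mask."""
    return masked_key(read_window, mask) == masked_key(ref_window, mask)


spaced = "1110111110011111"      # mask quoted in the post (16 positions)
contiguous = "1" * len(spaced)   # all-ones mask of the same width, for comparison

ref = "ACGTACGTACGTACGT"
# Same as ref except at index 3 (T -> A), which sits under a '0' of the spaced mask.
read = "ACGAACGTACGTACGT"

print(seeds_match(read, ref, spaced))      # the mismatch is ignored by the mask
print(seeds_match(read, ref, contiguous))  # the mismatch breaks the exact match
```

A read that matches the reference exactly is found by both masks, which is the redundancy the post is asking about.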

          • nilshomer
            Nils Homer
            • Nov 2008
            • 1283

            #6
            I have spent a lot of time thinking about the indexing strategy and I would follow the strategy found in section 7.1 where we use 10 "main" indexes and no secondary indexes.

            I apologize for the confusion; I tried to keep the options flexible.
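
The difference between the two setups discussed in this thread can be sketched as follows, going by the BFAST book sentence quoted earlier: secondary indexes are consulted only for reads with no matches in the main set. This is a toy Python model, not BFAST's implementation. `search`, the dict-based indexes, and the read names and positions are hypothetical.

```python
def search(reads, main_indexes, secondary_indexes=()):
    """Return {read: sorted hit positions}.

    Each index is modelled as a dict mapping a read to a list of positions.
    Secondary indexes are tried only for reads with no hit in any main index.
    """
    results = {}
    for read in reads:
        hits = set()
        for index in main_indexes:
            hits.update(index.get(read, []))
        if not hits:
            for index in secondary_indexes:
                hits.update(index.get(read, []))
        results[read] = sorted(hits)
    return results


exact = {"read1": [100]}                       # stands in for an all-ones index
spaced = {"read1": [100, 250], "read2": [40]}  # stands in for a mismatch-tolerant index
reads = ["read1", "read2", "read3"]

tiered = search(reads, main_indexes=[exact], secondary_indexes=[spaced])
all_main = search(reads, main_indexes=[exact, spaced])

print(tiered)
print(all_main)
```

In the tiered run, read1 never reaches the second index, so its extra candidate location is not reported; with every index in the main set, all candidate locations are collected. That is one plausible reason to prefer the all-main setup, though the thread does not spell out the rationale.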
