  • BFAST error in FindMatchesInIndexSet function

    Hi all,

    I successfully went along the first steps of BFAST pipeline, including the indexes creation, but got the below copied error when running 'bfast match' step with the following command for a fastq test file with 9 reads:

    bfast match -f reference_genome.fa -A 1 -r test.fastq -i 1 -I 2-10 1> matches.bmf 2> match.log &

    Contents of match.log:
    (...)
    Searching index file 1/1 (index #1, bin #1) complete...
    Found 4 matches.
    Found matches for 4 reads.
    Copying unmatched reads for secondary index search.
    Splitting unmatched reads into temp files.
    bfast: RunMatch.c:718: FindMatchesInIndexSet: Assertion `numReads == numWritten' failed.
    Splitting unmatched reads into temp files.
    bfast: RunMatch.c:718: FindMatchesInIndexSet: Assertion `numReads == numWritten' failed.

    Any idea?

    Thanks in advance.
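
    Since the failed assertion compares two read counts, one generic check before suspecting the tool is to confirm that the input file really contains the expected number of well-formed records. The following is a minimal sketch, not part of BFAST; it assumes the simple four-lines-per-record FASTQ layout, and the function name `count_fastq_records` and the sample data are hypothetical.

```python
import io


def count_fastq_records(handle) -> int:
    """Count records in a FASTQ stream, assuming exactly 4 lines per record.

    Raises ValueError on obvious formatting problems. This is a generic
    sanity check and does not reproduce any BFAST internals.
    """
    lines = [line.rstrip("\r\n") for line in handle]
    # Ignore trailing blank lines at the end of the file.
    while lines and lines[-1] == "":
        lines.pop()
    if len(lines) % 4 != 0:
        raise ValueError(f"{len(lines)} lines is not a multiple of 4")
    for i in range(0, len(lines), 4):
        header, seq, plus, qual = lines[i:i + 4]
        if not header.startswith("@"):
            raise ValueError(f"line {i + 1}: header does not start with '@'")
        if not plus.startswith("+"):
            raise ValueError(f"line {i + 3}: separator does not start with '+'")
        if len(seq) != len(qual):
            raise ValueError(f"line {i + 2}: sequence/quality length mismatch")
    return len(lines) // 4


# Hypothetical two-read example; for a real file, pass open("test.fastq").
sample = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nGGCA\n+\nIIII\n")
print(count_fastq_records(sample))
```

    If the count printed for the real file matches the expected number of reads, the input format is unlikely to be the cause.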

  • #2
    Originally posted by javijevi
    Splitting unmatched reads into temp files.
    bfast: RunMatch.c:718: FindMatchesInIndexSet: Assertion `numReads == numWritten' failed.
    Splitting unmatched reads into temp files.
    bfast: RunMatch.c:718: FindMatchesInIndexSet: Assertion `numReads == numWritten' failed.
    Just to note: I mistakenly copied the last two lines of the output twice.

    • #3
      Any reason why you want to use secondary indexes? I would recommend using all the indexes in the primary search (no secondary indexes).

      This may be a bug (with the secondary search). Please submit your report to [email protected] so we can resolve the issue quickly.

      • #4
        I have found the bug and fixed it in the latest source code, available via Git. Let me know if you have any problems :)

        • #5
          Originally posted by nilshomer
          Any reason why you want to use secondary indexes? I would recommend using all the indexes in the primary search (no secondary indexes).
          The BFAST book says: 'If you wish to have a secondary set of indexes, which are used if no matches are found in the main set of indexes, use the -I option'. So I thought it would be more efficient not to use a mismatch-tolerant index, e.g., 1110111110011111, for reads that had already been mapped with an all-match index, i.e., 11111111111111.

          Obviously, as a biologist I missed something important here because of the complexity of the index-based search algorithm, so I will simply follow your recommendation not to use secondary indexes.
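
          To illustrate the distinction between a mismatch-tolerant mask and an all-match mask, here is a minimal sketch of how a spaced mask ignores the bases under its 0 positions. It is an illustration of the general idea only, not BFAST's implementation; the helper names, the example sequences, and the 16-wide all-ones mask used for comparison are assumptions.

```python
def seed_key(seq: str, mask: str) -> str:
    """Return the bases of `seq` that fall under the 1-positions of `mask`."""
    if len(seq) < len(mask):
        raise ValueError("sequence is shorter than the mask")
    return "".join(base for base, bit in zip(seq, mask) if bit == "1")


def seeds_match(read: str, ref: str, mask: str) -> bool:
    """True if read and reference agree at every 1-position of the mask."""
    return seed_key(read, mask) == seed_key(ref, mask)


SPACED_MASK = "1110111110011111"  # mask quoted in the post (0s at offsets 3, 9, 10)
SOLID_MASK = "1" * 16             # hypothetical all-ones mask of the same width

ref = "ACGTACGTACGTACGT"          # made-up reference window
read_mm3 = "ACGAACGTACGTACGT"     # differs from ref only at offset 3
read_mm0 = "TCGTACGTACGTACGT"     # differs from ref only at offset 0

print(seeds_match(read_mm3, ref, SPACED_MASK))  # True: mismatch sits under a 0
print(seeds_match(read_mm3, ref, SOLID_MASK))   # False: every base must agree
print(seeds_match(read_mm0, ref, SPACED_MASK))  # False: mismatch sits under a 1
```

          A read with a mismatch under a 1 position is missed by that particular mask, which is one reason to search several different masks together rather than reserving some as a fallback.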

          • #6
            I have spent a lot of time thinking about the indexing strategy, and I would follow the one described in section 7.1, where we use 10 "main" indexes and no secondary indexes.

            I apologize for the confusion; I kept the secondary-index option for flexibility.
