  • Warning: Encountered reference sequence with only gaps

    Does anyone know what this output in TopHat means? How much of a problem is it?

  • #2
    I am curious too...

    I am also curious about what this means and whether it should be a concern. I encountered it on my latest runs, in which I am using the Ensembl version of the human genome assembly. An example of the warning displayed is:

    [Sun Aug 08 23:20:23 2010] Beginning TopHat run (v1.0.14)
    -----------------------------------------------
    [Sun Aug 08 23:20:23 2010] Preparing output location brain/
    [Sun Aug 08 23:20:23 2010] Checking for Bowtie index files
    [Sun Aug 08 23:20:23 2010] Checking for reference FASTA file
    [Sun Aug 08 23:20:23 2010] Checking for Bowtie
    Bowtie version: 0.12.5.0
    [Sun Aug 08 23:20:23 2010] Checking reads
    seed length: 36bp
    format: fastq
    quality scale: phred33 (default)
    [Sun Aug 08 23:23:07 2010] Mapping reads against Homo_sapiens.GRCh37.59 with Bowtie
    [Sun Aug 08 23:26:43 2010] Joining segment hits
    [Sun Aug 08 23:27:09 2010] Mapping reads against Homo_sapiens.GRCh37.59 with Bowtie
    [Sun Aug 08 23:31:02 2010] Joining segment hits
    [Sun Aug 08 23:31:26 2010] Searching for junctions via segment mapping
    [Sun Aug 08 23:46:58 2010] Retrieving sequences for splices
    [Sun Aug 08 23:52:48 2010] Indexing splices
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    [Sun Aug 08 23:54:58 2010] Mapping reads against segment_juncs with Bowtie
    [Sun Aug 08 23:58:35 2010] Joining segment hits
    [Sun Aug 08 23:59:00 2010] Mapping reads against segment_juncs with Bowtie
    [Mon Aug 09 00:02:49 2010] Joining segment hits
    [Mon Aug 09 00:03:13 2010] Reporting output tracks
    -----------------------------------------------
    Run complete [00:45:40 elapsed]
    Should we be concerned?

    -steve


    • #3
      Did you find the answer to this warning message? I have a similar problem with my data as well and would appreciate it if you could let me know how to deal with it.


      • #4
        I never did find an answer to this. I didn't see any problems associated with it either, though.


        • #5
          -------bump-------

          I also get this using the Ensembl human reference (v. 68).


          • #6
            Is it because you are using a repeat masked genome and possibly some of the sequences have been completely masked? I've seen this in other, less complete genomes, so I'm unsure if this would be the case with the human genome.


            • #7
              Thanks Wallysb01, that's probably it. I am using a soft-masked reference (low-complexity regions are lowercase).


              • #8
                I also get the same warning message when I use the Homo_sapiens.GRCh37.71.dna.toplevel reference.
                Does this upper/lower case issue influence the mapping/counting outcome?


                • #9
                  I also get the same warning message with the Homo_sapiens.GRCh37.71.dna.toplevel reference. Can anyone explain it?


                  • #10
                    Originally posted by digitonin
                    I never did find an answer to this. I didn't see any problems associated with it either, though.
                    -----------BUMP-----------

                    First of all, sorry for bumping such an old post, but this problem still persists...
                    Has anyone found out why this happens?
                    I got this same warning message when using the current Drosophila melanogaster reference (r6.03; BDGP5.78).


                    • #11
                      Hugo,

                      This probably has to do with the lowercase or the use of N's in the reference. Either way, it is not a problem.


                      • #12
                        Hi,
                        I am using bowtie 1.1.2 and I get the same message -
                        encountered reference sequence with gaps
                        My genome is not repmasked and has the following type of header
                        >ta_IWGSC_CSSassembly_1as_v2_44039

                        Does such a warning interfere with bowtie-build in any way?


                        • #13
                          This will happen whenever you have cases like this
                          Code:
                          >chr1
                          >chr2
                          ACAGCTACT
                          or this

                          Code:
                          >chr3
                          NNNNNNN
                          >chr4
                          ACGTAGCTGACT
                          It's just a warning, which means that there won't be an issue with index creation, but you should really fix your FASTA files.
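
To track down which records trigger the warning, something like the following rough sketch may help. It lists FASTA records that contain no A/C/G/T bases at all, which covers both cases above (a header with no sequence, and a sequence made only of N's). Treating every character other than A/C/G/T as a gap is an assumption on my part about how the indexer handles ambiguous characters; case is ignored here, so soft-masked (lowercase) sequence is not reported. The function names are made up for this example.

```python
import io


def parse_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def gap_only_records(handle):
    """Return headers of records with no A/C/G/T bases (assumed gap-only)."""
    return [
        header
        for header, seq in parse_fasta(handle)
        if not any(base in "ACGT" for base in seq.upper())
    ]


# The two cases from the post above, combined into one small FASTA.
example = io.StringIO(
    ">chr1\n>chr2\nACAGCTACT\n>chr3\nNNNNNNN\n>chr4\nACGTAGCTGACT\n"
)
print(gap_only_records(example))  # ['chr1', 'chr3']
```

For a real reference, pass an open file handle instead of the in-memory example, e.g. `with open("genome.fa") as fh: print(gap_only_records(fh))`, where `genome.fa` is a placeholder path.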
