  • [bam_parse_region] fail to determine the sequence name

    I'm trying to run a differential expression analysis using edgeR and/or DESeq2 on the Ratsch Lab Galaxy server, but I keep getting this error:



    [bam_parse_region] fail to determine the sequence name


    I mapped the groomed and filtered FASTQ files with TopHat2 against the mm10 reference, then ran the differential expression analysis using the TopHat2 BAM files and the UCSC genes.gtf file.



    Anyone know what could be the problem?

  • #2
    At what point did it give you that error? There's actually no reason that that function within samtools should ever be used for computing differential expression (at least none that I can think of).



    • #3
      From the edgeR log (and the DESeq2 log too), I can see that it goes through three steps:

      % 1. Data preparation %

      % 2. Read counting %

      % 3. Differential testing %

      and then it gives an error at step 3.

      The [bam_parse_region] error appears in the log as follows:

      [bam_parse_region] fail to determine the sequence name.
      Invalid region chrY_random:54420149-54423069
      R script execution failed

      It also says the following at the end of the log file:

      Error in `row.names<-.data.frame`(`*tmp*`, value = value) :
      duplicate 'row.names' are not allowed
      Calls: rownames<- -> row.names<- -> row.names<-.data.frame
      In addition: Warning message:
      non-unique values when setting 'row.names': ‘4933409K07Rik’, ‘A430089I19Rik’, ‘AU018829’, ‘AY761185’, ‘BC002163’, ‘BC061212’, ‘Ccl21b’, ‘Ccl21c’, ‘Cngb1’, ‘E330014E10Rik’, ‘Gm10591’, ‘Gm13298’, ‘Gm13304’, ‘Gm13308’, ‘Gm15056’, ‘Gm15085’, ‘Gm15093’, ‘Gm16367’, ‘Gm1993’, ‘Gm3286’, ‘Gm5506’, ‘Gm5512’, ‘Gm5643’, ‘Gm5801’, ‘Gm6040’, ‘Gm6367’, ‘Mir138-2’, ‘Mir1906-1’, ‘Mir1906-2’, ‘Mir684-1’, ‘Obox2’, ‘Ott’, ‘Snord58b’, ‘Sult1c1’, ‘Tagap’, ‘Tff1’, ‘Tsnax’, ‘Ube1y1’, ‘Vmn1r186’, ‘Vmn1r187’, ‘Vmn1r62’, ‘Vmn1r63’
      Execution halted

      It's driving me crazy.

      Thanks for your help!

      Originally posted by dpryan
      At what point did it give you that error? There's actually no reason that that function within samtools should ever be used for computing differential expression (at least none that I can think of).



      • #4
        Did you write that Galaxy pipeline, or did someone else?



        • #5
          Someone else. I use it on https://galaxy.cbio.mskcc.org/ under Differential/Quantitative Analysis.

          I used TopHat2 on the same server, downloaded the iGenomes UCSC genes.gtf, and ran DESeq2 and edgeR with the generated BAM files and the downloaded GTF.



          • #6
            Short answer: Whoever wrote that pipeline didn't know what they were doing.

            Longer answer: It looks like you aligned to a genome that lacked chrY_random, whereas the annotation file includes it. That's what leads to the "[bam_parse_region]" error. Having said that, the error shouldn't occur in the first place; I expect it does only because this pipeline is performing the counting incorrectly. That is also what's leading to the error at the end, when it tries to create a data frame. Basically, don't use that pipeline.

            You have three options for moving forward:

            (1) Use a different Galaxy pipeline. You already have the BAM files, so this is probably doable.
            (2) Download those BAM files and do things correctly locally (this would require you to know how to analyze your dataset).
            (3) Collaborate with a local bioinformatician. This is the best idea, since if you're using Galaxy you're probably new to this. Galaxy is convenient, but if you're new to things then it's a black box that just spits out results that may or may not be correct (and you'd likely have no way of knowing which).
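
If you want to confirm the mismatch described above, one rough way is to compare the sequence names in the BAM header with those in the first column of the GTF. The sketch below is only an illustration, not what the Galaxy pipeline does: the function names are made up, and the header and GTF snippets are toy data (the gene IDs in particular are placeholders), with only the chrY_random region taken from the error message in this thread.

```python
def bam_header_sequences(header_text):
    """Collect sequence names from the SN: field of @SQ header lines."""
    names = set()
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            for field in line.split("\t")[1:]:
                if field.startswith("SN:"):
                    names.add(field[3:])
    return names


def gtf_sequences(gtf_text):
    """Collect sequence names from the first column of GTF records."""
    names = set()
    for line in gtf_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        names.add(line.split("\t", 1)[0])
    return names


def missing_from_alignment(header_text, gtf_text):
    """Sequence names used by the annotation but absent from the BAM header."""
    return sorted(gtf_sequences(gtf_text) - bam_header_sequences(header_text))


# Toy inputs for illustration only (hypothetical, not the poster's files).
header = (
    "@HD\tVN:1.0\tSO:coordinate\n"
    "@SQ\tSN:chr1\tLN:195471971\n"
    "@SQ\tSN:chrY\tLN:91744698\n"
)
gtf = (
    'chr1\tunknown\texon\t3214482\t3216968\t.\t-\t.\tgene_id "GeneA";\n'
    'chrY_random\tunknown\texon\t54420149\t54423069\t.\t+\t.\tgene_id "GeneB";\n'
)

print(missing_from_alignment(header, gtf))
```

For real files, the header text can be obtained with `samtools view -H file.bam`, and the GTF can be read from disk. Any name this reports is one the annotation refers to but the alignment does not know about, which is the situation that would produce an "Invalid region" message for that sequence.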



            • #7
              Thank you! That makes sense.

              Yeah, I really have no experience with this kind of analysis, so I agree my best bet is to work with one of our bioinformaticians and get it sorted.
