  • TransAbyss-Analyze Error

    I checked the paired reads for unpaired reads with two different scripts before running TransAbyss-Analyze, but the run still aborts with the error "Paired-end accessions FCC6B7TACXX:7:1101:10524:57301# and FCC6B7TACXX:7:1101:9245:1994# do not match".

    Command:
    transabyss-analyze -a TrAb_Merged_1_1.fa -1 RG11_NR_P3_R1.fq.gz -2 RG11_NR_P3_R2.fq.gz --SS --ref Gh1 --cfg /work/satishg/transcriptome.cfg --annodir /work/satishg/Gh1.gtf --analyze fusion -o /work/satishg/TrAn_RG11_Gh1 -t 20

    Error Message:
    /work/satishg/TrAn_RG11_Gh1/reads_to_genome/cluster/transabyss.local.sh: line 3: ulimit: core file size: cannot modify limit: Operation not permitted
    GSNAP version 2014-12-29 called with args: gsnap --gunzip -d Gh1 -D /work/satishg -t 18 --format sam -N 1 -m 10 /work/satishg/RG11_NR_P3_R1.fq.gz /work/satishg/RG11_NR_P3_R2.fq.gz
    Checking compiler assumptions for popcnt: 6B8B4567 __builtin_clz=1 __builtin_ctz=0 _mm_popcnt_u32=17 __builtin_popcount=17
    Checking compiler assumptions for SSE2: 6B8B4567 327B23C6 xor=59F066A1
    Checking compiler assumptions for SSE4.1: -103 -58 max=-58 => compiler sign extends
    Finished checking compiler assumptions
    Novel splicing (-N) turned on => assume reads are RNA-Seq
    Paired-end accessions FCC6B7TACXX:7:1101:10524:57301# and FCC6B7TACXX:7:1101:9245:1994# do not match
    real 0m0.623s
    user 0m0.009s
    sys 0m0.056s
    ERROR: Execution of script ended with a non-zero exit-status.

    I tried running it with three different read pairs, but all terminate with the same error.

  • #2
    Have you done something to the original raw data (e.g. trimming) that could potentially have broken the read pairing?

    Have you checked whether the reads the program is complaining about are present and intact?

    Code:
    $ zgrep -A 3 7:1101:10524:57301 your files(R1/R2)
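A complementary check is to walk both files in lockstep and report the first record whose IDs differ, which is roughly what the "do not match" message describes. The following is a minimal sketch, not GSNAP's actual logic: the function names `read_ids` and `first_mismatch` are hypothetical, and it assumes plain 4-line FASTQ records (no wrapped sequences) with `/1` and `/2` suffixes as in this thread.

```python
import gzip
import itertools


def read_ids(path):
    """Yield the read ID of each 4-line FASTQ record, without '@' or the /1, /2 suffix."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 0:  # header line of each record
                name = line.rstrip("\n")[1:].partition(" ")[0]
                if name.endswith(("/1", "/2")):
                    name = name[:-2]
                yield name


def first_mismatch(r1_path, r2_path):
    """Return (record_number, id1, id2) for the first out-of-sync record, or None."""
    pairs = itertools.zip_longest(read_ids(r1_path), read_ids(r2_path))
    for record_number, (id1, id2) in enumerate(pairs, start=1):
        if id1 != id2:  # also catches one file ending before the other
            return record_number, id1, id2
    return None
```

Calling `first_mismatch("RG11_NR_P3_R1.fq.gz", "RG11_NR_P3_R2.fq.gz")` would return `None` only if every record lines up with its mate.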


    • #3
      They show up:

      $ zgrep -A 3 7:1101:10524:57301 RG11_NR_P3_R1.fq.gz
      @FCC6B7TACXX:7:1101:10524:57301#/1
      CCTCATGGATACCAAGCTTGAGGTTCTTTGAGAATGCCTCATAAAACTTGTTGTAATCTTCCTTGTTCTCTGCTATTTCAAAGAAGAG
      +
      giiiiihiiihhiihiihiihiideghiiihihiiifghighhhhhhiihhiafhiihhhhiibgggggedgeeeeeebdddddb`bc

      $ zgrep -A 3 7:1101:9245:1994 RG11_NR_P3_R2.fq.gz
      @FCC6B7TACXX:7:1101:9245:1994#/2
      ACAAGACTCGGCCGCTTAAAAAAACCAGGGTGAAAGCCATGCCTTTCGTTAAAGCTCAAAAGACCAAGGCTTATTTCAAGAGATATCA
      +
      gihiiiiiiiiiiiiiiiiiiiiiiiiiiibggggeeeaedcddddccccccccccccbcccccccccccccccddddccbccccdcd

      The reads were trimmed and cleaned for rRNA sequences.


      • #4
        Those are not reads from the same fragment (unless you are just showing examples from separate R1 and R2 files).

        Does this show a corresponding read from the R2 file?

        Code:
        $ zgrep -A 3 7:1101:10524:57301 RG11_NR_P3_R2.fq.gz


        • #5
          It does not return anything. What is that supposed to mean?


          • #6
            That means the corresponding read was eliminated from the R2 file during trimming, leaving you with an unmatched read in R1. That is why transabyss is now complaining.

            Use repair.sh from BBMap to remove reads that are singletons.

            Code:
            $ repair.sh in1=r1.fq in2=r2.fq out1=fixed1.fq out2=fixed2.fq outsingle=singletons.fq

            In the future, you may want to use a trimming program that is paired-end aware (or trim the R1/R2 files together) so that the read pairing stays intact in the R1/R2 files.
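To illustrate what re-pairing involves, here is a minimal in-memory sketch. It is not BBMap's implementation: `read_fastq` and `repair_pairs` are hypothetical names, the whole R2 file is held in a dictionary (so it only suits small files), and it assumes 4-line FASTQ records with `/1` and `/2` suffixes.

```python
import gzip


def read_fastq(path):
    """Yield (read_id, record_lines) for each 4-line FASTQ record."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        while True:
            lines = [handle.readline() for _ in range(4)]
            if not lines[0]:  # end of file
                break
            record = [line.rstrip("\n") + "\n" for line in lines]
            name = record[0].rstrip("\n")[1:].partition(" ")[0]
            if name.endswith(("/1", "/2")):
                name = name[:-2]
            yield name, record


def repair_pairs(r1_path, r2_path, out1_path, out2_path, single_path):
    """Write reads found in both inputs to out1/out2 in R1 order; the rest go to single_path."""
    r2_records = dict(read_fastq(r2_path))  # assumption: R2 fits in memory
    paired = singletons = 0
    with open(out1_path, "w") as out1, open(out2_path, "w") as out2, \
            open(single_path, "w") as single:
        for name, record in read_fastq(r1_path):
            mate = r2_records.pop(name, None)
            if mate is None:
                single.writelines(record)  # R1 read whose mate is missing
                singletons += 1
            else:
                out1.writelines(record)
                out2.writelines(mate)
                paired += 1
        for record in r2_records.values():  # R2 reads never matched by R1
            single.writelines(record)
            singletons += 1
    return paired, singletons
```

The function returns the number of pairs written and the number of singletons set aside; for real data sets `repair.sh` remains the practical choice.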


            • #7
              Thanks! It seems to be clearing out unpaired reads. I used Trimmomatic to trim the reads and then ran TransAbyss-Analyze. When I got the unpaired reads error, I processed the paired reads again with a Python script to remove unpaired reads, but it didn't find any. I will run the BBMap output reads through TransAbyss-Analyze and get back if I face any other issues.


              • #8
                BBMap also contains bbduk.sh, which is a paired-end-aware trimming program. Find the thread for bbduk to get additional information.


                • #9
                  Dear GenoMax,

                  Recently I got the same error when I tried to map my RNA-seq data to the mm9 genome:

                  START time : Wed Jul 8 17:43:56 CEST 2015
                  GSNAP version 2014-12-23 called with args: /usr/local/gmap/gmap-2014-12-23/bin/gsnap -t 8 -N 1 -n 1 -A sam -D /data/DIV5/HumGen/WHY/NEO_genome_alignm$
                  Checking compiler assumptions for popcnt: 6B8B4567 __builtin_clz=1 __builtin_ctz=0 _mm_popcnt_u32=17 __builtin_popcount=17
                  Checking compiler assumptions for SSE2: 6B8B4567 327B23C6 xor=59F066A1
                  Checking compiler assumptions for SSE4.1: -103 -58 max=198 => compiler zero extends
                  Finished checking compiler assumptions
                  Novel splicing (-N) turned on => assume reads are RNA-Seq
                  Paired-end accessions FCC6N7WACXX:8:1101:1142:17629# and FCC6N7WACXX:8:1101:1206:2090# do not match

                  I tried to grep both IDs in both the R1 and R2 FASTQ files, and I get sequence records from both files.

                  For example:
                  grep -A 3 FCC6N7WACXX:8:1101:1206:2090 FCC6N7WACXX-MOUeoqEAAFRAAPEI-207_L8_1.fq

                  @FCC6N7WACXX:8:1101:1206:2090#/1
                  TCTCCTTCAACAACATCAAACTCCACAGTCTCTCCATCGCCTACACTGCGAAGGTACTTCCTGGGGTTATTCTTCTTTATGGCAGTCTGG
                  +
                  bbbeeeeegggggihihiiiiiihiiiigfhiiiiihiiiiiiiiiiihiiiii_egfhihihgggX[adddddd`bcccccccc[bccc

                  grep -A 3 FCC6N7WACXX:8:1101:1206:2090 FCC6N7WACXX-MOUeoqEAAFRAAPEI-207_L8_2.fq
                  @FCC6N7WACXX:8:1101:1206:2090#/2
                  TTGGGAACAGTCAAATGGTTCAATGTAAGGAACGGATACGGTTTCATCAACAGGAATGACACCAAGGAAGACGTATTTGTACACCAGACT
                  +
                  _bbeeeeeggggbefefhghhhhhifgfihdhiiihhiiiieghihhhigfgfghihihiiiiihiggggeeeccccdcbcceccccccc

                  Do you have any clue about this? Many thanks!


                  • #10
                    @whytcs: Can you try running repair.sh from BBMap on your files to see if there is a problem somewhere else?

                    Someone had previously reported this error with GSNAP but the context there may not be applicable in your case: http://seqanswers.com/forums/showthread.php?t=45718
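Finding an ID in both files with grep shows that the mate exists, but not that it sits at the same record position in R1 and R2, and the error message suggests the two files are compared record by record. A minimal sketch for checking the position follows; `record_index` is a hypothetical helper, and it assumes 4-line FASTQ records with `/1` and `/2` suffixes.

```python
import gzip


def record_index(path, read_id):
    """Return the 1-based record number whose header names read_id, or None if absent."""
    wanted = read_id.rstrip("#")  # accept the ID with or without the trailing '#'
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 0:  # header line of each record
                name = line.rstrip("\n")[1:].partition(" ")[0]
                if name.endswith(("/1", "/2")):
                    name = name[:-2]
                if name.rstrip("#") == wanted:
                    return line_number // 4 + 1
    return None
```

If `record_index` gives different numbers for the same ID in the `_1.fq` and `_2.fq` files, the files are out of sync even though both contain the read.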
