  • bmicro_mit1
    Junior Member
    • Feb 2013
    • 4

    Cufflinks 2.0.2 segmentation fault

    I am using 100 bp paired-end Illumina HiSeq data (about 50M reads) and trying to run TopHat/Cufflinks for human RNA-seq analysis, using the Ensembl v68 GTF together with the Gencode v13 lncRNA GTF annotations. The two GTF files were concatenated and used to run both TopHat 2.0.6 (with Bowtie2 2.0.6):

    tophat -p 4 --solexa1.3-quals --read-realign-edit-dist 0 --no-novel-juncs --library-type fr-unstranded -G $GTF -o $OUT $GENOME $FASTQ_1 $FASTQ_2

    and Cufflinks 2.0.2:
    cufflinks -o $OUT -p 4 -G $GTF -b $FASTA --multi-read-correct $OUT/accepted_hits.bam

    A segmentation fault keeps occurring, across multiple samples and at similar locations, while Cufflinks is re-estimating abundances with bias and multi-read correction. Below is the output:

    [09:41:21] Learning bias parameters.
    > Processed 635 loci. [*************************] 100%
    [09:45:58] Re-estimating abundances with bias and multi-read correction.
    > Processing Locus chr16:5289802-6826015 [******* ] 31%Segmentation fault (core dumped)

    Any input would be greatly appreciated.
  • GenoMax
    Senior Member
    • Feb 2008
    • 7142

    #2
    How much RAM do you have on this machine?


    • bmicro_mit1
      Junior Member
      • Feb 2013
      • 4

      #3
      This was run on a couple of different machines in an SGE cluster. Some of the nodes had up to 48 GB of RAM, but the SGE email reporting that the program had died never showed more than 5 GB of memory usage.


      • GenoMax
        Senior Member
        • Feb 2008
        • 7142

        #4
        Are there any other error messages (stderr output)?


        • bmicro_mit1
          Junior Member
          • Feb 2013
          • 4

          #5
          No, just the segmentation fault. I am running with verbose mode right now to see if there is more output.


          • GenoMax
            Senior Member
            • Feb 2008
            • 7142

            #6
            Are you using the -o and -e directives with your qsub (or SGE job submission script) to capture the output/stderr output? Contents of those would be useful as well.
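For anyone unsure what those directives look like, here is a minimal sketch of a job script that sends stdout and stderr to separate files. The helper name `write_cufflinks_job`, the job name, and the log file names are made up for illustration; the cufflinks line is the one from the first post, and OUT, GTF and FASTA are assumed to be set in the job environment. A multi-threaded run normally also needs a parallel environment request, which is site-specific and left out here.

```shell
# Hypothetical helper: prints an SGE job script that keeps stdout and stderr
# in separate files under the given log directory.
write_cufflinks_job() {
    job_log_dir=$1
    printf '#!/bin/bash\n'
    printf '#$ -N cufflinks_debug\n'                      # job name (arbitrary)
    printf '#$ -cwd\n'                                    # run from the submit directory
    printf '#$ -o %s/cufflinks.out\n' "$job_log_dir"      # stdout goes here
    printf '#$ -e %s/cufflinks.err\n' "$job_log_dir"      # stderr goes here
    cat <<'EOF'
# Same Cufflinks command as in the first post. OUT, GTF and FASTA must be set
# in the job environment (for example with qsub -v).
cufflinks -o "$OUT" -p 4 -G "$GTF" -b "$FASTA" --multi-read-correct "$OUT/accepted_hits.bam"
EOF
}
```

Something like `write_cufflinks_job logs > cufflinks_job.sh` followed by `qsub cufflinks_job.sh` should then leave any error text in logs/cufflinks.err (the logs directory has to exist beforehand).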


            • bmicro_mit1
              Junior Member
              • Feb 2013
              • 4

              #7
              Yes, nothing in the 'o' file, and only what I had put in the initial post from the 'e' file.


              • chrisjohn86
                Junior Member
                • Mar 2013
                • 2

                #8
                I have had a lot of segmentation faults with the Tuxedo tools over the last few weeks. I finally figured out it was due to bad RAM: I removed 8 GB of the 16 GB total and everything started working fine. Memoryxp is a good RAM diagnostic tool. This is one possible reason for seg faults. There are others:


                • Anelda
                  Member
                  • May 2010
                  • 30

                  #9
                  I have found that the chromosomes in the genome.fasta file and in the GFF/GTF file must be in exactly the same order; otherwise a segmentation fault occurs. This applies to Cufflinks specifically. To check:

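                  # Strip the leading ">" and any description so the names are comparable to GFF column 1
                  grep "^>" genome.fasta | sed 's/^>//' | cut -d " " -f 1 > fasta.order
                  grep -v "^#" genome.gff | cut -f 1 | uniq > gff.order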

                  diff fasta.order gff.order

                  If the chromosomes are not in the same order, you'll have to reorder one of the files. The easiest option might be to reorder the GFF/GTF; I'm not sure whether there are any scripts that can sort FASTA/GFF files. I just grep each chromosome from the GFF file into a separate file, then cat the individual chromosome.gff files in the correct order to create a new genome.gff.

                  Hope this helps someone!
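The grep-and-cat approach described above can be sketched as a single function. This is only a rough sketch: `reorder_gtf` is a hypothetical name, and it assumes the first word of each FASTA header is spelled exactly like column 1 of the annotation. Records on chromosomes that are absent from the FASTA are dropped, and the annotation is scanned once per chromosome, so it is slow but simple.

```shell
# reorder_gtf GENOME_FASTA ANNOTATION
# Prints the annotation with its chromosome blocks in FASTA header order.
reorder_gtf() {
    ref_fasta=$1
    annot=$2
    # Keep comment/header lines (starting with '#') at the top, if any.
    grep '^#' "$annot" || true
    # Walk the chromosome names in FASTA order and print each one's records.
    awk '/^>/ { print substr($1, 2) }' "$ref_fasta" |
    while read -r chrom; do
        # Appending "" forces a string comparison, so numeric names such as
        # "1" and "01" are never treated as equal.
        awk -F '\t' -v c="$chrom" '($1 "") == (c "")' "$annot"
    done
    return 0
}
```

For example, `reorder_gtf genome.fasta genome.gff > genome.reordered.gff` writes a new file and leaves the original untouched, so the diff check above can be rerun against the result.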


                  • sterding
                    Member
                    • Sep 2010
                    • 36

                    #10
                    Originally posted by Anelda
                    I have found that the chromosomes in the genome.fasta file and in the GFF/GTF file must be in exactly the same order; otherwise a segmentation fault occurs.
                    Thanks. I am testing this. I am also curious how you found this trick.


                    • sterding
                      Member
                      • Sep 2010
                      • 36

                      #11
                      Hi Anelda,

                      I put the genome.fa and GTF file in the same order, but I still got the "Segmentation fault" error in the "Learning bias parameters" step when I use the -b option. Without the -b option I don't get the error, so I think the bug is in the -b option. Hopefully the Cufflinks team will pay attention to the problem.


                      • biocomputer
                        Member
                        • Dec 2013
                        • 62

                        #12
                        Originally posted by sterding
                        Hi Anelda,

                        I put the genome.fa and GTF file in the same order, but I still got the "Segmentation fault" error in the "Learning bias parameters" step when I use the -b option. Without the -b option I don't get the error, so I think the bug is in the -b option. Hopefully the Cufflinks team will pay attention to the problem.
                        I'm using Cufflinks 2.2.1 and having the same problem with both cufflinks and cuffdiff: -b causes a segfault, and it works fine without it. I made sure the genome.fa and GTF files contain the same chromosomes in the same order, and there is lots of free memory available.
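Checking that both files contain the same chromosomes can be done along these lines. This is a minimal sketch under the assumption that the first word of each FASTA header should match column 1 of the GTF exactly; the function name `chroms_missing_from_fasta` is made up for illustration. It would also flag a naming-style mismatch, such as Ensembl-style `1` in one file against UCSC-style `chr1` in the other.

```shell
# chroms_missing_from_fasta GENOME_FASTA ANNOTATION_GTF
# Prints every chromosome name used in the GTF that has no FASTA record.
chroms_missing_from_fasta() {
    fa_names=$(mktemp)
    gtf_names=$(mktemp)
    # First word of each FASTA header, without the leading ">".
    awk '/^>/ { print substr($1, 2) }' "$1" | sort -u > "$fa_names"
    # Column 1 of every non-comment GTF line.
    grep -v '^#' "$2" | cut -f 1 | sort -u > "$gtf_names"
    # comm -13 keeps only the lines unique to the second (GTF) list.
    comm -13 "$fa_names" "$gtf_names"
    rm -f "$fa_names" "$gtf_names"
}
```

An empty result from `chroms_missing_from_fasta genome.fa genes.gtf` means every chromosome in the annotation has a matching sequence; it says nothing about their order, which still needs the diff check from post #9.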


                        • offspring
                          Member
                          • Mar 2013
                          • 32

                          #13
                          Please file an issue report at https://github.com/cole-trapnell-lab/cufflinks with a description of the problem and how to reproduce it; otherwise the Cufflinks team won't even be aware of it.


                          • biocomputer
                            Member
                            • Dec 2013
                            • 62

                            #14
                            I was able to solve my problem. Although "-b genome.fa" seemed to be the cause, the problem is actually in the .gtf file. See here how to modify the .gtf file:



                            Last edited by biocomputer; 01-21-2015, 12:28 PM.
