  • gffread segmentation fault

    Hello All,

    I don't know what is going wrong with the gffread utility from cufflinks.

    This is the command I have been using

    gffread -w /path/test.fa -g /path/hg19_ucsc.fa /path/test.gtf
    My GTF file is just 4MB with around 48K records. The output FASTA file reaches only around 42K records (about 480MB) before I see this error.

    /path: line 16: 28068 Segmentation fault (core dumped) gffread -w /path/test.fa -g /path/hg19_ucsc.fa /path/test.gtf
    I have another GTF file of size 12MB which is being used to generate a fasta file of size 3.7GB and then it generates the same segmentation error.

    Any help is appreciated.

  • #2
    Fragment your GTF file into five separate chunks and run each with gffread. It may help you understand the core of the fault.
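One caveat when chunking: splitting purely by line count can leave the exons of one transcript spread over two files. A minimal Python sketch that keeps each transcript together, assuming every record carries a `transcript_id "..."` attribute as in Cufflinks-style GTF (the helper name `split_gtf_by_transcript` is made up for illustration):

```python
import re
from collections import OrderedDict

# Matches the transcript_id attribute in the 9th GTF column.
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def split_gtf_by_transcript(lines, n_chunks):
    """Group GTF lines by transcript_id, then deal the groups into n_chunks lists.

    Lines without a transcript_id (an assumption: e.g. gene-level records)
    are kept together in one group of their own.
    """
    groups = OrderedDict()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        match = TRANSCRIPT_ID.search(line)
        key = match.group(1) if match else "__no_transcript__"
        groups.setdefault(key, []).append(line)

    chunks = [[] for _ in range(n_chunks)]
    for i, records in enumerate(groups.values()):
        chunks[i % n_chunks].extend(records)  # round-robin assignment
    return chunks
```

Each returned chunk can be written to its own file and passed to gffread with the same command as in the first post.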



    • #3
      In the third file, I am getting the segmentation fault error. Could you please help me understand what it means?



      • #4
        Originally posted by gokhulkrishnakilaru
        In the third file, I am getting the segmentation fault error. Could you please help me understand what it means?
        A program is represented as segments in a computer's memory. A segmentation fault means that the program tried to access an address that is not located in one of its segments.

        Usually, this means that there is a bug in the software.
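If the runs are driven from a script, a crash like this can be told apart from an ordinary error exit by looking at the exit status. A small Python sketch, relying on the documented behaviour that `subprocess` reports a negative `returncode` of `-N` when a child is killed by signal `N` on POSIX systems (the helper name `describe_exit` is only illustrative):

```python
import signal


def describe_exit(returncode):
    """Turn a subprocess-style return code into a short human-readable label."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        # Negative values mean the child was terminated by a signal (POSIX).
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "signal %d" % -returncode
        return "killed by " + name
    return "exited with status %d" % returncode
```

For example, `describe_exit(subprocess.run(cmd).returncode)` would give "killed by SIGSEGV" for a segmentation fault. A shell usually reports the same event as status 128 + 11 = 139.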



        • #5
          Originally posted by seb567
          A program is represented as segments in a computer's memory. A segmentation fault means that the program tried to access an address that is not located in one of its segments.

          Usually, this means that there is a bug in the software.
          I broke my large file into chunks of 10000 lines each which left me with 10 files. The program runs fine for 9 files and throws this error for the last file.

          Any thoughts?
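The manual splitting described here can be automated as a simple halving search. A rough Python sketch, where `fails` is a hypothetical callback you would supply yourself (for instance one that writes the records to a temporary GTF file, runs gffread on it, and returns True if the run crashed):

```python
def shrink_failing_records(records, fails):
    """Repeatedly halve a failing list of records, keeping a half that still fails.

    Stops when one record is left, or when neither half fails on its own
    (which suggests the crash needs records from both halves).
    """
    current = list(records)
    while len(current) > 1:
        mid = len(current) // 2
        first, second = current[:mid], current[mid:]
        if fails(first):
            current = first
        elif fails(second):
            current = second
        else:
            break  # cannot shrink further by simple halving
    return current
```

It is probably safer to pass whole transcripts (all lines sharing a transcript_id) as the records rather than single lines, so a transcript is never cut in half during the search.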



          • #6
            Originally posted by gokhulkrishnakilaru
            I broke my large file into chunks of 10000 lines each which left me with 10 files. The program runs fine for 9 files and throws this error for the last file.

            Any thoughts?
            Is there something unusual in the last file compared to the other nine?



            • #7
              Originally posted by seb567
              Is there something unusual in the last file compared to the other nine?
              I am trying to figure that out. I have now broken the last file into 1000-line chunks, and all of them give a segmentation fault. I cross-checked the end coordinates against the chrom sizes file and all of them seem to be within the limit.

              I don't know what's wrong with it.
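For anyone wanting to repeat this coordinate check, here is a minimal Python sketch. It assumes a two-column chrom sizes file (name, length, tab-separated) and standard 9-column GTF lines; the function names `parse_chrom_sizes` and `find_out_of_range` are made up for the example. It also flags sequence names that are missing from the genome, which is easy to overlook:

```python
def parse_chrom_sizes(lines):
    """Read 'name<TAB>length' lines into a dict of sequence name -> length."""
    sizes = {}
    for line in lines:
        if line.strip():
            name, length = line.split()[:2]
            sizes[name] = int(length)
    return sizes


def find_out_of_range(gtf_lines, chrom_sizes):
    """Return (line_number, message) pairs for suspicious GTF records."""
    problems = []
    for line_no, line in enumerate(gtf_lines, start=1):
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            problems.append((line_no, "fewer than 9 tab-separated fields"))
            continue
        seqname, start, end = fields[0], int(fields[3]), int(fields[4])
        if seqname not in chrom_sizes:
            problems.append((line_no, "sequence %r not in genome" % seqname))
        elif start < 1 or start > end or end > chrom_sizes[seqname]:
            problems.append((line_no, "bad coordinates %d-%d (sequence length %d)"
                             % (start, end, chrom_sizes[seqname])))
    return problems
```

An empty result only rules out these particular problems; it does not show that the file is otherwise fine.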



              • #8
                I have had the same problem with the filtering parameters of gffread (-J, -V -H etc). Can anyone suggest another program that does a similar thing to gffread? That is, filter transcripts based on CDS features and provide a multi-fasta format sequence file at the end?



                • #9
                  Hello all,

                  I had the same problem a while ago. I fragmented my file into 10 pieces and got errors in two of them. In both cases I broke the files into smaller ones and (after more splitting) traced the problem to a gene with an alternative splicing event. In my case the secondary transcript was bigger than the annotated gene sequence (the last exon coordinates were placed outside the gene).

                  I just erased the two features from my original file (I considered that it was not a great loss for my purposes) and it works fine now.

                  Hope it helps,

                  Pablo
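A check for the situation described above can be sketched in Python. This assumes the file contains explicit `gene` feature lines and that `exon` lines carry both `gene_id` and `transcript_id` attributes, which is not true of every GTF; the function name `transcripts_outside_gene` is hypothetical:

```python
import re

# Captures key "value" pairs from the GTF attribute column.
ATTR = re.compile(r'(\w+) "([^"]*)"')


def transcripts_outside_gene(gtf_lines):
    """List transcript_ids having at least one exon outside their gene's span."""
    gene_span = {}
    exons = []
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # malformed line, ignored in this sketch
        attrs = dict(ATTR.findall(fields[8]))
        start, end = int(fields[3]), int(fields[4])
        if fields[2] == "gene":
            gene_span[attrs.get("gene_id")] = (start, end)
        elif fields[2] == "exon":
            exons.append((attrs.get("gene_id"), attrs.get("transcript_id"), start, end))

    flagged = set()
    for gene_id, transcript_id, start, end in exons:
        span = gene_span.get(gene_id)
        if span is None or transcript_id is None:
            continue  # nothing to compare against
        if start < span[0] or end > span[1]:
            flagged.add(transcript_id)
    return sorted(flagged)
```

The flagged transcripts could then be inspected by hand, or dropped from the file as was done here.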



                  • #10
                    Hi,
                    Thanks for your reply, Pablo. I tried doing what you said and broke down my files, but I kept getting the segfault even when the file was only 100 lines long(!). I think maybe I have a lot of alternative splicing in my organism (a basidiomycete fungus). Does anyone know of a program like gffread that can handle alternative splicing? Or another way I could get around this problem?

                    Thanks very much

                    Will



                    • #11
                      I'm having the same problem as Will. I've done the tophat2-cufflinks-cuffmerge pipeline to generate a merged GTF file. However, my organism has fairly high gene density, so cufflinks is predicting very long transcripts, which are not correct. I wanted to filter the merged GTF file using gffread to discard any transcripts that have internal stops (either the -V or -J parameter). However, I keep getting the 'segmentation_fault' error. I have tried to break up the merged GTF file into smaller sizes (such as 1000 lines), however the segmentation error persists. Does anybody know a solution to this problem?
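Not a fix for the crash itself, but if the spliced coding sequences can be obtained some other way, the internal-stop test is simple to run separately. A minimal Python sketch, assuming the standard genetic code and a sequence that starts in frame at the first codon (`has_internal_stop` is just an illustrative name, not part of gffread):

```python
# Stop codons of the standard genetic code (an assumption; some organisms differ).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_internal_stop(cds):
    """Return True if any codon before the last complete one is a stop codon."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    return any(codon in STOP_CODONS for codon in codons[:-1])
```

Transcripts for which this returns True could then be removed from the GTF by their transcript_id.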
