  • featureCounts segmentation fault

    I have a lot of BAM files (from the CCLE database), and I tried counting them with featureCounts.
    Some of the files work fine, but for many(!) of them featureCounts throws a segmentation fault right after the 'Input files : 1 BAM file' line.
    I'm using the same annotation file every time, so I guess that's not the problem. I also tried decreasing the number of threads, which didn't change anything.
    Can someone help me? Has anyone had the same problem?
    My command is:
    featureCounts -p -T 5 -g gene_id -a /home/batel/GTF/combinedDB5.f1.pure.gtf -o fc.txt -b G30584.UM-UC-3.1.bam
    Thank you!

  • #2
    You could have somehow corrupted those BAM files.

    Have you tried to look at the problem BAM files (with samtools view) to see if there are any obvious problems? Do the file sizes look ok?
    Last edited by GenoMax; 02-17-2015, 06:54 AM.
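
A minimal sketch of the checks suggested in this post, assuming a reasonably recent samtools (one that provides the `quickcheck` subcommand) is on the PATH. The helper name `check_bam` is hypothetical and not part of samtools or featureCounts.

```shell
# Hypothetical helper (not from the thread): basic integrity checks for one BAM file.
check_bam() {
    bam="$1"
    # A missing or zero-byte file is the most obvious sign of a failed download.
    if [ ! -s "$bam" ]; then
        echo "missing or empty: $bam" >&2
        return 1
    fi
    # Print the size so it can be compared with files that count without problems.
    ls -l "$bam"
    # quickcheck verifies that the header is readable and the EOF block is present.
    samtools quickcheck "$bam" || { echo "quickcheck failed: $bam" >&2; return 1; }
    # Eyeball the first few alignments.
    samtools view "$bam" | head -n 5
}

# Example call, using the file name from the original post:
# check_bam G30584.UM-UC-3.1.bam
```

Passing these checks only rules out truncation and gross corruption; a file can pass them and still trigger a crash in a downstream tool.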


    • #3
      If what GenoMax suggested doesn't indicate where the problem is, then try htseq-count. If that runs into problems too then you know with certainty that you have issues with the BAM files. If it doesn't then maybe it's a featureCounts bug (in which case, try subsetting the file until you can find an alignment that causes the problem).
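
One way to subset the file, sketched under the assumption that samtools is installed: keep the full header plus the first N alignments, run featureCounts on the subset, then raise or lower N until the crash appears or disappears. The helper name `sam_head` and the output names `subset.bam` and `fc_subset.txt` are hypothetical.

```shell
# Hypothetical helper: keep all SAM header lines plus the first N alignments.
sam_head() {
    awk -v n="$1" '
        /^@/ { print; next }      # header lines always pass through
        ++seen > n { exit }       # stop once N alignments have been printed
        { print }
    '
}

# Example: test featureCounts on the first 100000 alignments only.
# samtools view -h G30584.UM-UC-3.1.bam | sam_head 100000 | samtools view -b -o subset.bam -
# featureCounts -p -T 5 -g gene_id -a /home/batel/GTF/combinedDB5.f1.pure.gtf -o fc_subset.txt subset.bam
```

Cutting at an arbitrary line can separate a read from its mate, which is acceptable when the goal is only to find the input that triggers the crash. If even a very small N still crashes, the header rather than any alignment is the likelier culprit.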


      • #4
        I tried cufflinks and it works great.

        That's why I don't think the problem is with the BAM files.
        Do you think it's the BAM file?


        • #5
          If cufflinks worked with the file then presumably it's a bug in featureCounts. It would be good if you could notify the author.


          • #6
            Originally posted by batel
            That's why I don't think the problem is with the BAM files.
            Do you think it's the BAM file?
            Hi @batel,

            You do not need to use the '-b' option when you provide BAM format input. featureCounts automatically detects the input format for you.

            Please make sure you are using the latest version (1.4.6-p1). '-b' was an option used in old versions.

            If the problem persists after upgrading to the latest version, please provide the complete featureCounts output. This will be helpful for diagnosing the problem.

            Wei


            • #7
              Hi all,
              Thanks for your answers!
              I'm using version v1.4.6. Is it the latest?
              I also tried running it through Rsubread instead of the command line, but it crashes as well.
              The command is:
              featureCounts -p -T 5 -g gene_id -a combinedDB5.lincs.f1.pure.gtf -o fc.txt G28059.KMBC-2.1.bam

              The full output is:
              ========== _____ _ _ ____ _____ ______ _____
              ===== / ____| | | | _ \| __ \| ____| /\ | __ \
              ===== | (___ | | | | |_) | |__) | |__ / \ | | | |
              ==== \___ \| | | | _ <| _ /| __| / /\ \ | | | |
              ==== ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
              ========== |_____/ \____/|____/|_| \_\______/_/ \_\_____/
              v1.4.6

              //========================== featureCounts setting ===========================\\
              || ||
              || Input files : 1 BAM file ||
              Segmentation fault (core dumped)

              When I used featureCounts via R, the command was:
              output <- featureCounts("data/BLCA/0aefd782-d636-4308-b940-63054b37e7b0/G28059.KMBC-2.1.bam", annot.ext="combinedDB5.lincs.f1.pure.gtf", isPairedEnd=TRUE, nthreads=3)

              ========== _____ _ _ ____ _____ ______ _____
              ===== / ____| | | | _ \| __ \| ____| /\ | __ \
              ===== | (___ | | | | |_) | |__) | |__ / \ | | | |
              ==== \___ \| | | | _ <| _ /| __| / /\ \ | | | |
              ==== ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
              ========== |_____/ \____/|____/|_| \_\______/_/ \_\_____/
              Rsubread 1.16.1

              //========================== featureCounts setting ===========================\\
              || ||
              || Input files : 1 BAM file ||

              *** caught segfault ***
              address (nil), cause 'unknown'

              Traceback:
              1: .C("R_readSummary_wrapper", as.integer(n), as.character(cmd), PACKAGE = "Rsubread")
              2: featureCounts("/home/batel/CCLEdata/BLCA/0aefd782-d636-4308-b940-63054b37e7b0/G28059.KMBC-2.1.bam", annot.ext = "/home/batel/lncGTF/combinedDB5.lincs.f1.pure.gtf", isPairedEnd = TRUE, nthreads = 3)

              Possible actions:
              1: abort (with core dump, if enabled)
              2: normal R exit
              3: exit R without saving workspace
              4: exit R saving workspace
              Selection: 1
              aborting ...
              Segmentation fault (core dumped)

              Thank you!


              • #8
                The latest version is 1.4.6-p1. But I don't think that will make a difference.

                Have you tried @GenoMax's suggestion to check whether your BAM file is corrupted? You may also try subsetting your BAM file to find the offending reads, as suggested by @dpryan. Alternatively, you may convert your BAM file to a SAM file to see if that works.
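
A minimal sketch of the BAM-to-SAM test, assuming samtools is installed. The helper names `sam_name` and `count_via_sam` are hypothetical; the featureCounts options are copied from the command in post #7.

```shell
# Hypothetical helper: derive the SAM file name from a BAM file name (x.bam -> x.sam).
sam_name() {
    printf '%s\n' "${1%.bam}.sam"
}

# Hypothetical helper: convert a BAM file to SAM (with header) and count the SAM instead.
count_via_sam() {
    bam="$1"
    sam="$(sam_name "$bam")"
    samtools view -h -o "$sam" "$bam" || return 1
    # Same options as the command in post #7, with the SAM file as input.
    featureCounts -p -T 5 -g gene_id -a combinedDB5.lincs.f1.pure.gtf -o fc.txt "$sam"
}

# Example call:
# count_via_sam G28059.KMBC-2.1.bam
```

The SAM file is uncompressed text and can be several times larger than the BAM, so check free disk space before converting a full-size file.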

                Wei


                • #9
                  FeatureCounts problem

                  Same problem here. I have tried the Subread 1.4.5-p1 and 1.4.6-p5 Linux-x86_64 versions. I downloaded the BAM file 7 times, and I do not think it was corrupted during download, because GeneTorrent reported no errors while downloading from CGHub. Could somebody please help me resolve this issue? I have also cross-checked the BAM files with samtools, and they work fine.



                  featureCounts -Q 10 -F GTF -a MY_ANNOTATION.gtf -t exon -g gene_id -o mypath/out_counts.txt mypath/celline1.bam

                  ========== _____ _ _ ____ _____ ______ _____
                  ===== / ____| | | | _ \| __ \| ____| /\ | __ \
                  ===== | (___ | | | | |_) | |__) | |__ / \ | | | |
                  ==== \___ \| | | | _ <| _ /| __| / /\ \ | | | |
                  ==== ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
                  ========== |_____/ \____/|____/|_| \_\______/_/ \_\_____/
                  v1.4.5-p1

                  //========================== featureCounts setting ===========================\\
                  || ||
                  || Input files : 1 BAM file ||
                  Segmentation fault (core dumped)
                  Last edited by santhilalsubhash; 10-27-2015, 12:04 PM.


                  • #10
                    There seems to be a problem with accessing reads in the BAM file. featureCounts tried to parse the BAM file and read in the first few reads, but we do not know what happened there to cause the seg fault. Could you please send us the link to the BAM file you downloaded so we can take a closer look?


                    • #11
                      Hello Wei Shi,

                      Thanks for your reply. Unfortunately, these BAM files are not publicly accessible. If you have access to CGHub, you can try downloading the BAM file with the analysis id "0ab4dac4-bbb9-4cd5-ae67-9469c6e8f21b" using GeneTorrent. More info on this BAM file
                      Last edited by santhilalsubhash; 10-27-2015, 02:33 PM.


                      • #12
                        OK, could you please just post the first few reads before we try to download the data?


                        • #13
                          samtools view 0ab4dac4-bbb9-4cd5-ae67-9469c6e8f21b/G28064.MDA-MB-468.1.bam | head

                          Code:
                          C1E2NACXX130117:1:1309:3032:16444	419	1	11199	3	101M	=	11938	840	CCGCTTGCTCACGGTGCTGTGCCAGGGCGCCCCCTGCTGGCGACTAGGGCAACTGCAGGGCTCTCTTGCTTAGAGTGGTGGCCAGCGCCCCCTGCTGGGGC	CCCFFFFFHGHHHJEHJIIHIJJEJJJDHIJJJ<DFGGHIGIGGGFFCBCABBCCACCDDDDDDDDCDCDDCDC###########################	CC:Z:15	PG:Z:MarkDuplicates.3A	RG:Z:C1E2N.1	NH:i:2	HI:i:0	NM:i:1	CP:i:102519871	MQ:i:3	UQ:i:2
                          C1E2NACXX130117:1:1309:3032:16444	339	1	11938	3	101M	=	11199	-840	AGACTTCCCGTGTCCTTTCCACCGGGCCTTTGAGAGGTCACAGGGTCTTGATGCTGTGGTCTTCATCTGCAGGTGTCTGACTTCCAGCAACTGCTGGCCTG	344<5@<>@??A@BDCC?;93BCD=BBE=EEDHIGEGEA@;>CEFB<HGD@HDIIIGIIJJJIIJIJIJJJJJIFGIJIHIJIJJJJIHHHGHFFFFFCCC	CC:Z:15	PG:Z:MarkDuplicates.3A	RG:Z:C1E2N.1	NH:i:2	HI:i:0	NM:i:0	CP:i:102519132	MQ:i:3	UQ:i:0
                          C1E2NACXX130117:2:1305:12347:47337	355	1	12040	0	101M	=	12113	174	GCCAGGGTGCAAGCTGAGCACTGGAGTGGAGTTTCCCTGTGGAGAGGAGCCATGCCTAGAGTGGGATGGGCCATTGTTCATCTTCTGGCCCCTGTTGTCTG	CC@FFFFDHHHHGIJJDGCEHIIICEEFGEHGIIGGGGGGGGIHIGIIIGHJJJIJIGIHGEEGHFHHFFDEDCECEDDDDEDDEDDDDDDD?BCCCCCDA	CC:Z:15	PG:Z:MarkDuplicates.2C	RG:Z:C1E2N.2	NH:i:5	HI:i:0	NM:i:1	CP:i:102519030	MQ:i:0	UQ:i:38
                          C1DVPACXX130111:2:2210:2530:60149	163	1	12042	0	101M	=	12082	141	CAGGGTGCAAGCTGAGCACTGGAGTGGAGTTTTCCTGTGGAGAGGAGCCATGCCTAGAGTGGGATGGGCCATTGTTCATCTTCTGGCCCCTGTTGTCTGCA	CCCFFDDFHHHHHIIJJJJJJJGIFGIIJIJJJJJJJHIJIJIJJJJJJJEHIJJIJIJFHIJJJHHHFFFFFDEDEEEEDDEDCCDDDDBDDDDCDDDD#	CC:Z:15	PG:Z:MarkDuplicates.2R	RG:Z:C1DVP.2	NH:i:5	HI:i:0	NM:i:0	CP:i:102519028	MQ:i:0	UQ:i:0
                          D1JYHACXX130117:5:2113:9637:43505	345	1	12048	0	101M	=	12048	0	GCAAGCTGAGCACTGGAGTGGAGTTTTCCTGTGGAGAGGAGCCATGCCTAGAGTGGGATGGGCCATTGTTCATCTTCTGGCCCCTGTTGTCTGCATGTAAC	<<DCC?:>DC@;ACCCA<CCCCACA@@A>?D@CBHGEHHGDIJJIIHIJJJJJIJJJJJJIFHHJJJJIJIIIJIIJIJJJJJJJIJIHFHHHFFFFFC@C	CC:Z:15	PG:Z:MarkDuplicates.33	RG:Z:D1JYH.5	NH:i:6	HI:i:0	NM:i:0	CP:i:102519022	UQ:i:0
                          C1E2NACXX130117:6:1102:7112:41762	165	1	12055	0	*	=	12055	0	CGCGCCTCCGCCGGCGCGCCGCGCCTCTCCGCACCTCTCCGCGCCTCCGCCGGCGCGCCGCCTTTGCGAGGGCGGAGTTGCGTTCTCTTTAGCACACAGCC	@@@DDDDDHFFFHAHGDD>GFG=A9B<@CCBBBBBC?>CC837@B8-?@BBB&07@B57B@B@;CCA7>99;;;95<+9+:-5<BBA>CC3>>CAA?@?AA	PG:Z:MarkDuplicates.30	RG:Z:C1E2N.6
                          C1E2NACXX130117:6:1102:7112:41762	89	1	12055	0	101M	=	12055	0	GAGCACTGGAGTGGAGTTTTCCTGTGGAGAGGAGCCATGCCTAGAGTGGGATGGGCCATTGTTCATCTTCTGGCCCCTGTTGTCTGCATGTAACTTAATAC	###########@CCCAAA?;=2@@66(3;@FFE?HFGIGIIIHFF@DEHGGGHDF>GDBBB:GHFBCIGHFC::;@EIIIIGCFIEBHFA?BBDDFDD<??	CC:Z:15	PG:Z:MarkDuplicates.30	RG:Z:C1E2N.6	NH:i:6	HI:i:0	NM:i:0	CP:i:102519015	UQ:i:0
                          C1DVPACXX130111:2:2210:2530:60149	83	1	12082	0	101M	=	12042	-141	AGAGGAGCCATGCCTAGAGTGGGATGGGCCATTGTTCATCTTCTGGCCCCTGTTGTCTGCATGTAACTTAATACCACAACCAGGCATAGGGGAAAGATTGG	3CDDDDDDDDDDDDDDDDDDDEEECCFEDDFHFGHHJIJJJJJJJIGJIJJJIJJIJJJJJJJJJIJJIJJIHJIHJIGJJJIJJJIJHHHHHFFFFFCCC	CC:Z:15	PG:Z:MarkDuplicates.2R	RG:Z:C1DVP.2	NH:i:5	HI:i:0	NM:i:0	CP:i:102518988	MQ:i:0	UQ:i:0
                          C1E2NACXX130117:2:1305:12347:47337	403	1	12113	0	101M	=	12040	-174	TTGTTCATCTTCTGGCCCCTGTTGTCTGCATGTAACTTAATACCACAACCAGGCATAGGGGAAAGATTGGAGGAAAGATGAGTGAGAGCGTCAACTTCTCT	?DDDCDCDDDDDBDDDB?CDFFFFHGGHHHJJIJJJIJIGF=IIJIHDHGGIIIJIIIIIJJJJIJJIJJIJJJJIJJJJIJIJJJIHF<HFHFFFFFCCC	CC:Z:15	PG:Z:MarkDuplicates.2C	RG:Z:C1E2N.2	NH:i:5	HI:i:0	NM:i:1	CP:i:102518957	MQ:i:0	UQ:i:27
                          C1E2NACXX130117:7:1102:9623:27660	355	1	12294	1	101M	=	12318	125	CCCCTACCTGCCGTCTGCTGCCATCGGAGCCCAAAGCCGGGCTGTGACTGCTCAGACCAGCCGGCTGGAGGGAGGGGCTCAGCAGGTCTGGCTTTGGCCCT	CCCFFFFFHHHHHJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJHHHHHFFFFFEDDDDDDDDDDDDDDDDDDDDDDDCBD?CDDDDDDDDDDDB?	CC:Z:15	PG:Z:MarkDuplicates.2A	RG:Z:C1E2N.7	NH:i:3	HI:i:0	NM:i:0	CP:i:102518776	MQ:i:1	UQ:i:0
                          Link to screenshot:

                          Last edited by GenoMax; 10-27-2015, 04:04 PM. Reason: Added CODE tags


                          • #14
                            We have downloaded the bam file and found that the problem was caused by excessively long bam header lines.

                            We will try to fix this and release a patched version soon.
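
For anyone wanting to check whether their own files are affected, here is a rough sketch that measures the header, assuming samtools is installed. The helper name `header_stats` is hypothetical, and the thread does not say how long a header line has to be to trigger the crash, so the output is only useful for comparing files that work against files that fail.

```shell
# Hypothetical helper: report the number of lines and the longest line length on stdin.
header_stats() {
    awk '
        length($0) > max { max = length($0) }   # track the longest line seen so far
        END { printf "lines=%d longest=%d\n", NR, max }
    '
}

# Example: print header statistics for the BAM file discussed above.
# samtools view -H 0ab4dac4-bbb9-4cd5-ae67-9469c6e8f21b/G28064.MDA-MB-468.1.bam | header_stats
```

`samtools view -H` prints only the header, so this runs quickly even on large files.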


                            • #15
                              Thanks a lot. I will wait for that.
