  • #16
    I ran 2 x 50GB SAM files that had been generated by TopHat 1.0.14 through the pre-release Cufflinks 0.8.4 using -G and a UCSC hg19 GTF annotation file:

    cufflinks -p10 -G /gtf_files/hg19_UCSC.gtf accepted_hits.sam

    Then through Cuffcompare 0.8.4:

    cuffcompare -r /gtf_files/hg19_UCSC.gtf /normal/transcripts.gtf /cancer/transcripts.gtf

    Then through Cuffdiff 0.8.4:

    cuffdiff -p14 stdout.combined.gtf /normal/tophat_out/accepted_hits.sam /cancer/tophat_out/accepted_hits.sam

    No problems encountered.

    Thanks Cole for the fix and making the pre-release version of Cufflinks available.

    Chris
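For anyone wanting to script the three steps above, here is a minimal sketch that only builds the command lines and does not execute them. The function name and its parameters are hypothetical; the flags and paths are the ones quoted in the post, and the `transcripts.gtf` locations assume Cufflinks was run inside each sample's directory.

```python
import shlex


def pipeline_commands(gtf, normal_sam, cancer_sam, normal_gtf, cancer_gtf,
                      combined_gtf="stdout.combined.gtf"):
    """Build (but do not run) the command lines for the three stages."""
    # Stage 1: one Cufflinks run per sample, guided by the reference annotation.
    per_sample = [["cufflinks", "-p10", "-G", gtf, sam]
                  for sam in (normal_sam, cancer_sam)]
    # Stage 2: Cuffcompare against the same reference annotation.
    compare = ["cuffcompare", "-r", gtf, normal_gtf, cancer_gtf]
    # Stage 3: Cuffdiff on the combined GTF plus both alignment files.
    diff = ["cuffdiff", "-p14", combined_gtf, normal_sam, cancer_sam]
    return per_sample + [compare, diff]


commands = pipeline_commands(
    gtf="/gtf_files/hg19_UCSC.gtf",
    normal_sam="/normal/tophat_out/accepted_hits.sam",
    cancer_sam="/cancer/tophat_out/accepted_hits.sam",
    normal_gtf="/normal/transcripts.gtf",
    cancer_gtf="/cancer/transcripts.gtf",
)
for argv in commands:
    print(shlex.join(argv))
```

Each list could be handed to `subprocess.run(argv, check=True)` so that a crash in one stage stops the rest.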

    Comment


    • #17
      Originally posted by Cole Trapnell View Post
      I just posted a pre-release build of Cufflinks 0.8.4 (svn r1370) that addresses this issue. I was able to locally reproduce the bug with Uwe's test set (thanks!), and the build I've posted corrects it (for me, anyways). Please let me know if this crops up again (ideally with another test download to reproduce it).

      You should know before trying this build that it is an SVN snapshot and hasn't gone through the normal round of release testing, so you may run into other issues.

      This build also includes additional assembler fixes and direct BAM file support. You can now supply BAM files as input to Cufflinks (SAM is still supported).

      Please download one of the tarballs with "0.8.4" in the version tag from:

      http://cufflinks.cbcb.umd.edu/downloads/
      Hi Cole,

      first of all, thanks for the quick bugfix! At first glance, everything works fine now; at least I get no more duplicates. However, I receive segmentation faults with almost any sample I try to run through Cufflinks. This is (very) most likely not a memory problem and appears to have been introduced in v0.8.4 (I haven't seen it in v0.8.2/3).
      I've used the Linux binaries you provided on Debian Lenny x64 (48 GB RAM, sizes of accepted_hits.sam: 6-8 GB) with no ulimits and no additional processes in the background.
      It seems (I could also say: I've got the feeling) that paired-end analyses (2x50bp) are particularly affected, whereas single-end analyses (1x50bp) are less affected and 36bp reads aren't affected at all.

      Any ideas? Do you need a test-download for that?

      Best,
      Uwe
      Last edited by Uwe Appelt; 07-19-2010, 08:09 AM.

      Comment


      • #18
        Originally posted by winfried View Post
        Hi Cole,

        first of all, thanks for the quick bugfix! At first glance, everything works fine now; at least I get no more duplicates. However, I receive segmentation faults with almost any sample I try to run through Cufflinks. This is (very) most likely not a memory problem and appears to have been introduced in v0.8.4 (I haven't seen it in v0.8.2/3).
        I've used the Linux binaries you provided on Debian Lenny x64 (48 GB RAM, sizes of accepted_hits.sam: 6-8 GB) with no ulimits and no additional processes in the background.
        It seems (I could also say: I've got the feeling) that paired-end analyses (2x50bp) are particularly affected, whereas single-end analyses (1x50bp) are less affected and 36bp reads aren't affected at all.

        Any ideas? Do you need a test-download for that?

        Best,
        Uwe
        Sure - I've run the build I sent you on a bunch of different samples (mostly 1x75, 2x75, and 2x50, in mouse and human), and I haven't gotten any segfaults. I'd be grateful for another small sample. There was a lot of code that changed in between 0.8.3 and the pre-release I posted (mostly related to BAM support), so this isn't too surprising.

        I'm actually out of town this week, so I may not get to this as fast as the last fix, but I *will* get to it.

        Comment


        • #19
          Originally posted by Cole Trapnell View Post
          Sure - I've run the build I sent you on a bunch of different samples (mostly 1x75, 2x75, and 2x50, in mouse and human), and I haven't gotten any segfaults. I'd be grateful for another small sample. There was a lot of code that changed in between 0.8.3 and the pre-release I posted (mostly related to BAM support), so this isn't too surprising.

          I'm actually out of town this week, so I may not get to this as fast as the last fix, but I *will* get to it.
          Hi Cole,

          thanks for being *there* anyway. Although I tried playing around with the command line parameters to narrow down the source of the error, I still consistently get the segmentation faults. I've therefore prepared another test set:
          Code:
          http://www.gtsg.org/_/COLE2.tar.gz
          It would be great to know whether you're able to reproduce the segfaults with this command line:
          Code:
          cufflinks --num-threads 8 --max-mle-iterations 10000 --GTF ./Homo_sapiens.GRCh37.58.name=id.gtf --output-dir ./ ./accepted_hits.sam
          Thanks in advance and best!
          Uwe

          Comment


          • #20
            I had the Cufflinks 0.8.3 release from 6/30/2010 and everything worked for me.
            I updated to the latest 0.8.3 build, released on 7/2/2010, and get this error:
            Error: duplicate GFF ID '' encountered!

            The 0.8.4 release gives me a segmentation fault.
            This is so frustrating because it's the opposite of what others in this thread are seeing. Now the 0.8.3 release from 6/30/2010 is not even available to download.
            Last edited by thinkRNA; 07-29-2010, 08:51 AM.

            Comment


            • #21
              Cole, is it possible for you to upload the 0.8.3 version released on 6/30/2010, unless you think there was an algorithm problem in it?

              Comment


              • #22
                Originally posted by thinkRNA View Post
                I had the Cufflinks 0.8.3 release from 6/30/2010 and everything worked for me.
                I updated to the latest 0.8.3 build, released on 7/2/2010, and get this error:
                Error: duplicate GFF ID '' encountered!

                The 0.8.4 release gives me a segmentation fault.
                This is so frustrating because it's the opposite of what others in this thread are seeing. Now the 0.8.3 release from 6/30/2010 is not even available to download.
                Not sure why you think your experience is the opposite of what others describe. It's exactly what the latest posts are about, isn't it?

                Comment


                • #23
                  Originally posted by thinkRNA View Post
                  Cole, is it possible for you to upload the 0.8.3 version released on 6/30/2010, unless you think there was an algorithm problem in it?
                  There were a few minor but annoying problems with the build on 6/30, which is why I replaced it with the build of 0.8.3 released on 7/5. That download is still available at:



                  Does that version generate this issue for you?

                  Comment


                  • #24
                    Originally posted by Cole Trapnell View Post
                    There were a few minor but annoying problems with the build on 6/30, which is why I replaced it with the build of 0.8.3 released on 7/5. That download is still available at:



                    Does that version generate this issue for you?
                    I believe the version you posted on 6/30 didn't give me a duplicate GFF error (same gene with different FPKMs), but the 7/5 one did. In any case, it may be better now if you can get rid of the segmentation fault in 0.8.4, which winfried has also reported; I hope that will take care of the duplicate record error as well.

                    Cole, I am also getting differential gene expression between replicate samples. In your experience, at the default FDR, how many genes should I "EXPECT" to be reported as differentially expressed between replicates using Cufflinks on mouse Illumina single-end 75bp reads?

                    I am thinking of computing empirical FDR by taking into account the number of genes reported as differentially expressed between replicates.
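One way to read the empirical FDR idea above is as a ratio: calls between replicates are treated as false positives and divided by the number of calls in the real comparison. The sketch below assumes exactly that definition, which is only one possible estimator; the function name and the counts are hypothetical and are not Cufflinks output.

```python
def empirical_fdr(n_sig_between_replicates, n_sig_between_conditions):
    """Estimate the FDR as (calls in the null comparison) / (calls in the real one).

    Assumption: every gene called between replicates is a false positive, and
    the real comparison produces false positives at the same rate.
    """
    if n_sig_between_conditions == 0:
        return 0.0  # nothing was called, so there is nothing to be wrong about
    ratio = n_sig_between_replicates / n_sig_between_conditions
    return min(1.0, ratio)  # cap at 1.0 so the estimate stays a proportion


# Hypothetical counts, for illustration only.
print(empirical_fdr(50, 1000))
```

With several replicate pairs, averaging the replicate-vs-replicate counts before dividing would give a more stable numerator.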

                    Comment


                    • #25
                      I'm still getting the segmentation fault myself with 0.8.4 and, as with thinkRNA, 0.8.3 from 6/30 had worked but the 7/5 one didn't. Unfortunately, I had removed the 6/30 one without a backup and now I can't go back. Is there any update on when 0.8.4 will be fixed, or does anyone have a clue as to why both the Linux and MacOS versions are giving the seg fault?

                      Comment


                      • #26
                        @scozza, I have been waiting for 2 months now for a newer release without the segmentation fault, but it seems the authors are very busy.

                        Comment


                        • #27
                          I am wondering why others appear to be able to run 0.8.4 without a seg fault but neither my Linux system nor my Mac can. If I knew why, maybe I could find a suitable machine.

                          At this point I am tempted to abandon the Ensembl GTF file and switch to a UCSC one, which I know has other, more minor issues but does not fail completely with a duplicate ID error.

                          Comment


                          • #28
                            By the way, I started removing the duplicate entries manually, which can be done quickly as long as there are not too many of them. And it appears there are very few duplicates: I managed to correct four samples @ 6 lanes in less than 10 minutes, just by invoking cuffcompare again and again to let it report the duplicates and removing them step by step. So if you guys do not have to deal with that many samples, it's probably worth a try?

                            (I'm also still using (the later) 0.8.3 because 0.8.4 produces segfaults.)

                            Cheers
                            uwe
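The manual loop described above could be shortened by scanning a GTF for suspicious IDs up front. This is only a sketch under a loud assumption: the thread never says what Cufflinks or Cuffcompare actually counts as a duplicate, so the code guesses at two cases (an empty `transcript_id`, as in the `''` of the quoted error, and one `transcript_id` seen on more than one chromosome/strand). The function name is hypothetical.

```python
import re
from collections import defaultdict

TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]*)"')


def suspicious_transcript_ids(gtf_lines):
    """Return (line numbers with an empty ID, IDs seen at more than one locus)."""
    loci = defaultdict(set)  # transcript_id -> {(chromosome, strand), ...}
    empty = []
    for lineno, line in enumerate(gtf_lines, start=1):
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        match = TRANSCRIPT_ID.search(fields[8])
        tid = match.group(1) if match else ""
        if not tid:
            empty.append(lineno)
        else:
            loci[tid].add((fields[0], fields[6]))
    repeated = sorted(t for t, seen in loci.items() if len(seen) > 1)
    return empty, repeated


# Made-up records, for illustration only.
sample = [
    'chr1\tsrc\texon\t100\t200\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
    'chr2\tsrc\texon\t300\t400\t.\t-\t.\tgene_id "g1"; transcript_id "t1";',
    'chr1\tsrc\texon\t500\t600\t.\t+\t.\tgene_id "g2"; transcript_id "";',
    'chr1\tsrc\texon\t700\t800\t.\t+\t.\tgene_id "g3"; transcript_id "t3";',
]
empty_lines, repeated_ids = suspicious_transcript_ids(sample)
print(empty_lines, repeated_ids)
```

For a real file, pass an open file handle (`with open(path) as fh: suspicious_transcript_ids(fh)`), since the function only iterates over lines.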

                            Comment


                            • #29
                              Originally posted by griffon42 View Post
                              Hi all-

                              First off, thank you Cole for Cufflinks.
                              Have people been having luck with Cufflinks? I generated some robust simulated data, benchmarked Cufflinks on it, and found a false positive rate of something like 98%. I also benchmarked Scripture, and it has an FP rate of around 99.9%. The Scripture paper validates its FN rate but concludes that it has validated an FP rate, and that got past the reviewers. The FP rate is actually extremely high. Cufflinks is only slightly better.

                              Comment


                              • #30
                                Greg,

                                How do you define false positive rate in your comparisons?
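The definition matters because two common quantities both get called "false positive rate" informally. A small sketch with entirely hypothetical counts (not the benchmark data discussed above) shows how far apart they can be:

```python
def false_positive_rate(fp, tn):
    """FP / (FP + TN): the fraction of true negatives that were wrongly called."""
    return fp / (fp + tn)


def false_discovery_rate(fp, tp):
    """FP / (FP + TP): the fraction of reported calls that are wrong."""
    return fp / (fp + tp)


# Hypothetical confusion-matrix counts, for illustration only.
tp, fp, tn = 20, 980, 99000

fpr = false_positive_rate(fp, tn)
fdr = false_discovery_rate(fp, tp)
print(f"FPR = {fpr:.4f}, FDR = {fdr:.2f}")
```

With these made-up counts, almost all reported calls are wrong even though only about 1% of the negatives were called, so a quoted percentage is hard to interpret without knowing which denominator was used.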

                                Comment
