  • Have you tried Cufflinks 0.9.0 beta?

    Hello,
    I wanted to compare my results with those from the previous version, but with the same options and the same data set, I got an error:
    Code:
    $ cufflinks_09 wt.sr.corrected.sam.sorted
    cufflinks_09: /usr/lib64/libz.so.1: no version information available (required by cufflinks_09)
    [bam_header_read] EOF marker is absent.
    File wt.sr.corrected.sam.sorted doesn't appear to be a valid BAM file, trying SAM...
    [11:01:54] Inspecting reads and determining fragment length distribution.
    Floating point exception (core dumped)
    Did you experience the same thing?
    Thibault

  • #2
    I just tried 0.9.0. I don't get a floating point exception; instead I get a sorting error. That surprised me, because I had already sorted using the recommended command line "LC_ALL="C" sort -k 3,3 -k 4,4n input.sam > fixed.sam".

    cufflinks -o test0.9.0 -p 3 -G ../../Gallus_gallus.WASHUC2.59.gtf fixed.sam
    [bam_header_read] EOF marker is absent.
    [bam_header_read] invalid BAM binary header (this is not a BAM file).
    File fixed.sam doesn't appear to be a valid BAM file, trying SAM...
    [11:37:19] Inspecting reads and determining fragment length distribution.
    > Processing Locus 11:21898870-21900052 [* ] 5%Error: this SAM file doesn't appear to be correctly sorted!
    current hit is at 12:67036, last one was at 11_random:36741
    You may be able to fix this by running:
    $ LC_ALL="C" sort -k 3,3 -k 4,4n input.sam > fixed.sam
    I'm going to try a different sort, but this behavior definitely differs from 0.8.3, which read the same fixed.sam file without issue.

    • #3
      I also encountered the same problem using cufflinks-0.9.0.Linux_x86_64.tar.gz

      The SAM file works fine with previous versions of Cufflinks, but with 0.9.0 I see the error messages below:

      command: cufflinks default.acc.sam

      [bam_header_read] EOF marker is absent.
      File default.acc.sam doesn't appear to be a valid BAM file, trying SAM...
      [01:47:24] Inspecting reads and determining fragment length distribution.
      Floating point exception
      command: cufflinks -G ../v.1.0.gtf default.acc.sam

      [bam_header_read] EOF marker is absent.
      File default.acc.sam doesn't appear to be a valid BAM file, trying SAM...
      [01:50:18] Inspecting reads and determining fragment length distribution.
      > Processing Locus chr10:135515570-135516033 [******** ] 33%Error: this SAM file doesn't appear to be correctly sorted!
      current hit is at chr11:70292, last one was at chr10:135520922
      You may be able to fix this by running:
      $ LC_ALL="C" sort -k 3,3 -k 4,4n input.sam > fixed.sam
      I tried to sort it with sort -k 3,3 -k 4,4n input.sam, but that did not solve the problem.
      Last edited by marcowanger; 09-28-2010, 09:54 AM. Reason: typo
      Marco

      • #4
        Cufflinks 0.9.0 changes the rules about how it handles sorted reads - they are unfortunately now a bit more complicated because we need to support aligners other than TopHat, which may use somewhat different sort orderings.

        When you are using Cufflinks with a SAM file, the program simply requires that, if a SAM header exists and contains SQ records, the SQ records and the reads agree on the order in which chromosomes are processed. So if you have reads:

        read1 on chr1
        read2 on chr1
        read3 on chr5
        read4 on chr2

        The header SQ records need to appear in the following order:
        SQ for chr1
        SQ for chr5
        SQ for chr2

        So the question here is: do the SAM files you all are using have headers? If not, can you retry after adding headers with SQ records in the correct order? If they do, and the reads appear in the same order as the SQ records indicate, then the new version may have a bug.
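
The ordering rule described in this post can be checked before running Cufflinks. What follows is a minimal Python sketch, not Cufflinks' actual implementation. It assumes a tab-delimited SAM file supplied as a list of lines, and the helper names (`sq_order`, `chromosome_runs`, `orders_agree`) are hypothetical, introduced only for illustration.

```python
def sq_order(sam_lines):
    """Return reference names in the order of the @SQ header records."""
    names = []
    for line in sam_lines:
        if not line.startswith("@"):
            break  # header lines precede the alignment records
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "@SQ":
            names.extend(f[3:] for f in fields[1:] if f.startswith("SN:"))
    return names


def chromosome_runs(sam_lines):
    """Return the chromosome of each consecutive run of mapped reads."""
    runs = []
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3 or fields[2] == "*":
            continue  # skip malformed lines and unmapped reads
        if not runs or runs[-1] != fields[2]:
            runs.append(fields[2])
    return runs


def orders_agree(sam_lines):
    """True if reads visit chromosomes in the order the @SQ records list them."""
    sam_lines = list(sam_lines)
    header = sq_order(sam_lines)
    runs = chromosome_runs(sam_lines)
    if len(set(runs)) != len(runs):
        return False  # some chromosome's reads are split into several runs
    if not header:
        return True  # no SQ records, so there is nothing to compare against
    used = set(runs)
    return [name for name in header if name in used] == runs
```

For the example in this post (reads on chr1, chr1, chr5, chr2), `orders_agree` returns True when the header lists chr1, chr5, chr2 and False when it lists chr1, chr2, chr5.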

        • #5
          Thanks. The sam file I use is from tophat 0.12, so no SAM header is present. I am running tophat 0.14 now and will try it once again.


          Originally posted by Cole Trapnell
          Cufflinks 0.9.0 changes the rules about how it handles sorted reads - they are unfortunately now a bit more complicated because we need to support aligners other than TopHat, which may use somewhat different sort orderings.

          When you are using Cufflinks with a SAM file, the program simply requires that, if a SAM header exists and contains SQ records, the SQ records and the reads agree on the order in which chromosomes are processed. So if you have reads:

          read1 on chr1
          read2 on chr1
          read3 on chr5
          read4 on chr2

          The header SQ records need to appear in the following order:
          SQ for chr1
          SQ for chr5
          SQ for chr2

          So the question here is: do the SAM files you all are using have headers? If not, can you retry after adding headers with SQ records in the correct order? If they do, and the reads appear in the same order as the SQ records indicate, then the new version may have a bug.
          Marco

          • #6
            I should also add that when you are using a SAM file *without* a header AND you are using a GTF file, Cufflinks expects the reads to be ordered such that the chromosomes are processed in the order in which they appear in the GTF file. This is admittedly pretty idiosyncratic, especially considering that (IIRC) GTF doesn't require that records be sorted at all. Thus, we generally recommend the use of headers in SAM files, or ideally the direct use of BAM.
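
To see which chromosome order a given GTF implies, something like the following Python sketch may help. It assumes the order is taken from the first appearance of each sequence name in column 1, which is a guess about Cufflinks' behavior rather than something confirmed in this thread. The function names and the choice to ignore chromosomes missing from the GTF are likewise assumptions made for illustration.

```python
def gtf_chromosome_order(gtf_lines):
    """Return chromosome names in order of first appearance in a GTF file."""
    order = []
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        seqname = line.split("\t", 1)[0]
        if seqname not in order:
            order.append(seqname)
    return order


def reads_follow_gtf(read_chromosomes, gtf_lines):
    """True if the reads' chromosomes come in the same relative order as in the GTF.

    read_chromosomes is the chromosome (SAM column 3) of each read, in file order.
    Chromosomes that never occur in the GTF are ignored (an assumption).
    """
    rank = {name: i for i, name in enumerate(gtf_chromosome_order(gtf_lines))}
    ranks = [rank[c] for c in read_chromosomes if c in rank]
    return ranks == sorted(ranks)
```

Because GTF records need not be sorted, the same chromosome can appear in several separate places in the file; the sketch only looks at where each name first occurs.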

            • #7
              Hmm, the SAM files I am using were output by TopHat 1.0.14 and appear not to contain SQ records. I suppose I can add them manually, but is there something else I could do?
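
For anyone who does want to add the SQ records by hand, a rough Python sketch of generating them in read order is shown below. It is only an illustration: `build_sq_header` is a hypothetical helper, and the `lengths` dictionary (sequence name to length) is assumed to be supplied by the user, for example from the reference that was aligned against. The SAM format expects each @SQ line to carry both SN and LN.

```python
def build_sq_header(sam_lines, lengths):
    """Build @SQ lines in the order chromosomes first appear among the reads.

    lengths maps each reference name to its sequence length (assumed input).
    """
    order = []
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip any existing header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3 or fields[2] == "*":
            continue  # skip malformed lines and unmapped reads
        if fields[2] not in order:
            order.append(fields[2])
    return ["@SQ\tSN:%s\tLN:%d" % (name, lengths[name]) for name in order]
```

The returned lines would be written ahead of the reads in the output SAM file. Whether this alone satisfies Cufflinks 0.9.0 is not established in this thread.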

              • #8
                Originally posted by scozza
                Hmm, the SAM files I am using were output by TopHat 1.0.14 and appear not to contain SQ records. I suppose I can add them manually, but is there something else I could do?
                You could try sorting the GTF file using something like:

                Code:
                LC_ALL="C" sort -k 1,1 -k 4,4n input.gtf > input.fixed.gtf
                This should exploit the fact I mentioned above: without a header, Cufflinks looks at the GTF for the sort order of the reads. Since your reads come from TopHat, the chromosomes will appear in lexicographic order.
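
To make "lexicographic order" concrete, here is a small Python sketch that roughly mirrors the sort command above: sequence names are compared byte by byte, as under LC_ALL="C", and start coordinates are compared numerically. The function name is hypothetical, and unlike `sort` the sketch drops comment and blank lines and keeps ties in their input order.

```python
def sort_gtf_c_locale(gtf_lines):
    """Sort GTF records by seqname (byte order), then by numeric start."""
    def key(line):
        fields = line.rstrip("\n").split("\t")
        # Comparing encoded bytes matches a C-locale string comparison.
        return (fields[0].encode("utf-8"), int(fields[3]))

    records = [l for l in gtf_lines if l.strip() and not l.startswith("#")]
    return sorted(records, key=key)
```

Under this ordering the names 1, 11, 11_random, 12, 2 sort in exactly that sequence, so 11_random precedes 12, which matches the read order reported in the error message in post #2.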

                TopHat 1.1 will report BAM output by default, but we are waiting on that release for a little bit more testing.

                • #9
                  Originally posted by Cole Trapnell
                  Code:
                  LC_ALL="C" sort -k 1,1 -k 4,4n input.gtf > input.fixed.gtf
                  This should exploit the fact I mentioned above: without a header, Cufflinks looks at the GTF for the sort order of the reads. Since your reads come from TopHat, the chromosomes will appear in lexicographic order.
                  That didn't work either. At this point I have tried sorting the SAM file and the GTF file, to no avail. I then tried converting the SAM file to BAM using samtools and an indexed reference FASTA file, but that produced a similar error message about sorting. The latest thing I have tried is converting from BAM back to SAM with SQ records written, but that fails too.

                  I'll keep plugging at it.

                  • #10
                    OK, we are not seeing that on our internal runs. If you can send me a small BAM file and GTF snippet that reproduces this issue, I will take a look.

                    • #11
                      An update has been released to address the Floating Point Exception issue.

                      -Adam

                      • #12
                        Thanks.

                        The new version fixed my problem, so far.

                        In the meantime (before the release of a TopHat version with BAM output), we should stick with a sorted GTF, as Cole suggested.

                        Originally posted by adarob
                        An update has been released to address the Floating Point Exception issue.

                        -Adam
                        Marco

                        • #13
                          I ran Cufflinks v0.9.0 with default options plus -N and -r. When I use the -G option (reference annotation GTF), the transcripts.gtf output contains some transcripts that are not present in the output of a run without the reference annotation; I checked this result with IGV. Any idea why I miss some reference transcripts?

                          • #14
                            The problem for which I opened this discussion was solved by the updated version of Cufflinks.
                            Thanks

                            • #15
                              I was able to run cufflinks and cuffcompare successfully with the newer version of Cufflinks (v0.9.0). However, I ran into the problem previously discussed in this thread when I ran cuffdiff. I got an error message saying "Error: this SAM file doesn't appear to be correctly sorted!". I tried sorting the SAM files and sorting the GTF file as suggested by Cole, but the problem persists. Has anyone had a similar problem when running cuffdiff? If cufflinks ran without complaining, why is there an error message when running cuffdiff? I am using the same SAM files with cufflinks and cuffdiff.
