  • Cuffmerge question - manifest files

    Hi,

    So I'm new to using cuffmerge. I have all of my .bam files from the output of cufflinks and want to combine them using cuffmerge. I'm a bit confused about the command-line usage of cuffmerge. It says that 'input GTF files must be specified in a “manifest” file listing full paths to the files'. I'm vaguely aware of what a manifest file is, but not really sure how to go about making one from my .gtf files.

    Any help?

    Thanks

  • #2
    Cuffmerge is not going to combine your bam files but it is going to merge several Cufflinks assemblies (gtf files) to give you a common representation.

    Assemblies.txt is a simple text file (that you can make using an editor) where you list the paths to the transcripts.gtf files produced by the Cufflinks run on each sample, one sample per line, like this:

    Code:
    /path_to/sample1_clout/transcripts.gtf
    /path_to/sample2_clout/transcripts.gtf
    /path_to/sample3_clout/transcripts.gtf
    Replace "path_to" with real file paths on your system.
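
A manifest like the one above could also be generated rather than typed by hand. The following is a minimal sketch, assuming a hypothetical layout in which every per-sample Cufflinks output folder sits under one parent directory and contains a transcripts.gtf. The helper name make_manifest is made up for illustration and is not part of Cufflinks.

```shell
#!/usr/bin/env bash
# Sketch: write the absolute path of every transcripts.gtf found under a
# parent directory into a manifest file, one path per line.
# make_manifest is a hypothetical helper, not a Cufflinks tool.
make_manifest() {
    local runs_dir="$1"   # parent directory holding the per-sample output folders
    local manifest="$2"   # manifest file to create, e.g. assemblies.txt
    local abs_dir
    # Resolve to an absolute path so the manifest lists full paths.
    abs_dir="$(cd "$runs_dir" && pwd)" || return 1
    find "$abs_dir" -type f -name 'transcripts.gtf' | sort > "$manifest"
    # Return non-zero if nothing was found, so an empty manifest is noticed early.
    [ -s "$manifest" ]
}
```

It would be called as `make_manifest /path_to/cufflinks_runs assemblies.txt`, with "path_to" again standing in for a real directory. Sorting keeps the sample order stable between runs.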



    • #3
      Yeah sorry meant .gtf files! Perfect - that seems easy. Thanks.



      • #4
        BTW, any idea where the most recent version of cuffmerge is located? I just downloaded the most recent Cufflinks package from here http://cole-trapnell-lab.github.io/cufflinks/install/ - but when I run...

        Code:
        user@IT060209:~/Downloads/cufflinks$ /home/user/Downloads/cufflinks/cuffmerge
        It tells me that it's merge_cuff_asms v1.0.0 which is an old version, giving me a few errors later on. Is there a separate file for the newer cuffmerge download? Strange because when I run the cufflinks program from the same downloaded folder, it runs as the new version which is 2.1.1

        Thanks!
        Last edited by 4galaxy7; 11-18-2015, 05:52 AM.



        • #5
          Did you get the binaries or source? What does $ cufflinks -h show?



          • #6
            I downloaded the Linux binaries.

            Code:
            user@IT060209:~$ cufflinks -h
            cufflinks: invalid option -- 'h'
            cufflinks v2.1.1
            Last edited by 4galaxy7; 11-18-2015, 06:32 AM.



            • #7
              What errors are you seeing or are they warnings?



              • #8
                Well it comes out with this...

                Code:
                user@IT060209:~/Downloads/cufflinks$ cuffmerge -o /home/user/Desktop/sam/cuffmerge_out -s /home/user/Downloads/DinoAntAssembly3.fna.1.fasta /home/user/Desktop/sam/gtf_manifest.txt
                [Wed Nov 18 14:53:07 2015] Beginning transcriptome assembly merge
                -------------------------------------------
                
                [Wed Nov 18 14:53:07 2015] Preparing output location /home/user/Desktop/sam/cuffmerge_out/
                Warning: no reference GTF provided!
                [Wed Nov 18 14:53:08 2015] Converting GTF files to SAM
                [14:53:08] Loading reference annotation.
                [14:53:09] Loading reference annotation.
                [14:53:10] Loading reference annotation.
                [14:53:10] Loading reference annotation.
                [14:53:11] Loading reference annotation.
                [14:53:12] Loading reference annotation.
                [14:53:12] Loading reference annotation.
                [14:53:13] Loading reference annotation.
                [14:53:13] Loading reference annotation.
                [14:53:14] Loading reference annotation.
                [14:53:14] Loading reference annotation.
                [14:53:15] Loading reference annotation.
                [Wed Nov 18 14:53:16 2015] Assembling transcripts
                Warning: Your version of Cufflinks is not up-to-date. It is recommended that you upgrade to Cufflinks v2.2.1 to benefit from the most recent features and bug fixes (http://cufflinks.cbcb.umd.edu).
                Command line:
                cufflinks -o /home/user/Desktop/sam/cuffmerge_out/ -F 0.05 -q --overhang-tolerance 200 --library-type=transfrags -A 0.0 --min-frags-per-transfrag 0 --no-5-extend -p 1 /home/user/Desktop/sam/cuffmerge_out/tmp/mergeSam_fileil7A8s 
                [bam_header_read] EOF marker is absent. The input is probably truncated.
                [bam_header_read] invalid BAM binary header (this is not a BAM file).
                File /home/user/Desktop/sam/cuffmerge_out/tmp/mergeSam_fileil7A8s doesn't appear to be a valid BAM file, trying SAM...
                [14:53:16] Inspecting reads and determining fragment length distribution.
                Processed 10522 loci.                       
                > Map Properties:
                >	Normalized Map Mass: 286548.00
                >	Raw Map Mass: 286548.00
                >	Fragment Length Distribution: Truncated Gaussian (default)
                >	              Default Mean: 200
                >	           Default Std Dev: 80
                [14:53:18] Assembling transcripts and estimating abundances.
                Processed 10548 loci.                       
                [Wed Nov 18 14:54:34 2015] Comparing against reference file None
                Warning: Your version of Cufflinks is not up-to-date. It is recommended that you upgrade to Cufflinks v2.2.1 to benefit from the most recent features and bug fixes (http://cufflinks.cbcb.umd.edu).
                [Wed Nov 18 14:54:39 2015] Comparing against reference file None
                Warning: Your version of Cufflinks is not up-to-date. It is recommended that you upgrade to Cufflinks v2.2.1 to benefit from the most recent features and bug fixes (http://cufflinks.cbcb.umd.edu).
                It produces a seemingly OK merged file, but given that it had several errors/warnings, I'd quite like to run it with the most up-to-date version of cuffmerge to make sure there are no errors in the file.



                • #9
                  You have cufflinks v.2.1.1 showing up as the default (post #6, if you just run cufflinks) since that seems to be in your $PATH. Your computer/server must have that installed somewhere.

                  If you did download the latest from GitHub then you should explicitly run that version by giving its full path (which I assume is: /home/user/Downloads/cufflinks/). If you have admin rights, you could also try updating the default cufflinks to the new version.

                  What do you see if you do $ /home/user/Downloads/cufflinks/cufflinks -h ?
                  Last edited by GenoMax; 11-18-2015, 07:03 AM.
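
To see which copy of a tool the shell picks up by default, something like the following may help. It is a small sketch that relies only on the bash builtin `type`; the helper name show_resolution is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: list every location where a command is found on $PATH, in the
# order the shell searches them. The first line printed is the copy that
# runs when the command is typed without a path.
show_resolution() {
    local tool="$1"
    # -a: report all matches, -P: search $PATH only (ignore aliases/functions)
    type -a -P "$tool"
}
```

For example, `show_resolution cufflinks` would print each cufflinks executable found on $PATH; if an older install is listed before the downloaded one, the older one is what runs by default.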



                  • #10
                    Code:
                    user@IT060209:~$ /home/user/Downloads/cufflinks/cuffmerge -v
                    merge_cuff_asms v1.0.0
                    
                    user@IT060209:~$ /home/user/Downloads/cufflinks/cufflinks -v
                    cufflinks v2.2.1
                    That's what I tried originally, which is why I was confused. It's as if it downloaded the new version of all the other cuff-programmes (so for instance cuffdiff is version 2.1.1), but for some reason the version of cuffmerge is the old one. I've tried running cuffmerge from my PATH and from the folder itself and it gives the same old version. Strange..

                    Do you think the warnings are anything to worry about? I had the EOF header error before in tophat2 because I was using an old version.

                    EDIT: I didn't download it from github, I downloaded it from here http://cole-trapnell-lab.github.io/cufflinks/install/ - it said that the new downloads were now hosted by Github, but I wasn't sure where to find the binaries there.
                    Last edited by 4galaxy7; 11-18-2015, 07:07 AM.



                    • #11
                      I see

                      Code:
                      $ cuffmerge -v
                      merge_cuff_asms v1.0.0
                      
                      $ cuffdiff -v
                      cuffdiff v2.2.1 (4237)
                      with the latest cufflinks (v.2.2.1). I would not worry about this since cuffmerge looks to be using that version of merge_cuff_asms internally.

                      Because you have an older version of cufflinks in your default $PATH, those warnings must be getting generated. You could append /home/user/Downloads/cufflinks/ to your $PATH and see if that makes the warnings about the older cufflinks go away.

                      As for that error with BAM, did you use the latest TopHat?
                      Last edited by GenoMax; 11-18-2015, 07:22 AM.
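
One detail worth keeping in mind, offered as a sketch rather than a diagnosis of this thread: the shell searches $PATH from left to right, so a directory added at the end does not override an older copy found earlier in the list. Putting the download directory at the front is one way to make the newer binaries win for the current session. The helper name prepend_path is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: put a directory at the front of $PATH for the current shell session
# so that binaries in it are found before any older copies elsewhere.
prepend_path() {
    local dir="$1"
    case "$PATH" in
        "$dir"|"$dir":*) ;;          # already first: leave $PATH unchanged
        *) PATH="$dir:$PATH" ;;
    esac
    export PATH
    hash -r   # drop cached command locations so the new order takes effect
}
```

With the directory from this thread, that would be `prepend_path /home/user/Downloads/cufflinks`, followed by re-running the version checks shown above. The change lasts only for the current shell unless it is also added to a startup file.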



                      • #12
                        Yeah, I have the most recent version running. I tried to change the PATH, but that didn't seem to work.

                        As a query, roughly what size of files should I be expecting as the output of cufflinks/cuffmerge? For instance, I put a 7.1 GB fastq file into cufflinks and got a resulting .gtf of about 22 MB, which seems kind of small?



                        • #13
                          Since the GTF file only holds coordinates, it is OK for it to be small compared to the alignments that went in.
