SEQanswers


4galaxy7 11-18-2015 03:29 AM

Cuffmerge question - manifest files
 
Hi,

So I'm new to using cuffmerge. I have all of my .bam files from the output of cufflinks and want to combine them using cuffmerge. I'm a bit confused about the command-line usage of cuffmerge. It says that 'input GTF files must be specified in a “manifest” file listing full paths to the files'. I'm vaguely aware of what a manifest file is, but I'm not really sure how to go about making one from my .gtf files.

Any help?

Thanks

GenoMax 11-18-2015 03:36 AM

Cuffmerge is not going to combine your bam files but it is going to merge several Cufflinks assemblies (gtf files) to give you a common representation.

Assemblies.txt is a simple text file (that you can make using an editor) where you specify the paths to the Cufflinks gtf files from each of the tophat runs, one sample per line. Like this:

Code:

/path_to/sample1_clout/transcripts.gtf
/path_to/sample2_clout/transcripts.gtf
/path_to/sample3_clout/transcripts.gtf

Replace "path_to" with real file paths on your system.
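
If there are many samples, the manifest can also be generated from the shell rather than typed by hand. A minimal sketch, assuming each sample's Cufflinks output sits in its own directory and is named transcripts.gtf as in the listing above; the temporary directory and the sample names are placeholders created only so the example runs on its own:

```shell
# Self-contained demo: build a throwaway directory tree that mimics the
# layout above. The directory and sample names are hypothetical.
workdir=$(mktemp -d)
for s in sample1 sample2 sample3; do
    mkdir -p "$workdir/${s}_clout"
    : > "$workdir/${s}_clout/transcripts.gtf"   # empty placeholder file
done

# Write one absolute path per line; sort keeps the order reproducible.
find "$workdir" -name transcripts.gtf | sort > "$workdir/assemblies.txt"

cat "$workdir/assemblies.txt"
```

On a real system, point find at the directory that holds your Cufflinks output directories and give the resulting file to cuffmerge as the manifest.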

4galaxy7 11-18-2015 03:47 AM

Yeah sorry meant .gtf files! Perfect - that seems easy. Thanks.

4galaxy7 11-18-2015 04:49 AM

BTW, any idea where the most recent version of cuffmerge is located? I just downloaded the most recent Cufflinks package from here http://cole-trapnell-lab.github.io/cufflinks/install/ - but when I run...

Code:

user@IT060209:~/Downloads/cufflinks$ /home/user/Downloads/cufflinks/cuffmerge

It tells me that it's merge_cuff_asms v1.0.0, which is an old version, and it gives me a few errors later on. Is there a separate download for the newer cuffmerge? Strange, because when I run the cufflinks program from the same downloaded folder, it runs as the new version, which is 2.1.1.

Thanks!

GenoMax 11-18-2015 04:56 AM

Did you get the binaries or source? What does $ cufflinks -h show?

4galaxy7 11-18-2015 05:27 AM

I downloaded the Linux binaries.

Code:

user@IT060209:~$ cufflinks -h
cufflinks: invalid option -- 'h'
cufflinks v2.1.1


GenoMax 11-18-2015 05:47 AM

What errors are you seeing? Or are they warnings?

4galaxy7 11-18-2015 05:55 AM

Well it comes out with this...

Code:

user@IT060209:~/Downloads/cufflinks$ cuffmerge -o /home/user/Desktop/sam/cuffmerge_out -s /home/user/Downloads/DinoAntAssembly3.fna.1.fasta /home/user/Desktop/sam/gtf_manifest.txt
[Wed Nov 18 14:53:07 2015] Beginning transcriptome assembly merge
-------------------------------------------

[Wed Nov 18 14:53:07 2015] Preparing output location /home/user/Desktop/sam/cuffmerge_out/
Warning: no reference GTF provided!
[Wed Nov 18 14:53:08 2015] Converting GTF files to SAM
[14:53:08] Loading reference annotation.
[14:53:09] Loading reference annotation.
[14:53:10] Loading reference annotation.
[14:53:10] Loading reference annotation.
[14:53:11] Loading reference annotation.
[14:53:12] Loading reference annotation.
[14:53:12] Loading reference annotation.
[14:53:13] Loading reference annotation.
[14:53:13] Loading reference annotation.
[14:53:14] Loading reference annotation.
[14:53:14] Loading reference annotation.
[14:53:15] Loading reference annotation.
[Wed Nov 18 14:53:16 2015] Assembling transcripts
Warning: Your version of Cufflinks is not up-to-date. It is recommended that you upgrade to Cufflinks v2.2.1 to benefit from the most recent features and bug fixes (http://cufflinks.cbcb.umd.edu).
Command line:
cufflinks -o /home/user/Desktop/sam/cuffmerge_out/ -F 0.05 -q --overhang-tolerance 200 --library-type=transfrags -A 0.0 --min-frags-per-transfrag 0 --no-5-extend -p 1 /home/user/Desktop/sam/cuffmerge_out/tmp/mergeSam_fileil7A8s
[bam_header_read] EOF marker is absent. The input is probably truncated.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
File /home/user/Desktop/sam/cuffmerge_out/tmp/mergeSam_fileil7A8s doesn't appear to be a valid BAM file, trying SAM...
[14:53:16] Inspecting reads and determining fragment length distribution.
Processed 10522 loci.                     
> Map Properties:
>        Normalized Map Mass: 286548.00
>        Raw Map Mass: 286548.00
>        Fragment Length Distribution: Truncated Gaussian (default)
>                      Default Mean: 200
>                  Default Std Dev: 80
[14:53:18] Assembling transcripts and estimating abundances.
Processed 10548 loci.                     
[Wed Nov 18 14:54:34 2015] Comparing against reference file None
Warning: Your version of Cufflinks is not up-to-date. It is recommended that you upgrade to Cufflinks v2.2.1 to benefit from the most recent features and bug fixes (http://cufflinks.cbcb.umd.edu).
[Wed Nov 18 14:54:39 2015] Comparing against reference file None
Warning: Your version of Cufflinks is not up-to-date. It is recommended that you upgrade to Cufflinks v2.2.1 to benefit from the most recent features and bug fixes (http://cufflinks.cbcb.umd.edu).

It produces a seemingly OK merged file, but given that it had several errors/warnings, I'd quite like to run it with the most up-to-date version of cuffmerge to make sure there are no errors in the file.
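
One way to tell warnings from outright errors in output like the above is to save it to a file and tally the flagged lines. A small sketch on an excerpt of that log (the temporary file is only there so the example is self-contained, and how you capture the real output is up to you). In this excerpt every flagged line is a warning, and the [bam_header_read] messages are followed by "trying SAM...", after which the run carries on:

```shell
# Save an excerpt of the cuffmerge output shown above to a temporary file.
log=$(mktemp)
cat > "$log" <<'EOF'
Warning: no reference GTF provided!
Warning: Your version of Cufflinks is not up-to-date. It is recommended that you upgrade to Cufflinks v2.2.1 to benefit from the most recent features and bug fixes (http://cufflinks.cbcb.umd.edu).
[bam_header_read] EOF marker is absent. The input is probably truncated.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
File /home/user/Desktop/sam/cuffmerge_out/tmp/mergeSam_fileil7A8s doesn't appear to be a valid BAM file, trying SAM...
Warning: Your version of Cufflinks is not up-to-date. It is recommended that you upgrade to Cufflinks v2.2.1 to benefit from the most recent features and bug fixes (http://cufflinks.cbcb.umd.edu).
Warning: Your version of Cufflinks is not up-to-date. It is recommended that you upgrade to Cufflinks v2.2.1 to benefit from the most recent features and bug fixes (http://cufflinks.cbcb.umd.edu).
EOF

# Tally each distinct warning message.
grep '^Warning:' "$log" | sort | uniq -c

# Total number of warning lines.
n_warnings=$(grep -c '^Warning:' "$log")

# Lines mentioning "error" in any case; grep exits non-zero when it finds
# none, so "|| true" keeps the script going while still capturing the count.
n_errors=$(grep -ci 'error' "$log" || true)

echo "warnings: $n_warnings, error lines: $n_errors"
```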

GenoMax 11-18-2015 06:01 AM

You have cufflinks v.2.1.1 showing up as the default (post #6, if you just run cufflinks) since that seems to be in your $PATH. Your computer/server must have that installed somewhere.

If you did download the latest from GitHub then you should explicitly run that version by giving its full path (which I assume is: /home/user/Downloads/cufflinks/). If you do have admin rights you could try to update the default cufflinks to the new version.

What do you see if you do $ /home/user/Downloads/cufflinks/cufflinks -h ?

4galaxy7 11-18-2015 06:05 AM

Code:

user@IT060209:~$ /home/user/Downloads/cufflinks/cuffmerge -v
merge_cuff_asms v1.0.0

user@IT060209:~$ /home/user/Downloads/cufflinks/cufflinks -v
cufflinks v2.2.1

That's what I tried originally, which is why I was confused. It's like it's downloaded the new version of all the other cuff-programmes (so, for instance, cuffdiff is version 2.1.1), but for some reason the version of cuffmerge is the old one. I've tried running cuffmerge from my PATH and from the folder itself, and it gives the same old version. Strange...

Do you think the warnings are anything to worry about? I had the EOF header error before in tophat2 because I was using an old version.

EDIT: I didn't download it from github, I downloaded it from here http://cole-trapnell-lab.github.io/cufflinks/install/ - it said that the new downloads were now hosted by Github, but I wasn't sure where to find the binaries there.

GenoMax 11-18-2015 06:17 AM

I see

Code:

$ cuffmerge -v
merge_cuff_asms v1.0.0

$ cuffdiff -v
cuffdiff v2.2.1 (4237)

with the latest cufflinks (v.2.2.1). I would not worry about this since cuffmerge looks to be using that version of merge_cuff_asms internally.

Because you have an older version of cufflinks in your default $PATH, those warnings must be getting generated. You could put /home/user/Downloads/cufflinks/ at the front of your $PATH (so it is found before the older install) and see if that makes the warnings about the older cufflinks go away.
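
Where a directory lands in $PATH matters, because the shell runs the first match it finds when searching from left to right. A minimal sketch of that behaviour, using two fake "cufflinks" scripts as stand-ins for the old and new installs (the scripts and the temporary directories are made up for the demo; nothing here touches a real Cufflinks install):

```shell
# Two fake "cufflinks" scripts stand in for the old and new installs.
demo=$(mktemp -d)
mkdir -p "$demo/old" "$demo/new"
printf '#!/bin/sh\necho "cufflinks v2.1.1"\n' > "$demo/old/cufflinks"
printf '#!/bin/sh\necho "cufflinks v2.2.1"\n' > "$demo/new/cufflinks"
chmod +x "$demo/old/cufflinks" "$demo/new/cufflinks"

# Pretend the old install is already on the default PATH.
base_path="$demo/old:$PATH"

# Appending: the old copy is still found first.
appended=$(PATH="$base_path:$demo/new"; cufflinks)

# Prepending: the new copy is found first.
prepended=$(PATH="$demo/new:$base_path"; cufflinks)

echo "appended:  $appended"
echo "prepended: $prepended"
```

Running `command -v cufflinks` (or `type -a cufflinks` in bash) afterwards is a quick way to see which copy the shell will actually pick up.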

As for that error with BAM, did you use the latest TopHat?

4galaxy7 11-23-2015 01:52 AM

Yeah, I have the most recent version running. I tried to change the PATH, but that didn't seem to work.

As a query, roughly what size of files should I be expecting as the output of cufflinks/cuffmerge? For instance, I put a 7.1 GB fastq file into cufflinks and got a resulting .gtf of about 22 MB, which seems kind of small?

GenoMax 11-23-2015 03:35 AM

Since the GTF file only has co-ordinates, having a file that seems small compared to the alignments that went in is OK.
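
To get a feel for what a GTF of that size holds, counting features is usually more telling than the byte size. A rough sketch, using a few made-up records in GTF layout (nine tab-separated columns, feature type in column 3); the contig and gene names are invented for illustration and are not taken from the poster's data:

```shell
# Write a tiny stand-in GTF; these records are invented, not real output.
gtf=$(mktemp)
{
  printf 'contig1\tCufflinks\ttranscript\t100\t900\t1000\t+\t.\tgene_id "G1"; transcript_id "G1.1";\n'
  printf 'contig1\tCufflinks\texon\t100\t400\t1000\t+\t.\tgene_id "G1"; transcript_id "G1.1";\n'
  printf 'contig1\tCufflinks\texon\t600\t900\t1000\t+\t.\tgene_id "G1"; transcript_id "G1.1";\n'
  printf 'contig2\tCufflinks\ttranscript\t50\t300\t1000\t-\t.\tgene_id "G2"; transcript_id "G2.1";\n'
  printf 'contig2\tCufflinks\texon\t50\t300\t1000\t-\t.\tgene_id "G2"; transcript_id "G2.1";\n'
} > "$gtf"

# Column 3 holds the feature type; count the rows of each kind.
n_transcripts=$(awk -F '\t' '$3 == "transcript" {n++} END {print n+0}' "$gtf")
n_exons=$(awk -F '\t' '$3 == "exon" {n++} END {print n+0}' "$gtf")

echo "transcripts: $n_transcripts, exons: $n_exons"
```

Swapping "$gtf" for a real transcripts.gtf or merged GTF gives the same kind of summary for an actual assembly.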


All times are GMT -8.
