SEQanswers

Old 08-10-2012, 05:43 AM   #1
Richard Barker
Member
 
Location: Madison wisconsin

Join Date: Apr 2012
Posts: 47
Cuffmerge error: couldn't find a data file for Mt or Pt?

I'm trying to merge the transcript data from 8 samples that were aligned with TopHat and assembled with Cufflinks. Unfortunately, when I run cuffmerge I get warnings that it "couldn't find fasta record for 'Mt'" and for 'Pt', but I can't find any reference to these records anywhere in the literature or online.
Any advice gratefully received.

richard@ubuntu:~/RNA_seq_analysis/Cuffmerge$ cuffmerge -g arabidopsis_thaliana.TAIR10.60.gtf -s TAIR10_chr_all.fas -p 6 run297_transcript_cuffmerge.txt

[Thu Aug 9 15:36:29 2012] Beginning transcriptome assembly merge
-------------------------------------------

[Thu Aug 9 15:36:29 2012] Preparing output location ./merged_asm/
[Thu Aug 9 15:36:36 2012] Converting GTF files to SAM
[15:36:36] Loading reference annotation.
[15:36:37] Loading reference annotation.
[15:36:38] Loading reference annotation.
[15:36:39] Loading reference annotation.
[15:36:40] Loading reference annotation.
[15:36:41] Loading reference annotation.
[15:36:42] Loading reference annotation.
[15:36:44] Loading reference annotation.
[Thu Aug 9 15:36:45 2012] Quantitating transcripts
You are using Cufflinks v2.0.2, which is the most recent release.
Command line:
cufflinks -o ./merged_asm/ -F 0.05 -g arabidopsis_thaliana.TAIR10.60.gtf -q --overhang-tolerance 200 --library-type=transfrags -A 0.0 --min-frags-per-transfrag 0 --no-5-extend -p 6 ./merged_asm/tmp/mergeSam_filejHHJWI
[bam_header_read] EOF marker is absent.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
File ./merged_asm/tmp/mergeSam_filejHHJWI doesn't appear to be a valid BAM file, trying SAM...
[15:36:45] Loading reference annotation.
[15:36:47] Inspecting reads and determining fragment length distribution.
Processed 26332 loci.
> Map Properties:
> Normalized Map Mass: 194074.00
> Raw Map Mass: 194074.00
> Fragment Length Distribution: Truncated Gaussian (default)
> Default Mean: 200
> Default Std Dev: 80
[15:36:48] Assembling transcripts and estimating abundances.
Processed 26332 loci.
[Thu Aug 9 15:44:22 2012] Comparing against reference file arabidopsis_thaliana.TAIR10.60.gtf
You are using Cufflinks v2.0.2, which is the most recent release.
No fasta index found for TAIR10_chr_all.fas. Rebuilding, please wait..
Fasta index rebuilt.
Warning: couldn't find fasta record for 'Mt'!
Warning: couldn't find fasta record for 'Pt'!
[Thu Aug 9 15:44:34 2012] Comparing against reference file arabidopsis_thaliana.TAIR10.60.gtf
You are using Cufflinks v2.0.2, which is the most recent release.
Warning: couldn't find fasta record for 'Mt'!
Warning: couldn't find fasta record for 'Pt'!
Old 08-10-2012, 06:12 AM   #2
kmcarr
Senior Member
 
Location: USA, Midwest

Join Date: May 2008
Posts: 1,170

Richard,

This is most likely caused by a mismatch between the reference sequence names in the FASTA file and those in the GTF file you are using.

The official TAIR10 genome sequence release names the mitochondrial and plastid (chloroplast) chromosomes ChrM and ChrC respectively. These are the names used in TAIR10_chr_all.fas. It would appear that your reference GTF file, arabidopsis_thaliana.TAIR10.60.gtf, is from a different source and uses different names (Mt and Pt). You will need to make sure the chromosome names match exactly between your FASTA and GTF files.
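
One way to check this before rerunning cuffmerge is to list the sequence names each file actually uses and compare them. Below is a minimal Python sketch, not something taken from the Cufflinks documentation; the function names and the tiny in-memory demo data are made up for illustration. It assumes that a FASTA record name is the first word after '>' and that the annotation is tab-delimited with the sequence name in column 1. Given the warnings in post #5, it is probably worth looking at the real FASTA headers rather than assuming what they contain.

```python
from typing import Iterable, List, Set


def fasta_names(lines: Iterable[str]) -> Set[str]:
    """Collect record names from FASTA header lines (first word after '>')."""
    names = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def annotation_names(lines: Iterable[str]) -> Set[str]:
    """Collect sequence names from column 1 of a GTF/GFF file."""
    names = set()
    for line in lines:
        # Skip blank lines and comment/directive lines.
        if not line.strip() or line.startswith("#"):
            continue
        names.add(line.split("\t", 1)[0])
    return names


def missing_from_fasta(fasta_lines: Iterable[str],
                       annotation_lines: Iterable[str]) -> List[str]:
    """Names used in the annotation that have no record in the FASTA."""
    return sorted(annotation_names(annotation_lines) - fasta_names(fasta_lines))


# Hypothetical demo data; with real files, pass open file handles instead,
# e.g. open("TAIR10_chr_all.fas") and open("arabidopsis_thaliana.TAIR10.60.gtf").
fasta_demo = [">Chr1 demo record", "ACGT", ">ChrM demo record", "ACGT"]
gtf_demo = [
    'Chr1\tdemo\texon\t1\t4\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
    'Mt\tdemo\texon\t1\t4\t.\t+\t.\tgene_id "g2"; transcript_id "t2";',
]

print(missing_from_fasta(fasta_demo, gtf_demo))
```

Any name this prints is one that cuffmerge would be expected to warn about; an empty list means every name in the annotation has a FASTA record.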
Old 08-10-2012, 07:56 AM   #3
Richard Barker
Member
 
Location: Madison wisconsin

Join Date: Apr 2012
Posts: 47

Thanks for the swift response (again).

Your advice worked perfectly. I searched the directories near where I downloaded the TAIR10 FASTA file and found a TAIR10_GFF3_genes.gff file.

The following command appears to be working, but what's the difference between a GFF and a GTF file?

cuffmerge -g TAIR10_GFF3_genes.gff -s TAIR10_chr_all.fas -p 6 run297_transcript_cuffmerge.txt
Old 08-10-2012, 08:07 AM   #4
kmcarr
Senior Member
 
Location: USA, Midwest

Join Date: May 2008
Posts: 1,170

Quote:
The following command appears to be working, but what's the difference between a GFF and a GTF file?
Cufflinks documentation
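
In short, both formats are nine-column, tab-delimited text, and the difference that matters most in practice is the ninth (attributes) column: GTF writes it as `key "value";` pairs and expects `gene_id` and `transcript_id`, while GFF3 writes `key=value` pairs separated by semicolons and links features through `ID` and `Parent`. A minimal Python sketch of the two styles follows; the parser names and the sample attribute strings are illustrative only, and the GTF parser is a simplification that does not handle semicolons inside quoted values.

```python
from typing import Dict


def parse_gtf_attributes(field: str) -> Dict[str, str]:
    """Parse a GTF attribute column: key "value"; key "value";"""
    attrs = {}
    for part in field.strip().split(";"):
        part = part.strip()
        if not part:
            continue
        key, _, value = part.partition(" ")
        attrs[key] = value.strip().strip('"')
    return attrs


def parse_gff3_attributes(field: str) -> Dict[str, str]:
    """Parse a GFF3 attribute column: key=value;key=value"""
    attrs = {}
    for part in field.strip().split(";"):
        if not part.strip():
            continue
        key, _, value = part.partition("=")
        attrs[key.strip()] = value.strip()
    return attrs


# Illustrative ninth-column values for the same transcript in each format.
gtf_attr = 'gene_id "AT1G01010"; transcript_id "AT1G01010.1";'
gff3_attr = "ID=AT1G01010.1;Parent=AT1G01010"

print(parse_gtf_attributes(gtf_attr))
print(parse_gff3_attributes(gff3_attr))
```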
Old 08-10-2012, 08:22 AM   #5
Richard Barker
Member
 
Location: Madison wisconsin

Join Date: Apr 2012
Posts: 47

Oops, spoke too soon. Now I get the following error message:

richard@ubuntu:~/RNA_seq_analysis/Cuffmerge$ cuffmerge -g TAIR10_GFF3_genes.gff -s TAIR10_chr_all.fas -p 6 run297_transcript_cuffmerge.txt

[Fri Aug 10 07:41:48 2012] Beginning transcriptome assembly merge
-------------------------------------------

[Fri Aug 10 07:41:48 2012] Preparing output location ./merged_asm/
[Fri Aug 10 07:41:52 2012] Converting GTF files to SAM
[07:41:52] Loading reference annotation.
[07:41:53] Loading reference annotation.
[07:41:54] Loading reference annotation.
[07:41:56] Loading reference annotation.
[07:41:57] Loading reference annotation.
[07:41:58] Loading reference annotation.
[07:41:59] Loading reference annotation.
[07:42:00] Loading reference annotation.
[Fri Aug 10 07:42:02 2012] Quantitating transcripts
You are using Cufflinks v2.0.2, which is the most recent release.
Command line:
cufflinks -o ./merged_asm/ -F 0.05 -g TAIR10_GFF3_genes.gff -q --overhang-tolerance 200 --library-type=transfrags -A 0.0 --min-frags-per-transfrag 0 --no-5-extend -p 6 ./merged_asm/tmp/mergeSam_filefWraGs
[bam_header_read] EOF marker is absent.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
File ./merged_asm/tmp/mergeSam_filefWraGs doesn't appear to be a valid BAM file, trying SAM...
[07:42:02] Loading reference annotation.
[07:42:03] Inspecting reads and determining fragment length distribution.
Processed 47416 loci.
> Map Properties:
> Normalized Map Mass: 194074.00
> Raw Map Mass: 194074.00
> Fragment Length Distribution: Truncated Gaussian (default)
> Default Mean: 200
> Default Std Dev: 80
[07:42:05] Assembling transcripts and estimating abundances.
Processed 47416 loci.
[Fri Aug 10 07:53:04 2012] Comparing against reference file TAIR10_GFF3_genes.gff
You are using Cufflinks v2.0.2, which is the most recent release.
Warning: couldn't find fasta record for 'Chr1'!
Warning: couldn't find fasta record for 'Chr2'!
Warning: couldn't find fasta record for 'Chr3'!
Warning: couldn't find fasta record for 'Chr4'!
Warning: couldn't find fasta record for 'Chr5'!
Warning: couldn't find fasta record for 'ChrC'!
Warning: couldn't find fasta record for 'ChrM'!
[Fri Aug 10 07:53:20 2012] Comparing against reference file TAIR10_GFF3_genes.gff
You are using Cufflinks v2.0.2, which is the most recent release.
Warning: couldn't find fasta record for 'Chr1'!
Warning: couldn't find fasta record for 'Chr2'!
Warning: couldn't find fasta record for 'Chr3'!
Warning: couldn't find fasta record for 'Chr4'!
Warning: couldn't find fasta record for 'Chr5'!
Warning: couldn't find fasta record for 'ChrC'!
Warning: couldn't find fasta record for 'ChrM'!
Old 08-10-2012, 08:32 AM   #6
Richard Barker
Member
 
Location: Madison wisconsin

Join Date: Apr 2012
Posts: 47

I've found a TAIR10 GFF file (ftp://ftp.arabidopsis.org/home/tair/...enome_release/), which was also near the location where I downloaded my genome FASTA file (ftp://ftp.arabidopsis.org/home/tair/...omosome_files/), and with that one I was able to complete the merge!
Thanks for your help!
Old 08-15-2012, 02:39 PM   #7
Richard Barker
Member
 
Location: Madison wisconsin

Join Date: Apr 2012
Posts: 47

Shouldn't the cuffmerge output have the gene names (Arabidopsis ATG codes)? What methods are there for adding genome annotation? I thought that was the reason for using the GFF/GTF files during TopHat and/or cuffmerge.
Old 08-02-2017, 09:49 AM   #8
shinigam123
Junior Member
 
Location: Mexico

Join Date: Aug 2017
Posts: 3

I have the same problem. How did you solve it?

Quote:
Originally Posted by Richard Barker
Oops, spoke too soon. Now I get the following error message:

richard@ubuntu:~/RNA_seq_analysis/Cuffmerge$ cuffmerge -g TAIR10_GFF3_genes.gff -s TAIR10_chr_all.fas -p 6 run297_transcript_cuffmerge.txt
[...]
Warning: couldn't find fasta record for 'Chr1'!
Warning: couldn't find fasta record for 'Chr2'!
Warning: couldn't find fasta record for 'Chr3'!
Warning: couldn't find fasta record for 'Chr4'!
Warning: couldn't find fasta record for 'Chr5'!
Warning: couldn't find fasta record for 'ChrC'!
Warning: couldn't find fasta record for 'ChrM'!
Old 08-02-2017, 09:52 AM   #9
Richard Barker
Member
 
Location: Madison wisconsin

Join Date: Apr 2012
Posts: 47

I used the pipeline that was made in the CyVerse Discovery environment. It's easy to use and really fast!
Old 08-02-2017, 10:02 AM   #10
shinigam123
Junior Member
 
Location: Mexico

Join Date: Aug 2017
Posts: 3

Can you tell me what that pipeline is? I don't know it.
Regards
Old 08-02-2017, 10:06 AM   #11
Richard Barker
Member
 
Location: Madison wisconsin

Join Date: Apr 2012
Posts: 47

They have HTProcess and kallisto if you're in a rush.
Old 08-02-2017, 10:22 AM   #12
shinigam123
Junior Member
 
Location: Mexico

Join Date: Aug 2017
Posts: 3

But what was the problem, the input GFF and FASTA files? I need the merged.gtf output without warnings.
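
Going by post #2, the warnings mean that some sequence names in the annotation do not occur in the FASTA file. If no matching annotation can be downloaded, one option is to rewrite column 1 of the annotation so the names agree. Here is a minimal Python sketch of that idea; the function name is made up, and the mapping (Mt to ChrM, Pt to ChrC) is only an assumption based on the names mentioned in post #2, so check your own FASTA headers first and adjust it.

```python
from typing import Dict, Iterable, Iterator


def rename_sequences(lines: Iterable[str],
                     mapping: Dict[str, str]) -> Iterator[str]:
    """Rewrite column 1 of a tab-delimited GTF/GFF using `mapping`.

    Comment lines, blank lines, and names not in the mapping pass
    through unchanged.
    """
    for line in lines:
        if line.startswith("#") or not line.strip():
            yield line
            continue
        name, sep, rest = line.partition("\t")
        yield mapping.get(name, name) + sep + rest


# Assumed mapping: annotation name -> FASTA record name.
name_map = {"Mt": "ChrM", "Pt": "ChrC"}

# Hypothetical demo lines standing in for a real annotation file.
gtf_demo = [
    "# demo annotation\n",
    'Mt\tdemo\texon\t1\t4\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n',
    'Pt\tdemo\texon\t1\t4\t.\t+\t.\tgene_id "g2"; transcript_id "t2";\n',
    'Chr1\tdemo\texon\t1\t4\t.\t+\t.\tgene_id "g3"; transcript_id "t3";\n',
]

renamed = list(rename_sequences(gtf_demo, name_map))
print("".join(renamed), end="")

# With real files (paths are placeholders):
# with open("input.gtf") as src, open("renamed.gtf", "w") as dst:
#     dst.writelines(rename_sequences(src, name_map))
```

Renaming the annotation rather than the FASTA keeps the genome file, and any index already built from it, untouched.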
Old 04-03-2019, 12:11 AM   #13
vivekkeshri
Junior Member
 
Location: China

Join Date: Jan 2019
Posts: 3
Cuffmerge output

I am trying to run Cuffmerge (cuffmerge -p 5 -g Homo.gtf assemblies.txt), but I am unable to get FPKM values in the output file (merged.gtf).
Please let me know how to solve this problem.
Old 07-29-2019, 05:49 AM   #14
vivekkeshri
Junior Member
 
Location: China

Join Date: Jan 2019
Posts: 3

Please let me know how the Cuffdiff -L option [-L/--labels: comma-separated list of condition labels] works. How does it label / merge the BAM files?
Thanks