  • gen2prot
    Member
    • Apr 2010
    • 68

    Cufflinks with the -G option

    Hello All,

    I have a Drosophila GTF file which I created myself, since I wanted to use the latest release from FlyBase and the UCSC-annotated GTF is from 2006. Here is what it looks like:

    Code:
    3R	FlyBase	exon	24574105	24575330	.	-	.	gene_id "FBgn0039596"; gene_name "CG10000"; exon_number "7"; transcript_id "FBtr0085315"; parent_type=mRNA;
    3R	FlyBase	exon	24575574	24575753	.	-	.	gene_id "FBgn0039596"; gene_name "CG10000"; exon_number "6"; transcript_id "FBtr0085315"; parent_type=mRNA;
    3R	FlyBase	exon	24575893	24576062	.	-	.	gene_id "FBgn0039596"; gene_name "CG10000"; exon_number "5"; transcript_id "FBtr0085315"; parent_type=mRNA;
    3R	FlyBase	exon	24576188	24576651	.	-	.	gene_id "FBgn0039596"; gene_name "CG10000"; exon_number "4"; transcript_id "FBtr0085315"; parent_type=mRNA;
    3R	FlyBase	exon	24576707	24576885	.	-	.	gene_id "FBgn0039596"; gene_name "CG10000"; exon_number "3"; transcript_id "FBtr0085315"; parent_type=mRNA;
    3R	FlyBase	exon	24576947	24577107	.	-	.	gene_id "FBgn0039596"; gene_name "CG10000"; exon_number "2"; transcript_id "FBtr0085315"; parent_type=mRNA;
    3R	FlyBase	exon	24577166	24577313	.	-	.	gene_id "FBgn0039596"; gene_name "CG10000"; exon_number "1"; transcript_id "FBtr0085315"; parent_type=mRNA;
    3R	FlyBase	exon	24562831	24563368	.	-	.	gene_id "FBgn0039595"; gene_name "AR-2"; exon_number "4"; transcript_id "FBtr0085316"; parent_type=mRNA;
    3R	FlyBase	exon	24566194	24566352	.	-	.	gene_id "FBgn0039595"; gene_name "AR-2"; exon_number "3"; transcript_id "FBtr0085316"; parent_type=mRNA;
    3R	FlyBase	exon	24566428	24566706	.	-	.	gene_id "FBgn0039595"; gene_name "AR-2"; exon_number "2"; transcript_id "FBtr0085316"; parent_type=mRNA;
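
As a rough way to check the attribute column of lines like those above, here is a minimal Python sketch. It assumes the strict convention that every attribute in column 9 is written as key "value" and terminated by a semicolon, and that values contain no semicolons; some GTF dialects allow unquoted numeric values, so loosen the pattern if that applies. The function name malformed_attributes is made up for this illustration. The sketch only reports fragments that do not fit the pattern; it says nothing about whether Cufflinks tolerates them.

```python
import re

# Assumed convention: each attribute is  key "value"  and ends with ';'.
# Values containing ';' are not handled by this simple split.
ATTR_RE = re.compile(r'^\s*(\S+)\s+"([^"]*)"\s*$')


def malformed_attributes(gtf_line):
    """Return the column-9 fragments that are not of the form key "value"."""
    fields = gtf_line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    bad = []
    for fragment in fields[8].split(";"):
        # Skip the empty piece after the final ';'
        if fragment.strip() and not ATTR_RE.match(fragment):
            bad.append(fragment.strip())
    return bad


# One of the exon lines from the post, with tabs written explicitly.
example = (
    "3R\tFlyBase\texon\t24577166\t24577313\t.\t-\t.\t"
    'gene_id "FBgn0039596"; gene_name "CG10000"; exon_number "1"; '
    'transcript_id "FBtr0085315"; parent_type=mRNA;'
)
print(malformed_attributes(example))  # ['parent_type=mRNA']
```

On the lines shown, the only fragment flagged is parent_type=mRNA, which uses GFF3-style key=value syntax rather than the quoted GTF form.
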
    After running the latest version of TopHat on the one-directional RNA-Seq reads, I get a .bam file. Using this as input, my Cufflinks command was:

    Code:
    cufflinks -p 4 -G ALLEXONS.gtf -o ./outputdir accepted_hits.bam
    After running this through the LSF command 'bsub', I get an error output file, the start of which looks like this:

    Code:
    GFF warning: merging adjacent/overlapping segments of FBtr0084817 on 3R (21094383-21094697, 21094700-21095435)
    [20:04:14] Inspecting reads and determining fragment length distribution.
    > Processing Locus 2L:21918-25151 [ ] 0%
    > Processing Locus 2L:76445-77211 [ ] 0%
    > Processing Locus 2L:102381-106718 [ ] 0%
    I have three output files, the smallest of which is the genes.expr file and the largest is the transcripts.gtf file.
    Questions:

    1) How can I tell whether Cufflinks has correctly accepted the GTF file that I created? I used a Perl script from the internet to check the validity of my GTF file. Its output says that, apart from the missing CDS coordinates, the GTF file is OK. Do I need to change the positions of some fields, or add CDS coordinates to the file?

    2) Why am I getting such a huge error file? I see that in the UCSC GTF file the chromosome names begin with chr. Do I need to change the annotation to chr3L, chr3R, chr2L, chr2R and so on?

    3) When I supply the reference genome to bowtie-build, I pass a single FASTA file containing the chromosome names and their sequences. It looks like this:

    Code:
    >Y
    AGCTAGCT
    >2L
    GCTGCTGCAGTC
    >2R
    CGATGATGA
    Do I need to break this up into separate files, one per chromosome, and build separate indexes? It sounds a bit illogical, but I thought I'd ask.

    4) I get FPKM values of zero for some genes in the genes.expr file. Should I interpret that as expression too low to call, or is the program not running properly?
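
For questions 2 and 3, one way to see whether the chromosome names in the GTF line up with those in the FASTA used for bowtie-build is a small comparison like the Python sketch below. It rests on two assumptions: the sequence name is taken as the first whitespace-delimited token after '>' in each FASTA header, and the GTF chromosome name is column 1. The function names are invented for this illustration.

```python
def fasta_names(lines):
    """Collect sequence names: first whitespace-delimited token after '>'."""
    names = set()
    for line in lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                names.add(tokens[0])
    return names


def gtf_seqnames(lines):
    """Collect column 1 of every non-empty, non-comment GTF line."""
    names = set()
    for line in lines:
        if line.strip() and not line.startswith("#"):
            names.add(line.split("\t", 1)[0])
    return names


def unmatched_seqnames(gtf_lines, fasta_lines):
    """Chromosome names used in the GTF that have no FASTA record."""
    return gtf_seqnames(gtf_lines) - fasta_names(fasta_lines)


# Toy data only, not taken from the real files.
toy_fasta = [">Y", "AGCTAGCT", ">2L", "GCTGCTGCAGTC", ">2R", "CGATGATGA"]
toy_gtf = [
    '2L\tFlyBase\texon\t100\t200\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
    'chr2R\tFlyBase\texon\t300\t400\t.\t-\t.\tgene_id "g2"; transcript_id "t2";',
]
print(sorted(unmatched_seqnames(toy_gtf, toy_fasta)))  # ['chr2R']
```

With real data the same functions accept open file handles, for example unmatched_seqnames(open("ALLEXONS.gtf"), open("genome.fa")), where genome.fa is a placeholder name for the FASTA given to bowtie-build. An empty result means every chromosome named in the GTF also appears in the FASTA.
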

    Thank you for taking the time to read such a long post. I am relatively new to Cufflinks and to GTF, so any input and suggestions would be much appreciated.

    Regards
    Abhijit
