
**danwiththeplan** · 01-24-2016, 08:07 PM

XLOC numbers are assigned to all genes, not just ones that aren't in GTF files you supply. So some genes will end up with a gene_name, but all genes end up with a gene_id (=XLOC number). I think the issue is that the downstream programs are looking for the gene_id field and not the gene_name field.

Incidentally, as I am interpreting it, using the -G switch (as opposed to the -g switch) means that you'll only ever detect/quantitate/analyse genes that are in your GTF file (while using -g means that cufflinks will create a new gene from scratch if there is sufficient read support, even if it's not in the GTF file you supply).

So, with the command you used, all output genes should have both a gene_id (the XLOC number) and a gene_name (from the GTF).
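To make the gene_id / gene_name distinction concrete, here is a minimal sketch of how the XLOC numbers could be mapped back to gene names by reading the attribute column of a merged.gtf. It assumes each record carries a `gene_id` attribute and, where the gene is known, a `gene_name` attribute; the function name `xloc_to_gene_names` is made up for illustration, and the sample line is the merged.gtf record quoted in this thread.

```python
import re
from collections import defaultdict

# Matches one GTF attribute of the form: key "value";
ATTR_RE = re.compile(r'(\S+) "([^"]*)";')


def xloc_to_gene_names(gtf_lines):
    """Map each gene_id (e.g. an XLOC number) to the gene_name values seen with it.

    Hypothetical helper, not part of Cufflinks.
    """
    names = defaultdict(set)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        # The first 8 GTF columns contain no spaces, so splitting on any
        # whitespace works for tab-separated and space-separated copies alike.
        fields = line.rstrip("\n").split(None, 8)
        if len(fields) < 9:
            continue
        attrs = dict(ATTR_RE.findall(fields[8]))
        gene_id = attrs.get("gene_id")
        if gene_id is None:
            continue
        entry = names[gene_id]  # creates an empty set for genes without a name
        if "gene_name" in attrs:
            entry.add(attrs["gene_name"])
    return {gene_id: sorted(found) for gene_id, found in names.items()}


EXAMPLE = (
    '1\tCufflinks\texon\t3631\t3913\t.\t+\t.\t'
    'gene_id "XLOC_000001"; transcript_id "TCONS_00000002"; exon_number "1"; '
    'gene_name "NAC001"; oId "transcript:AT1G01010.1"; '
    'nearest_ref "transcript:AT1G01010.1"; class_code "="; tss_id "TSS1"; p_id "P1";'
)

if __name__ == "__main__":
    print(xloc_to_gene_names([EXAMPLE]))
```

In practice the lines would come from something like `open("merged.gtf")`. A gene with no gene_name in any of its records ends up with an empty list, which is one way to spot loci that have no name in the supplied annotation.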

**gtduarte** · 01-25-2016, 12:32 PM

Hello dan, thanks for replying. I tried what you suggested, running cufflinks with the -g option instead of -G, but unfortunately it didn't work:

$ cufflinks -o cuff_g -g ~/path_to/A_thaliana.TAIR10.30.gtf -b ~/path_to/A_thaliana.TAIR10.30.fa myfile_sorted.bam

Indeed the resulting transcripts.gtf was a bit different from the previous one, for instance:

-> with -G switch:

1 Cufflinks transcript 11649 13714 1000 - . gene_id "gene:AT1G01030"; transcript_id "transcript:AT1G01030.1"; FPKM "0.4354775887"; frac "1.000000"; conf_lo "0.217739"; conf_hi "0.653216"; cov "1.257353";

-> with -g switch:

1 Cufflinks transcript 11649 13714 1000 - . gene_id "AT1G01030"; transcript_id "AT1G01030.1"; FPKM "0.4330886861"; frac "1.000000"; conf_lo "0.216544"; conf_hi "0.649633"; cov "1.250649"; full_read_support "yes";

However, those bam_errors still appear when I run cuffmerge, as before, and my gene ids are still XLOC values.

Example of the merged.gtf:

1 Cufflinks exon 3631 3913 . + . gene_id "XLOC_000001"; transcript_id "TCONS_00000002"; exon_number "1"; gene_name "NAC001"; oId "transcript:AT1G01010.1"; nearest_ref "transcript:AT1G01010.1"; class_code "="; tss_id "TSS1"; p_id "P1";

Nevertheless, I ran cuffdiff, and the XLOCs were there too:

From gene_exp.diff:

test_id gene_id gene locus sample_1 sample_2 status value_1 value_2 log2(fold_change) test_stat p_value q_value significant
XLOC_000001 XLOC_000001 NAC001 1:3630-5899 wtmock1 wtaba1 OK 2.82389 5.42847 0.94286 0.938326 0.3329 0.999039 no
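As the row above shows, the gene name is still present in the `gene` column even though `test_id` and `gene_id` hold the XLOC number. A minimal sketch of reading the table with the name as the label, assuming the file is tab-separated with the header shown above and that unnamed genes carry `-` in the `gene` column (an assumption, not confirmed in this thread); `label_rows` is a hypothetical helper name:

```python
import csv
import io


def label_rows(handle):
    """Yield (label, log2 fold change, q-value) per row of a gene_exp.diff table.

    The label is the gene name when present, otherwise the gene_id (XLOC number).
    """
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        name = row["gene"]
        # Assumption: "-" (or an empty field) marks a gene without a name.
        label = name if name and name != "-" else row["gene_id"]
        yield label, row["log2(fold_change)"], row["q_value"]


# Header and row copied from the gene_exp.diff excerpt in this post.
SAMPLE = (
    "test_id\tgene_id\tgene\tlocus\tsample_1\tsample_2\tstatus\tvalue_1\tvalue_2\t"
    "log2(fold_change)\ttest_stat\tp_value\tq_value\tsignificant\n"
    "XLOC_000001\tXLOC_000001\tNAC001\t1:3630-5899\twtmock1\twtaba1\tOK\t2.82389\t"
    "5.42847\t0.94286\t0.938326\t0.3329\t0.999039\tno\n"
)

if __name__ == "__main__":
    for label, lfc, q in label_rows(io.StringIO(SAMPLE)):
        print(label, lfc, q)
```

Values are kept as strings here so nothing is lost to float parsing; convert them only where a numeric comparison is needed.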

Just in case, I checked the header of the TopHat accepted_hits.bam, but it seems fine:

$ samtools view -H wtaba1_sorted.bam

@HD VN:1.0 SO:coordinate
@SQ SN:1 LN:30427671
@SQ SN:2 LN:19698289
@SQ SN:3 LN:23459830
@SQ SN:4 LN:18585056
@SQ SN:5 LN:26975502
@SQ SN:Mt LN:366924
@SQ SN:Pt LN:154478
@PG ID:TopHat VN:2.1.0 CL:/usr/bin/tophat -N 3 --read-edit-dist 4 --read-realign-edit-dist 0 -a 6 --microexon-search -r 150 --mate-std-dev 200 -i 8 -I 10000 --min-segment-intron 8 --max-segment-intron 10000 --b2-very-sensitive /path_to/bowtie_index/A_thaliana.TAIR10.30 myfile1_1_paired.fastq.trim myfile1_2_paired.fastq.trim

Do you have any other ideas?

Many thanks again!


Gene id changed to XLOC... cuffmerge issue?
