  • cuffmerge not merging?

    Hi All,
    I have run:
    tophat (w transcriptome index) -> cufflinks (-g & -M) -> cuffmerge -> cuffquant (-M) -> cuffnorm
    on a set of >300 samples. In the resulting counts table I have duplicated entries for many genes (thousands).

    For example:
    $ grep MORN1 ./cuffnorm_out/genes.attr_table
    XLOC_000065 - - XLOC_000065 MORN1 TSS126 1:2252451-2323157 -
    XLOC_000067 - - XLOC_000067 MORN1 TSS129 1:2252451-2323157 -
    XLOC_000068 - - XLOC_000068 MORN1 TSS130 1:2252451-2323157 -
    XLOC_000069 - - XLOC_000069 MORN1 TSS131 1:2252451-2323157 -
    XLOC_002153 - - XLOC_002153 MORN1,RP4-740C4.6,RP4-740C4.9 TSS5397,TSS5398,TSS5399,TSS5400 1:2252451-2323157 -
    $ grep RP11-206L10.9 ./cuffnorm_out/genes.attr_table
    XLOC_000013 - - XLOC_000013 RP11-206L10.9 TSS15 1:645707-762902 -
    XLOC_000014 - - XLOC_000014 RP11-206L10.9 TSS16,TSS17 1:645707-762902 -
    These genes have multiple transcripts with different transcription start sites, but several overlapping exons.

    These duplicated genes are also present in the merged GTF produced by the cuffmerge step. Shouldn't these have been merged, and does anyone have an idea why they were not?
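
To gauge how widespread the duplication is without grepping gene by gene, a minimal Python sketch along these lines could tally every gene name that appears under more than one XLOC id. It assumes genes.attr_table is tab-separated, has a header row starting with tracking_id, and carries the gene name(s) in the fifth column as a comma-separated list, which matches the rows shown above. The function name duplicated_gene_names is hypothetical, not part of cufflinks.

```python
import csv
from collections import defaultdict


def duplicated_gene_names(lines):
    """Return {gene_name: [XLOC ids]} for names found under more than one XLOC id.

    `lines` is any iterable of tab-separated rows, e.g. an open file handle.
    Assumed layout: column 1 = tracking id, column 5 = gene short name(s).
    """
    ids_by_name = defaultdict(set)
    for row in csv.reader(lines, delimiter="\t"):
        # Skip short rows and the (assumed) header row.
        if len(row) < 5 or row[0] == "tracking_id":
            continue
        tracking_id, names = row[0], row[4]
        # One locus may list several comma-separated gene names.
        for name in names.split(","):
            if name and name != "-":
                ids_by_name[name].add(tracking_id)
    return {
        name: sorted(ids)
        for name, ids in ids_by_name.items()
        if len(ids) > 1
    }


# Two of the rows from the grep output above, used as sample input.
sample = [
    "XLOC_000013\t-\t-\tXLOC_000013\tRP11-206L10.9\tTSS15\t1:645707-762902\t-",
    "XLOC_000014\t-\t-\tXLOC_000014\tRP11-206L10.9\tTSS16,TSS17\t1:645707-762902\t-",
]
for name, ids in duplicated_gene_names(sample).items():
    print(name, len(ids), ",".join(ids), sep="\t")
```

For the full table, the same function can be given an open handle on ./cuffnorm_out/genes.attr_table in place of the sample list.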

    I've used tophat v. 2.0.12 (bowtie2 v.2.2.3), cufflinks v.2.2.1 with hg19 and a gencode GTF (v.19). No warnings or errors regarding duplicated IDs along the way.

    Any ideas or comments very welcome.

    Thanks,
    Bo
