SEQanswers

Old 08-02-2013, 07:35 AM   #1
alan_sm
Junior Member
 
Location: USA

Join Date: Sep 2012
Posts: 3
Post Read counts from SAM file mapped to de novo assembled transcripts using HTSeq-count

Hello,

Sorry for cross-posting this from another site, but I'm under pressure to sort this out.

I tried using htseq-count to extract read counts per transcript from a SAM file of reads mapped to de novo assembled transcripts (for DE analysis). The SAM file was generated with Bowtie2, and only uniquely aligned reads were kept. I made a GTF file from the assembled-transcript FASTA file with a Perl script. Here are a few lines of my GTF file:

Locus_47_Transcript_16/31_Confidence_0.158_Length_1485 AssembledTranscriptome exon 1 1485 . + . gene_id "AssemTrans1"; transcript_id "Locus_47_Transcript_16/31_Confidence_0.158_Length_1485";

Locus_58_Transcript_85/85_Confidence_0.017_Length_650 AssembledTranscriptome exon 1 650 . + . gene_id "AssemTrans1"; transcript_id "Locus_58_Transcript_85/85_Confidence_0.017_Length_650";

The transcript start is always 1, the end is the length of the transcript, and the strand is + for all entries.
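
For readers without the Perl script, here is a minimal Python sketch of the same idea: one exon line per assembled transcript, spanning 1 to the sequence length on the + strand. It is not the script used above. The function names are made up for illustration, and using the transcript ID as the gene_id is only a placeholder assumption; swap in whatever gene grouping fits your assembly.

```python
import io

def fasta_lengths(handle):
    """Yield (sequence_id, length) for each record in an open FASTA file."""
    name, length = None, 0
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, length
            # Take the first whitespace-delimited token of the header as the ID.
            name, length = line[1:].split()[0], 0
        else:
            length += len(line)
    if name is not None:
        yield name, length

def gtf_line(seq_id, length, gene_id, source="AssembledTranscriptome"):
    """Build one tab-separated GTF exon line covering the whole transcript."""
    attrs = f'gene_id "{gene_id}"; transcript_id "{seq_id}";'
    return "\t".join([seq_id, source, "exon", "1", str(length), ".", "+", ".", attrs])

def fasta_to_gtf(handle):
    """Return a list of GTF lines, one per FASTA record.

    Assumption: gene_id is set to the transcript ID as a placeholder.
    """
    return [gtf_line(seq_id, length, gene_id=seq_id)
            for seq_id, length in fasta_lengths(handle)]

# Tiny in-memory example; for a real file use open("transcripts.fa") instead.
demo = io.StringIO(">t1 some description\nACGT\nAC\n>t2\nGGG\n")
for row in fasta_to_gtf(demo):
    print(row)
```

The ID written to the first GTF column has to match the reference name in the SAM file exactly, so check how your aligner truncates FASTA headers before relying on the first-token rule used here.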

It looks like it works great, but I'm not sure if this is the right way to do it. I don't know if I have to worry about what Simon Anders has mentioned: "If you must align against the transcriptome, make sure that you count for genes, not transcripts, and remove reads mapping to transcripts from more than one gene."

Any thoughts/comments/suggestions are much appreciated.

Thanks,
Alan
Old 08-02-2013, 08:57 AM   #2
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,475

Simon's comment is very much something to keep in mind, though how important it is, and how feasible the normal pipeline is, will depend on the particulars of your experiment. As a biologist, it's easiest for me to understand changes at the gene level. Changes at the transcript level are undoubtedly important, but I honestly don't think that we have a good handle on how to interpret most of them. Perhaps things are different in whatever organism you work with, but without further details I don't think anyone could really say.
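
To make the "count for genes, not transcripts" advice concrete, here is a rough Python sketch. It assumes, and this is only an assumption about the assembler's naming, that the "Locus_N" prefix of each transcript ID can stand in for a gene. It takes (read ID, transcript ID) pairs, drops reads whose alignments hit more than one locus, and counts the rest per locus. The function names are hypothetical, and this is not what htseq-count does internally.

```python
from collections import Counter, defaultdict

def locus_of(transcript_id):
    """Assumed mapping: 'Locus_47_Transcript_16/31_...' -> 'Locus_47'."""
    return "_".join(transcript_id.split("_")[:2])

def gene_level_counts(alignments):
    """Count reads per locus, discarding reads that hit more than one locus.

    alignments: iterable of (read_id, transcript_id) pairs, one per alignment.
    """
    loci_per_read = defaultdict(set)
    for read_id, transcript_id in alignments:
        loci_per_read[read_id].add(locus_of(transcript_id))

    counts = Counter()
    for loci in loci_per_read.values():
        if len(loci) == 1:  # keep only reads assigned to a single locus
            counts[next(iter(loci))] += 1
    return counts

alignments = [
    ("read1", "Locus_47_Transcript_16/31_Confidence_0.158_Length_1485"),
    ("read1", "Locus_47_Transcript_2/31_Confidence_0.500_Length_900"),
    ("read2", "Locus_47_Transcript_16/31_Confidence_0.158_Length_1485"),
    ("read2", "Locus_58_Transcript_85/85_Confidence_0.017_Length_650"),
    ("read3", "Locus_58_Transcript_85/85_Confidence_0.017_Length_650"),
]
print(gene_level_counts(alignments))
```

In this toy input, read1 hits two transcripts of the same locus and is kept, while read2 spans two loci and is dropped. If only uniquely aligned reads are kept, as in the original post, every read has a single alignment and the filter never triggers; the grouping by locus is then the only step that matters.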
Old 06-12-2015, 08:54 PM   #3
kurban910
Member
 
Location: urumqi

Join Date: Jul 2014
Posts: 58

Hello @alan_sm,
You said you made a GTF file for the assembled transcripts FASTA file with a Perl script. Could you please share the script?
I once really needed to make a GTF file myself but could not, because I am not familiar with Perl scripting. If you are willing, here is my email: [email protected]
Thanks
Tags
de novo assembly, htseq-count, readcount, sam
