SEQanswers

Old 02-29-2016, 11:50 AM   #1
ScientistShan
Junior Member
 
Location: Los Angeles

Join Date: Feb 2016
Posts: 3
Single-cell RNA-seq coverage track not corresponding to htseq-count output or IGV

I have some single-cell RNA-seq data that I've processed via a `tophat2` --> `htseq-count` pipeline, and I've also generated some bedGraphs via `bedtools genomecov`. I have run into two issues that I don't understand.

**Issue 1**

I've noticed that the `htseq-count` output and the bedGraphs for this data do not always agree. For example, we see peaks in IGV for cells that have 0 counts according to `htseq-count`. Cells H11 and G11 illustrate this: according to `htseq-count`, H11 has 0 hits on Xist while G11 has 320, yet the IGV tracks show peaks for both cells despite the 0 in H11 (please see the attached screenshot below; tracks are auto-scaled). My reasoning is that `htseq-count` does not count multi-mapped reads, and these may be the reads showing up in IGV. Also, the bedGraphs are not normalized or scaled. Does anyone have other insight?

In my pipeline I've treated all cells as non-strand-specific (`tophat2` argument `--library-type fr-unstranded` and `htseq-count` argument `--stranded=no`), which I believe is correct given the library prep conditions. The pipeline is outlined below. I should mention that no errors or warnings were thrown throughout the pipeline. Please let me know if anything strikes you as incorrect.

Pipeline

IGV screenshot 1
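
One way to test the multi-mapping explanation is to tally the alignments that overlap the locus by their `NH` tag and mapping quality. The following is a minimal Python sketch, not part of the original pipeline. It assumes the TopHat2 alignments have been exported as SAM text (for example with `samtools view`), that multi-mapped reads carry an `NH:i:` value greater than 1, and that `htseq-count` is run with its default minimum alignment quality of 10. The function name `tally_region`, the demo reads, and the coordinates are all made up for illustration.

```python
import re

def tally_region(sam_lines, chrom, start, end, min_mapq=10):
    """Tally alignments overlapping chrom:start-end (1-based, inclusive).

    Reads are split into three bins that roughly mirror what a counter
    such as htseq-count would keep or skip (an assumption, not its code):
      unique       - NH == 1 and MAPQ >= min_mapq
      multi_mapped - NH > 1
      low_mapq     - NH == 1 but MAPQ < min_mapq
    """
    tally = {"unique": 0, "multi_mapped": 0, "low_mapq": 0}
    for line in sam_lines:
        if line.startswith("@"):          # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        flag, rname = int(fields[1]), fields[2]
        pos, mapq, cigar = int(fields[3]), int(fields[4]), fields[5]
        if flag & 4 or rname != chrom:    # unmapped, or on another chromosome
            continue
        # Reference span: M, D, N, = and X consume reference bases.
        # Note that a spliced read (N) spans its intron here.
        ref_len = sum(int(n) for n, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)
                      if op in "MDN=X")
        aln_end = pos + ref_len - 1
        if aln_end < start or pos > end:  # no overlap with the region
            continue
        nh = 1                            # assume unique if the NH tag is absent
        for tag in fields[11:]:
            if tag.startswith("NH:i:"):
                nh = int(tag[5:])
        if nh > 1:
            tally["multi_mapped"] += 1
        elif mapq < min_mapq:
            tally["low_mapq"] += 1
        else:
            tally["unique"] += 1
    return tally

# Hypothetical reads and coordinates, for illustration only.
DEMO_SAM = [
    "@HD\tVN:1.0\tSO:coordinate",
    "r1\t0\tchrX\t1100\t50\t50M\t*\t0\t0\t*\t*\tNH:i:1",
    "r2\t0\tchrX\t1200\t3\t50M\t*\t0\t0\t*\t*\tNH:i:2",
    "r3\t16\tchrX\t1300\t0\t50M\t*\t0\t0\t*\t*\tNH:i:6",
    "r4\t0\tchrX\t5000\t50\t50M\t*\t0\t0\t*\t*\tNH:i:1",
]
print(tally_region(DEMO_SAM, "chrX", 1000, 2000))
```

If nearly all reads over Xist in H11 fall into the `multi_mapped` or `low_mapq` bins, that would be consistent with a visible coverage peak alongside a count of 0. Because the tracks are auto-scaled, it is also worth comparing the absolute read totals between cells: a handful of reads can look like a peak when the track maximum is small.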

**Issue 2**

In my RNA-seq data I sometimes see reads that do not correspond to any annotated gene or region; let's call these "new genes".

Would I expect to see such regions in other published data, or did those authors deposit coordinates only for annotated regions? Looking in IGV, I see no such regions in other published data, so I'm wondering whether that is due to the way the data were deposited.

Another thought is that these "unannotated" regions are in fact annotated, but only in UCSC or other gene tracks.

Please refer to the screenshot posted below.

IGV screenshot 2
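
The idea that a region is "unannotated" in one gene set but annotated in another can be checked directly by testing the region against each annotation file. Below is a minimal Python sketch under stated assumptions: both annotations are available as GTF text, features are matched on the `exon` feature type, and the helper name `overlapping_features`, the gene IDs, and the coordinates are hypothetical. It is an illustration of the check, not the method used in this thread.

```python
import re

def overlapping_features(gtf_lines, chrom, start, end, feature_type="exon"):
    """Return gene_id values of GTF features overlapping chrom:start-end.

    Coordinates are 1-based and inclusive, as in the GTF format.
    """
    hits = []
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):   # blank or comment line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:                            # not a full GTF record
            continue
        seqname, ftype = fields[0], fields[2]
        fstart, fend = int(fields[3]), int(fields[4])
        if seqname != chrom or ftype != feature_type:
            continue
        if fstart <= end and fend >= start:            # intervals overlap
            match = re.search(r'gene_id "([^"]+)"', fields[8])
            hits.append(match.group(1) if match else fields[8])
    return hits

# Two tiny made-up annotation sets standing in for, e.g., the GTF used
# for counting and a second gene track.
ANNOTATION_A = [
    'chr1\tsrcA\texon\t100\t200\t.\t+\t.\tgene_id "geneA"; transcript_id "txA";',
]
ANNOTATION_B = [
    'chr1\tsrcB\texon\t100\t200\t.\t+\t.\tgene_id "geneA"; transcript_id "txA";',
    'chr1\tsrcB\texon\t5000\t5600\t.\t-\t.\tgene_id "geneB"; transcript_id "txB";',
]

region = ("chr1", 5100, 5300)   # hypothetical "new gene" region
print(overlapping_features(ANNOTATION_A, *region))
print(overlapping_features(ANNOTATION_B, *region))
```

One caveat when comparing annotation sources: chromosome names often differ between them (for example `chr1` in UCSC-style files versus `1` in Ensembl-style files), so an empty result can simply mean the names did not match.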

Last edited by ScientistShan; 02-29-2016 at 11:52 AM. Reason: Fixed URL links.
Old 03-01-2016, 07:03 AM   #2
ScientistShan
Junior Member
 
Location: Los Angeles

Join Date: Feb 2016
Posts: 3

Bump for help!
Old 03-01-2016, 07:07 AM   #3
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,490

For reference: this was cross-posted at https://www.biostars.org/p/179193/ and did get an answer there.

Are you not satisfied with that answer?

Tags
htseq-count, igv, rna-seq
