SEQanswers

05-18-2011, 04:14 AM   #1
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,317
How to create GC% track for IGV?

Anyone know a program or method to create a GC% track for IGV?

I want to convert a bacterial genome sequence into a %GC track to compare depth of coverage with %GC. If there were some code to produce a (for example) wiggle format file from a genome sequence, I could easily display it. Seems like such a program, or at least a module, will already have been written, but I am not finding it.

More details:

I have a couple of Deinococcus radiodurans (~70% GC bacterial genome) TruSeq library data sets sequenced 100x2 PE and aligned to the reference sequence. The average coverage is around 100X, but it is variable and in places very low. I would like to see if the areas of low coverage correlate with the higher GC% areas.

Here is an example window:


Thanks,
--
Phillip
05-18-2011, 05:10 AM   #2
Dethecor
Member
 
Location: Germany

Join Date: May 2010
Posts: 24
Window size or binary?

Hi Phillip,

You should be more specific about what you want to see. A per-position %GC track would only tell you, for each position, whether the base is a G or C or not.
You probably want the %GC computed in a window around each position, or in a set of bins into which you partition your genome.

Either can be achieved, for example, by using HTSeq to read in your genome, computing the binned or windowed GC percentage, and then writing that to a wiggle file (which basically means writing all the values into a plain text file, one value per line, preceded by header lines giving the track name, the sequence name, the start position and the step).
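As a rough sketch of the binned variant, here is one way this could look in plain Python. It deliberately does not use HTSeq, so none of the names below come from that library; `read_fasta`, `gc_percent_bins`, `wiggle_lines` and `write_gc_wiggle` are hypothetical helpers, and the default bin size of 100 is just an assumption. The output is a fixedStep wiggle file with one declaration line per sequence.

```python
def read_fasta(path):
    """Yield (name, sequence) pairs from a FASTA file."""
    name, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    yield name, "".join(chunks)
                # Keep only the first word of the header as the sequence name.
                name, chunks = (line[1:].split() or ["unnamed"])[0], []
            elif line:
                chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def gc_percent_bins(seq, bin_size):
    """Return the %GC of each consecutive, non-overlapping bin of `seq`.

    A trailing partial bin is dropped so that no bin runs past the
    end of the sequence.
    """
    seq = seq.upper()
    values = []
    for start in range(0, len(seq) - bin_size + 1, bin_size):
        window = seq[start:start + bin_size]
        gc = window.count("G") + window.count("C")
        values.append(100.0 * gc / bin_size)
    return values


def wiggle_lines(chrom, values, bin_size):
    """Format binned values as a fixedStep wiggle block (1-based start)."""
    lines = ["fixedStep chrom=%s start=1 step=%d span=%d"
             % (chrom, bin_size, bin_size)]
    lines.extend("%.2f" % v for v in values)
    return lines


def write_gc_wiggle(fasta_path, wig_path, bin_size=100):
    """Write one %GC wiggle track covering every sequence in the FASTA file."""
    with open(wig_path, "w") as out:
        out.write('track type=wiggle_0 name="GC percent"\n')
        for chrom, seq in read_fasta(fasta_path):
            values = gc_percent_bins(seq, bin_size)
            if values:
                out.write("\n".join(wiggle_lines(chrom, values, bin_size)) + "\n")
```

Calling something like `write_gc_wiggle("genome.fasta", "gc.wig", 100)` (both file names are placeholders) should give a file that can be loaded as a track, provided the sequence names in the FASTA headers match the names of the reference loaded in IGV.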

By the way, you might also want to check mappability, to see whether those regions have low coverage because reads originating there cannot be placed uniquely (e.g. repetitive regions).

Cheers,
Paul
__________________

"You are only young once, but you can stay immature indefinitely."
05-18-2011, 06:40 AM   #3
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,317

Hi Paul,
Yes, I did mean a windowed %GC.

I will look at HTSeq. I am also interested in mappability. Will HTSeq produce a wiggle plot of that as well? This would be a plot of the number of repetitions of a given window of sequence (read length sized?) in the genome. It would have to allow a certain number of mismatches to usefully estimate the mappability of a given area.
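To make that idea concrete, here is a minimal sketch of the exact-match version of such a repeat count, in plain Python. It is only an illustration: `kmer_repeat_counts` is a hypothetical helper, it looks at the forward strand only, and it does not allow mismatches, so it would overestimate mappability compared with what is described above. It also keeps every distinct window in memory, which is workable for a bacterial genome but not much beyond that.

```python
from collections import Counter


def kmer_repeat_counts(genome, k):
    """For each start position, count how often its k-mer occurs in `genome`.

    Exact matches on the forward strand only; no mismatches allowed.
    """
    genome = genome.upper()
    n_windows = len(genome) - k + 1
    # First pass: tally every window of length k.
    counts = Counter(genome[i:i + k] for i in range(n_windows))
    # Second pass: look up the tally for the window starting at each position.
    return [counts[genome[i:i + k]] for i in range(n_windows)]
```

A value of 1 at a position means the window starting there is unique in the sequence; anything larger marks a repeated window. The resulting list could be binned and written out as a wiggle track in the same way as a %GC track.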

Thanks,
Phillip
05-18-2011, 06:50 AM   #4
Dethecor
Member
 
Location: Germany

Join Date: May 2010
Posts: 24
Mappability

Hi Phillip,

This can be done with HTSeq, but it requires some expertise in Python programming (or some time to learn some Python).

For mappability, I like to generate reads of the same length as my library from the reference and then align them with the same tool used for the reads from my sample. That way the estimate of mappability best reflects what happened to the reads from the sample. (I usually don't give them qualities, but if your aligner works with a quality-based cut-off instead of a mismatch count and you have already determined the quality distribution of your reads, then simulating that could be helpful as well.)
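A minimal sketch of the read-generation step, in plain Python, might look like the following. The helper names `simulated_reads` and `fasta_records` are hypothetical, and so is the `name_position` read ID scheme. The sketch assumes single-end reads without qualities, one read per start position, and a linear sequence (it does not wrap around a circular chromosome).

```python
def simulated_reads(name, seq, read_length):
    """Yield (read_id, read_sequence) for every start position in `seq`.

    The 1-based start position is encoded in the read ID so that it can
    be recovered after alignment.
    """
    for start in range(len(seq) - read_length + 1):
        yield "%s_%d" % (name, start + 1), seq[start:start + read_length]


def fasta_records(reads):
    """Format (read_id, read_sequence) pairs as FASTA text records."""
    for read_id, read_seq in reads:
        yield ">%s\n%s\n" % (read_id, read_seq)
```

The records from `fasta_records(simulated_reads(name, seq, 100))` can be written to a file and passed to the aligner with the same settings as the real reads, assuming the aligner accepts FASTA input.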

Then you can check which reads could be mapped (uniquely, if you want to apply such a restriction) and directly get the mappability of each position from whether the read originating there was mapped. Following that with some binning or a sliding-window approach gives a nice estimate of the mappability.
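The binning step could be sketched roughly as below, in plain Python. It assumes you have already pulled out, from the aligner output, the 1-based start positions of the simulated reads that were mapped (for example because the position was encoded in each read ID). How to extract them depends on the aligner and is not shown. `binned_mappability` is a hypothetical helper name.

```python
def binned_mappability(genome_length, read_length, mapped_starts, bin_size):
    """Return, per bin, the fraction of simulated reads that were mapped.

    `mapped_starts` holds the 1-based start positions of the simulated
    reads that the aligner placed (uniquely, if that is the criterion).
    """
    mapped = set(mapped_starts)
    # Last position at which a full-length read can start.
    last_start = genome_length - read_length + 1
    fractions = []
    for bin_start in range(1, last_start + 1, bin_size):
        # The final bin may be shorter than bin_size.
        bin_end = min(bin_start + bin_size - 1, last_start)
        hits = sum(1 for pos in range(bin_start, bin_end + 1) if pos in mapped)
        fractions.append(hits / float(bin_end - bin_start + 1))
    return fractions
```

Each value is between 0 and 1, so the list can be written out one value per line as a wiggle track and displayed next to the coverage.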

Cheers,
Paul