SEQanswers

Old 09-25-2018, 01:15 PM   #1
genferreri
Junior Member
 
Location: Athens, GA

Join Date: Apr 2016
Posts: 2
Normalization from different NGS runs

Hello community, I would certainly appreciate some help here. Many thanks in advance.
I have been reading about this subject, but everything I find concerns differential expression in RNA-Seq, which is a little different from what I am looking for. I am working with a virus, and I would like to assess the depth of coverage obtained by sequencing its genome from two different sources (the genomes are the same; the sources are different). Even though the two sources are quite different, the first step after RNA extraction is a OneStep RT-PCR. I mention this because, as you may have guessed already, I usually get pretty good coverage. The question is whether those coverages are comparable.
I have 30 samples that were sequenced in different runs on a MiSeq platform. Let's say that half of the samples come from one source and the other half from the other. Since I want to compare them, I would like to normalize the dataset. What I have in mind is to normalize by log2 instead of by the total number of reads, and then calculate the depth of coverage from the normalized read counts. Does that make sense? Should I go for DESeq or some other package? I think that in the end I will compare the results from different approaches, but I would like some comments and suggestions.
Thanks once more.
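To make the question concrete, here is a minimal base-R sketch of the two ideas mentioned above: scaling depth of coverage by the total number of reads, and taking log2. All sample names, read counts, read lengths, and the genome length are hypothetical values chosen for illustration; none of them come from the actual experiment. It assumes mean depth is computed as mapped bases divided by genome length.

```r
# Hypothetical per-sample summaries (illustrative values only)
samples <- data.frame(
  sample       = c("A1", "A2", "B1", "B2"),
  source       = c("A", "A", "B", "B"),
  total_reads  = c(250000, 500000, 200000, 400000),  # all reads in the library
  mapped_reads = c(200000, 400000, 100000, 300000),  # reads mapped to the virus
  read_length  = c(250, 250, 250, 250)               # bp
)
genome_length <- 10000  # bp; assumed genome size, not from the thread

# Mean depth of coverage: mapped bases divided by genome length
samples$mean_depth <- samples$mapped_reads * samples$read_length / genome_length

# Scale each sample to a common library size of one million reads
samples$depth_per_million <- samples$mean_depth / (samples$total_reads / 1e6)

# log2 applied after scaling; on its own it does not remove library-size differences
samples$log2_depth_per_million <- log2(samples$depth_per_million)

print(samples)
```

In this toy data A1 and A2 differ twofold in raw depth only because A2 was sequenced twice as deeply, and the per-million scaling makes them equal. Taking log2 of the raw values would leave that twofold difference in place as a constant offset of 1, which is why log2 is usually treated as a transformation applied after library-size scaling rather than a replacement for it.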
Old 09-25-2018, 10:32 PM   #2
amhaan
Junior Member
 
Location: Minnesota

Join Date: Sep 2018
Posts: 1

Hello genferreri, I am combining RNA-seq data from three experiments, generated on a HiSeq and another Illumina platform, to analyze gene expression patterns across different tissue types. I generated read counts with HTSeq by assigning previously mapped reads (in SAM or BAM format) to features in a GTF file (in my case the features were genes, but they don't have to be) and normalized them using both edgeR and DESeq2. If you have a GTF (or GFF) file with features that you can assign your reads to and generate counts, I don't see why you couldn't use edgeR or DESeq2 to normalize.

I used the readDGE function in edgeR to collate the count files generated by HTSeq for all samples:
MyDGEList <- readDGE(Count_files, path="./Count_Tables/",labels=sample_ids)

Then, before normalizing with edgeR, I extracted the counts from MyDGEList to use with DESeq2:

Counts <- MyDGEList$counts #the element is lowercase "counts"; make sure to remove last few rows containing metatags from HTSeq

Then I filtered and normalized for edgeR using the following commands in R:

keep <- rowSums(cpm(MyDGEList)>1) >= 2 #filter out lowly expressed genes
MyDGEList <- MyDGEList[keep, , keep.lib.sizes=TRUE]
MyDGEList <- calcNormFactors(MyDGEList) #normalize

#Extract table of logCPM (log2 counts per million) by:
log_cpm <- cpm(MyDGEList, prior.count=0.25, log=TRUE)
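For readers without edgeR at hand, the filtering rule above can be sketched in base R. This is only an illustration of the arithmetic on a made-up count matrix, not edgeR's implementation: it computes plain counts per million from the column totals, which is what cpm() returns before calcNormFactors has been run (all normalization factors are still 1), and it does not reproduce the prior.count handling of the logCPM step.

```r
# Toy count matrix (features x samples); values are made up for illustration
counts <- matrix(
  c(500, 1000,  0,
     10,   20,  1,
      0,    1,  0,
    490,  979, 99),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("feature", 1:4), c("s1", "s2", "s3"))
)

# Library size per sample: total counts in each column
lib_size <- colSums(counts)

# Counts per million: divide each column by its library size, then scale
cpm_raw <- t(t(counts) / lib_size) * 1e6

# Same rule as the filter above: keep features with CPM > 1 in at least 2 samples
keep <- rowSums(cpm_raw > 1) >= 2
filtered <- counts[keep, , drop = FALSE]

print(round(cpm_raw))
print(keep)
```

Here feature3 is dropped because it exceeds 1 CPM in only one sample. With very few features, as in a small viral genome, the threshold and the minimum number of samples would likely need to be reconsidered rather than copied as-is.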

For DESeq2, I did the following:

Count_Table <- DESeqDataSetFromMatrix(countData=Counts,colData=SS_Column_Info, design=~Tissue) #Make count table readable by DESeq

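dds <- DESeq(Count_Table) #Run the DESeq2 pipeline (size factors, dispersions, model fit)

ddsClean <- replaceOutliersWithTrimmedMean(dds) #Replace outlier counts flagged by Cook's distance

dds <- DESeq(ddsClean) #Refit after replacing outliers

dds <- estimateSizeFactors(dds) #for normalization; DESeq() already estimates size factors, so this call is optional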

#Two options for getting tables of transformed counts in DESeq2
vsd <- vst(dds) #Variance stabilizing transformation of counts
rld <- rlog(dds) #Regularized log transformation
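As a rough picture of what the size-factor step does, here is a base-R sketch of the median-of-ratios idea behind the default method of estimateSizeFactors. It is a simplified illustration on made-up counts, not DESeq2's own code: each sample's size factor is the median, over features with no zero counts, of the ratio between that sample's count and the feature's geometric mean across samples.

```r
# Toy count matrix (features x samples); values are made up for illustration
counts <- matrix(
  c(100, 200, 400,
     10,  20,  40,
     50, 100, 200,
      0,   5,   8),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("feature", 1:4), c("s1", "s2", "s3"))
)

# Log geometric mean of each feature across samples (-Inf if any count is zero)
log_geo_means <- rowMeans(log(counts))
usable <- is.finite(log_geo_means)

# Size factor per sample: median ratio of its counts to the geometric means
size_factors <- apply(counts, 2, function(x) {
  exp(median(log(x[usable]) - log_geo_means[usable]))
})

# Normalized counts: each column divided by its size factor
normalized <- t(t(counts) / size_factors)

print(size_factors)
print(normalized)
```

In this toy example s2 and s3 are exactly 2x and 4x deeper than s1, so the size factors come out as 0.5, 1 and 2, and the first three features become identical across samples after division. The approach assumes that most features do not truly differ between samples, which is worth checking before applying it to coverage data from two different sources.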

I hope this helps!

Last edited by amhaan; 09-25-2018 at 10:37 PM. Reason: typo in function
Old 09-26-2018, 06:33 AM   #3
genferreri
Junior Member
 
Location: Athens, GA

Join Date: Apr 2016
Posts: 2

Thank you very much amhaan. I will go over it and compare the different outcomes.

Tags
coverage, illumina, normalization, read count
