SEQanswers

10-17-2017, 11:40 AM   #1
sdarko
Member
 
Location: Bethesda, MD

Join Date: Apr 2009
Posts: 51
Determining gender of donors from RNASeq data

Hi all,

I have what I think is an interesting question/problem.

We did some single-cell RNASeq (SMARTer prep, so whole transcripts, not 3' tagging) on a group of samples. In healthy controls (HC) this population of T-cells is pretty darned rare, so in order to have enough cells to sort we combined HC donors in some cases. Luckily, in the "mixed" HC samples, one donor was male and the other female. The two donors in any given mixed sample are unrelated. To further complicate matters, any "mixed" population could be anywhere from 100% donor1 to 100% donor2.

Now I've been asked to determine which single cell belonged to whom. My initial thought was that I could distinguish them based on the frequency of reads that align to the Y chromosome. Looking at the attached screenshot, that seems subpar.

I've looked into HLA identification from RNASeq reads and identifying SNPs, but neither has been particularly fruitful.

Does anyone have any suggestions?
Attached image: Screen Shot 2017-10-17 at 3.36.33 PM.png (58.6 KB)
10-18-2017, 02:01 AM   #2
wdecoster
Member
 
Location: Antwerp, Belgium

Join Date: Oct 2015
Posts: 97

I've done something similar, using a few genes which are gender-specific, rather than only Y chromosome genes. Note that you should think about the pseudoautosomal regions: they are shared between X and Y, so reads mapping there don't tell the two apart.
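To make the marker-gene idea concrete, here is a minimal sketch in plain Python. It is not the code from the script linked in this post; the gene lists (XIST as a female marker, a handful of Y-linked genes from outside the pseudoautosomal regions as male markers), the function name `call_sex`, and both thresholds are assumptions chosen for illustration and would need tuning on real data.

```python
# Hypothetical marker lists (assumptions, not taken from the linked script).
FEMALE_MARKERS = ["XIST"]
# Y-linked genes from the male-specific region, i.e. outside the
# pseudoautosomal regions (PARs). PAR genes such as CD99 exist on both
# X and Y, so they are left out on purpose.
MALE_MARKERS = ["RPS4Y1", "DDX3Y", "EIF1AY", "KDM5D", "UTY"]


def call_sex(expr, min_total=5.0, ratio=4.0):
    """Call the sex of one cell from a dict of gene -> normalized expression.

    min_total and ratio are illustrative thresholds, not recommended values.
    """
    female = sum(expr.get(gene, 0.0) for gene in FEMALE_MARKERS)
    male = sum(expr.get(gene, 0.0) for gene in MALE_MARKERS)
    if female + male < min_total:
        # Too little marker signal to decide (dropout is common in single cells).
        return "ambiguous"
    if male > ratio * female:
        return "male"
    if female > ratio * male:
        return "female"
    # Both marker sets are expressed at comparable levels.
    return "ambiguous"


if __name__ == "__main__":
    print(call_sex({"XIST": 40.0, "CD99": 12.0}))
    print(call_sex({"RPS4Y1": 20.0, "DDX3Y": 5.0}))
    print(call_sex({"CD99": 100.0}))
```

Cells that come back "ambiguous" are the ones where a simple rule like this runs out of signal, which is where a trained classifier on more genes may do better.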

I don't have time right now to go into depth on this, but I hope the code in the following script can point you in the right direction: https://github.com/wdecoster/DEA.R/b...DEA/DEA.R#L198
10-18-2017, 12:35 PM   #3
sdarko
Member
 
Location: Bethesda, MD

Join Date: Apr 2009
Posts: 51

Thanks for the tip.

I grabbed the genes from your link, subset my expression matrix to them, and trained a random forest classifier on that gene set and the known genders (with greater than 99% out-of-bag accuracy). I then used that model to assign the unknown genders.

I think it worked well.
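For anyone who wants to see the shape of that workflow, here is a rough, self-contained sketch in plain Python. It is not the code that was actually run: it swaps in a toy ensemble of one-split trees (stumps), each fit on a bootstrap sample and a single randomly chosen gene, which only mimics the bootstrap and out-of-bag ideas behind a random forest. The gene names, the toy expression values, and all function names are made up for illustration. On real data a proper implementation (for example scikit-learn's `RandomForestClassifier` with `oob_score=True`) would be the sensible choice.

```python
import random
from collections import Counter


def fit_stump(X, y, feature):
    """Fit a one-split tree on one gene: pick the threshold with fewest errors."""
    values = sorted({row[feature] for row in X})
    majority = Counter(y).most_common(1)[0][0]
    best = (len(y) + 1, 0.0, majority, majority)  # fallback if the gene is constant
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2  # midpoint between neighbouring observed values
        left = Counter(lab for row, lab in zip(X, y) if row[feature] <= thr)
        right = Counter(lab for row, lab in zip(X, y) if row[feature] > thr)
        left_lab = left.most_common(1)[0][0]
        right_lab = right.most_common(1)[0][0]
        errors = (sum(left.values()) - left[left_lab]) + (
            sum(right.values()) - right[right_lab])
        if errors < best[0]:
            best = (errors, thr, left_lab, right_lab)
    return (feature,) + best[1:]


def stump_predict(stump, row):
    feature, thr, left_lab, right_lab = stump
    return left_lab if row[feature] <= thr else right_lab


def fit_forest(X, y, n_trees=200, seed=0):
    """Train the toy ensemble and return it with its out-of-bag accuracy."""
    rng = random.Random(seed)
    n, n_features = len(X), len(X[0])
    forest = []
    oob_votes = [Counter() for _ in range(n)]
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample of cells
        feature = rng.randrange(n_features)         # random gene for this tree
        stump = fit_stump([X[i] for i in idx], [y[i] for i in idx], feature)
        forest.append(stump)
        in_bag = set(idx)
        for i in range(n):
            if i not in in_bag:  # only trees that never saw cell i vote on it
                oob_votes[i][stump_predict(stump, X[i])] += 1
    scored = [(v.most_common(1)[0][0], y[i]) for i, v in enumerate(oob_votes) if v]
    oob_accuracy = sum(pred == truth for pred, truth in scored) / len(scored)
    return forest, oob_accuracy


def predict(forest, row):
    """Majority vote of all trees for one cell."""
    return Counter(stump_predict(s, row) for s in forest).most_common(1)[0][0]


# Toy expression matrix: columns are hypothetical marker genes, rows are cells.
GENES = ["XIST", "RPS4Y1", "DDX3Y"]
male_cells = [[i % 3, 30 + i, 20 + 2 * i] for i in range(10)]
female_cells = [[50 + i, i % 2, 0] for i in range(10)]
X = male_cells + female_cells
y = ["male"] * 10 + ["female"] * 10

forest, oob_accuracy = fit_forest(X, y)
print(f"out-of-bag accuracy: {oob_accuracy:.2f}")

# Cells from a "mixed" sample, where the donor is unknown.
unknown_cells = [[55, 0, 0], [1, 33, 25]]
print([predict(forest, cell) for cell in unknown_cells])
```

The out-of-bag estimate is useful here because it gives an accuracy figure on the cells with known gender without holding out a separate test set.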

Last edited by sdarko; 10-18-2017 at 01:10 PM.