SEQanswers

02-04-2013, 11:32 AM   #1
turnersd
Senior Member
 
Location: Charlottesville, VA

Join Date: May 2011
Posts: 112
RNA-seq w/ combined mouse+human tissue: public data & methods?

I'm proposing to do some RNA-seq analysis of human cancer tissue transplanted into a mouse (mouse tissue stroma around the tumor), and look at expression in both human and mouse transcripts. I'm guessing I'll have around 90% human, 10% mouse tissue.

My primary question: is there any public data (GEO, SRA) that has samples like this? I'd like to use public data to assess feasibility and test a few strategies for mapping this data. Which leads to question #2:

Secondary question: what's the best way to map this data? I see a few options: (1) map all reads to human, remainder to mouse (or vice versa), (2) map all reads to human, then all reads to mouse, or (3) map all reads to concatenated reference index, eliminating multimappers. I'm thinking #3 would be best, but requires some extra legwork creating combined fasta files, combined indexes, then disentangling reads that align to human vs mouse in the downstream alignment.
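A minimal sketch of the "extra legwork" for option 3, assuming plain-text FASTA inputs. The prefixes `hs_` and `mm_` and the function names are hypothetical choices for illustration, not something any aligner requires. Tagging every sequence name with a species prefix before concatenating makes it straightforward to tell human from mouse hits after alignment.

```python
def prefix_fasta_headers(lines, prefix):
    """Yield FASTA lines with `prefix` prepended to every sequence name."""
    for line in lines:
        if line.startswith(">"):
            # ">chr1 ..." becomes ">hs_chr1 ..." (prefix is an assumption)
            yield ">" + prefix + line[1:]
        else:
            yield line


def build_combined_reference(sources, out_path):
    """Write one combined FASTA from (fasta_path, prefix) pairs."""
    with open(out_path, "w") as out:
        for fasta_path, prefix in sources:
            with open(fasta_path) as fh:
                for line in prefix_fasta_headers(fh, prefix):
                    # Guard against a final line with no trailing newline,
                    # which would otherwise glue two records together.
                    out.write(line if line.endswith("\n") else line + "\n")
```

For example, `build_combined_reference([("human.fa", "hs_"), ("mouse.fa", "mm_")], "combined.fa")` (file names hypothetical) would produce a single FASTA to build the index from.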
02-14-2013, 11:55 AM   #2
dstorey
Junior Member
 
Location: Tennessee

Join Date: Feb 2013
Posts: 5

I can't help you with 1, but I have a similar project of mixed transcripts from different species.

While the genomes were nowhere near as large, we simply concatenated them and ran the search. I doubt you'll have any memory issues if you're on a modern machine (>8 GB RAM).

We used Bowtie and ramped up the number of times we re-seeded the search (well above the very-sensitive presets that are recommended). We also decided not to separate the reads downstream until after we got normalized counts, but I don't think separating out multi-mapped reads will be as difficult as you think.
02-14-2013, 04:43 PM   #3
turnersd
Senior Member
 
Location: Charlottesville, VA

Join Date: May 2011
Posts: 112

Thanks dstorey. I also recently came across this method for partitioning reads to their respective species before mapping. Might give it a try.

http://www.ncbi.nlm.nih.gov/pubmed/22689758
02-14-2013, 06:45 PM   #4
dstorey
Junior Member
 
Location: Tennessee

Join Date: Feb 2013
Posts: 5

Not to be a smart ass but will this tool increase mapping accuracy or decrease computational complexity in a meaningful way? Otherwise a simple grep would help you sort mapped reads based on where they mapped.
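The grep-style split could be sketched roughly as follows, assuming the reads were aligned to a combined reference whose sequence names carry species prefixes (`hs_` and `mm_` are hypothetical, as are the function names). It reads the reference name from column 3 of each SAM record. It counts alignment records rather than reads, so secondary alignments would need filtering first.

```python
def species_of(sam_line, prefixes=("hs_", "mm_")):
    """Return the species prefix of a SAM alignment line, or None."""
    fields = sam_line.rstrip("\n").split("\t")
    if len(fields) < 3:
        return None
    rname = fields[2]  # column 3 of a SAM record is the reference name
    for prefix in prefixes:
        if rname.startswith(prefix):
            return prefix
    return None  # unmapped ("*") or a name without a known prefix


def count_by_species(sam_lines, prefixes=("hs_", "mm_")):
    """Count alignment records per species prefix, skipping header lines."""
    counts = {prefix: 0 for prefix in prefixes}
    counts["other"] = 0
    for line in sam_lines:
        if line.startswith("@"):  # SAM header lines start with "@"
            continue
        species = species_of(line, prefixes)
        counts[species if species is not None else "other"] += 1
    return counts
```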
02-18-2013, 01:32 AM   #5
Simon Anders
Senior Member
 
Location: Heidelberg, Germany

Join Date: Feb 2010
Posts: 994

I'd agree with the OP that his method #3 sounds best. We did this recently in another experiment involving mixed material from two species and it worked very well. And concatenating a few FASTA files is not really that much effort.

One caveat, though: some aligners seem to have issues if the reference has more than 2^32 bp (i.e., about 4.3 Gb), and human alone is already 3 Gb. We used GSNAP because it can handle large references, at least in recent versions.
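A quick way to check a combined reference against that limit is to sum the sequence lengths in its FASTA index. The sketch below assumes a `.fai` file as written by `samtools faidx` (tab-separated, sequence length in the second column). The function names are hypothetical.

```python
LIMIT = 2 ** 32  # 4,294,967,296 positions addressable with 32 bits


def total_length_from_fai(fai_lines):
    """Sum sequence lengths from the lines of a FASTA index (.fai)."""
    total = 0
    for line in fai_lines:
        if not line.strip():
            continue
        # .fai columns: name, length, offset, linebases, linewidth
        total += int(line.split("\t")[1])
    return total


def exceeds_32bit_limit(total_bp):
    """True if a reference of `total_bp` bases has more than 2^32 bp."""
    return total_bp > LIMIT
```

Taking rough sizes of 3.1 Gb for human and 2.7 Gb for mouse (approximations, not figures from this thread), the combined reference comes to about 5.8 Gb, so `exceeds_32bit_limit(3_100_000_000 + 2_700_000_000)` is true while either genome alone stays under the limit.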
02-20-2013, 09:58 PM   #6
Jeremy
Senior Member
 
Location: Pathum Thani, Thailand

Join Date: Nov 2009
Posts: 190

I would also go with number 3. Most mappers I have used will map a read to the best fit they find, meaning:
for 1) whichever species you map to first will pick up reads from the other species in conserved regions;
and for 2) there will be many cases where a read maps to both mouse and human, requiring additional work to determine which is the better mapping.
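To illustrate the "additional work" in case 2, here is a rough sketch that compares one read's alignment against each genome using the optional `AS:i` alignment-score tag in SAM. It assumes both alignments came from the same aligner with the same settings, since scores are not otherwise comparable, and that the aligner emits `AS:i` only for aligned reads. The function names and the tie-handling rule are hypothetical.

```python
def alignment_score(sam_line):
    """Return the AS:i value of a SAM record, or None if the tag is absent."""
    # Optional tags start at column 12 (index 11) of a SAM record.
    for field in sam_line.rstrip("\n").split("\t")[11:]:
        if field.startswith("AS:i:"):
            return int(field[5:])
    return None


def assign_species(human_line, mouse_line):
    """Pick the species whose alignment has the higher score for one read."""
    h = alignment_score(human_line) if human_line else None
    m = alignment_score(mouse_line) if mouse_line else None
    if h is None and m is None:
        return "unassigned"
    if m is None:
        return "human"
    if h is None:
        return "mouse"
    if h > m:
        return "human"
    if m > h:
        return "mouse"
    return "ambiguous"  # equal scores, e.g. a read from a conserved region
```

Reads that come back "ambiguous" are the same ones option 3 would flag as multimappers, which is one reason the combined-reference approach saves a step.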
02-21-2013, 04:21 AM   #7
bioBob
Member
 
Location: Virginia

Join Date: Mar 2011
Posts: 72

Hi Stephen.

We are about 1 month away from having this type of data for human-mouse and dog-human transplants so the post is timely for us. We were planning on going down your option 3 as well, but I do like the idea presented in the Xenome paper.
02-21-2013, 04:24 AM   #8
turnersd
Senior Member
 
Location: Charlottesville, VA

Join Date: May 2011
Posts: 112

Quote:
Originally Posted by bioBob
Hi Stephen.

We are about 1 month away from having this type of data for human-mouse and dog-human transplants so the post is timely for us. We were planning on going down your option 3 as well, but I do like the idea presented in the Xenome paper.
Bob (is that you Settlage?) - I'd love to see what you think if you try both approaches.
02-21-2013, 04:48 AM   #9
bioBob
Member
 
Location: Virginia

Join Date: Mar 2011
Posts: 72

Yes, we can talk after we get some data. Hopefully before ABRF.
03-19-2013, 12:34 PM   #10
scottyler89
Junior Member
 
Location: Iowa

Join Date: Nov 2012
Posts: 6

I realize I'm probably coming into this conversation a bit late, but if it were me, I'd probably try to separate them biologically before sequencing. I would use something like SCID mice with ROSA-tomato, or another reporter, so that you could use RNA-later and separate by flow.

I agree with everyone here that approach #3 would work best, but there could be interesting data within those genes. Just a thought - if you hadn't yet started and had access to reporter SCID mice.

Tags
geo, rna-seq, rnaseq, sra, tophat
