Old 08-08-2012, 02:51 PM   #1
Junior Member
Location: Northeast USA

Join Date: Aug 2012
Posts: 4
How to customize genome assembly for transgenic animals

I've been working with human RNA-seq data (Illumina single-end) with some success, using the Tuxedo suite in the standard way with the hg19 assembly as a guide. Now we are trying to work with paired-end Illumina data from some transgenic mice. The mice carry several transgenes (EGFP, Cre, etc.), and we don't want to lose the expression information from them. Of course, they are not in the mm9 or mm10 genome assemblies. What is the easiest way to keep these genes in our analysis?

Thanks for any suggestions!
miRman
Old 08-09-2012, 12:26 AM   #2
Senior Member
Location: San Francisco, CA

Join Date: Feb 2011
Posts: 286

That sounds like a rather unique thing you're trying to do. I can see two possibilities:

1) Use programs that estimate expression values from a reference transcriptome alone. I believe the trans-ABySS and Trinity documentation can give you some guidance on how to do this. What I would do is simply add your transgenes to the reference cDNAs, align the reads to this adjusted transcriptome, and then use those programs to estimate abundances.

2) Create a new genome that includes the transgenes. It might be a bit of a pain, but if you only have 2 or so genes and you know where they are inserted, you could manually adjust the reference genome and the gene annotation to reflect that. This would probably be harder than option 1, but it might also give better results. If you care that much about your transgenes, you might just have to do a lane or so of DNA sequencing to make sure you got it right.
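To make option 2 concrete, here is a minimal Python sketch of the manual adjustment: splice the transgene into the chromosome sequence at a known position, then shift the coordinates of downstream annotation features by the insert length. The function names are hypothetical, the insertion site is assumed to be known exactly, and extending a feature that spans the site is a simplification you may not want. Treat it as an illustration of the bookkeeping, not a tested pipeline.

```python
def insert_transgene(chrom_seq, insert_pos, transgene_seq):
    """Insert transgene_seq after 1-based position insert_pos of chrom_seq."""
    return chrom_seq[:insert_pos] + transgene_seq + chrom_seq[insert_pos:]


def shift_gtf_line(line, chrom, insert_pos, insert_len):
    """Adjust one GTF line for an insertion of insert_len bases after insert_pos.

    Lines on other chromosomes, comment lines, and features that end at or
    before the insertion site are returned unchanged.
    """
    stripped = line.rstrip("\n")
    fields = stripped.split("\t")
    if stripped.startswith("#") or len(fields) < 9 or fields[0] != chrom:
        return stripped
    start, end = int(fields[3]), int(fields[4])
    if start > insert_pos:
        # Feature lies entirely downstream: shift both coordinates.
        fields[3] = str(start + insert_len)
        fields[4] = str(end + insert_len)
    elif end > insert_pos:
        # Feature spans the insertion site: only its end moves (assumption).
        fields[4] = str(end + insert_len)
    return "\t".join(fields)


# Toy example: a 2-base "transgene" inserted after base 4 of an 8-base contig.
new_seq = insert_transgene("AAAACCCC", 4, "GG")
new_line = shift_gtf_line('chr1\tsrc\texon\t6\t8\t.\t+\t.\tgene_id "g";', "chr1", 4, 2)
```

You would still need to add annotation entries for the transgene itself at its new coordinates, and rebuild the aligner index from the edited genome.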

I'd be interested to see if anyone has actually published a methodology for doing what you're trying, because I sure haven't noticed it.
Wallysb01
Old 08-09-2012, 09:04 AM   #3
Senior Member
Location: Woodbridge CT

Join Date: Oct 2008
Posts: 231
Furthermore...

Transgenic organisms are important biotechnology products in addition to being tools for ongoing research in diverse areas of biology. Perhaps when you work out some methods and collect a bit more information, this can become a topic in the SeqWiki and a helpful guide to many others. Please contribute your findings.
Joann
Old 11-08-2012, 03:06 PM   #4
Junior Member
Location: Northeast USA

Join Date: Aug 2012
Posts: 4

For the benefit of anybody else going down this road: I didn't pursue this very far. However, from talking to various people, the suggestion I got was to make a new FASTA reference file that contains the custom genes. For TopHat and Cufflinks, also prepare a custom GTF: start with the known genes for your genome and add entries for the transgenes, using the sequence ID(s) from the custom FASTA file as the sequence name.

As I said, I didn't try this but thought I'd put it out here in case anybody finds it useful, or has any other comments.
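The suggestion above (custom FASTA plus matching GTF entries) could be sketched roughly as follows in Python. This is an untested-in-production illustration, not a published method: the function names and the "custom" source label are made up, each transgene is assumed to be added as its own extra contig rather than at its real insertion site, and each is annotated as a single plus-strand exon covering the whole sequence you supply.

```python
def fasta_record(seq_id, sequence, width=60):
    """Format one FASTA record with fixed-width sequence lines."""
    lines = [">" + seq_id]
    for i in range(0, len(sequence), width):
        lines.append(sequence[i:i + width])
    return "\n".join(lines) + "\n"


def transgene_gtf_line(seq_id, length, source="custom"):
    """One exon line whose sequence name matches the FASTA header ID."""
    attrs = 'gene_id "{0}"; transcript_id "{0}_tx";'.format(seq_id)
    return "\t".join(
        [seq_id, source, "exon", "1", str(length), ".", "+", ".", attrs]
    )


def _copy_with_trailing_newline(src_path, dst):
    """Stream a text file into dst, making sure it ends with a newline."""
    last = "\n"
    with open(src_path) as src:
        for line in src:
            dst.write(line)
            last = line
    if not last.endswith("\n"):
        dst.write("\n")


def build_custom_reference(genome_fasta, genes_gtf, transgenes, out_fasta, out_gtf):
    """Append transgenes (dict: sequence id -> sequence) to genome and annotation."""
    with open(out_fasta, "w") as fa_out:
        _copy_with_trailing_newline(genome_fasta, fa_out)
        for seq_id, seq in transgenes.items():
            fa_out.write(fasta_record(seq_id, seq))
    with open(out_gtf, "w") as gtf_out:
        _copy_with_trailing_newline(genes_gtf, gtf_out)
        for seq_id, seq in transgenes.items():
            gtf_out.write(transgene_gtf_line(seq_id, len(seq)) + "\n")
```

The aligner index would then have to be rebuilt from the combined FASTA before running TopHat, and the combined GTF passed wherever the original annotation was used.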
miRman
