SEQanswers





Old 12-22-2011, 02:46 AM   #1
Neuromancer
Member
 
Location: Goettingen, Germany

Join Date: Aug 2011
Posts: 28
Mapping to transcriptome with Bowtie2-beta5

Hey guys,
I wanted to map my brand-new paired-end RNA-seq data to the mouse transcriptome using the current beta (b5) of Bowtie2.
As I could not find a pre-built index for this, I built one myself, using bowtie2-build to index the Ensembl transcript sequences.
The three mouse FASTA files for this were downloaded from the Ensembl FTP site. I wanted to get as much information as possible, so I included the cDNA-all, cDNA-abinitio and ncRNA FASTA files in the index.
Then I mapped the paired-end RNA-seq data to this index using the following command:
Quote:
./bowtie2 -p 4 -t --local -x mouse_transcriptome_ensembl-NCBI37_ncRNA_cDNAall_abinitiopredictons -1 <matepair1.fastq> -2 <matepair2.fastq> -S output.sam
So far so good, it all worked well with overall alignment rates of 80-90%.
Now, when I try to import the data into SeqMonk, it reads all the lines and then tells me that it "Couldn't extract valid name for <Ensembl-transcript-ID/Genscan-ID>" and leaves me with no reads at all... Is this probably because there is no chromosome information, or because it is not in the expected position?

From what I could find out, the Ensembl FASTA files also contain some "supercontigs" that have an NT-xxxx ID instead of chromosome information.
But even then, there should still be reads with a correct annotation, right?

So what went wrong with my workflow here, and can I still rescue the SAM files that I have already produced?

btw: the SAM file header looks like this:
Quote:
@HD VN:1.0 SO:unsorted
@SQ SN:GENSCAN00000015589 LN:298
@SQ SN:GENSCAN00000001573 LN:74
@SQ SN:GENSCAN00000001572 LN:260
@SQ SN:GENSCAN00000026402 LN:489
...
@SQ SN:ENSMUST00000146092 LN:216
@SQ SN:ENSMUST00000120435 LN:630
@SQ SN:ENSMUST00000118023 LN:1647
...almost endlessly...
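To see at a glance what kind of reference names ended up in such a header, something like the following could be used. This is only a rough sketch: the function name `reference_prefixes` and the idea of grouping names by their leading letters are my own assumptions for illustration, not anything provided by Bowtie2 or SeqMonk.

```python
import re
from collections import Counter


def reference_prefixes(header_lines):
    """Count @SQ reference names by their leading alphabetic prefix.

    Hypothetical helper: grouping by prefix is just a quick heuristic
    to tell transcript IDs (ENSMUST..., GENSCAN...) from chromosome names.
    """
    counts = Counter()
    for line in header_lines:
        if not line.startswith("@SQ"):
            continue  # skip @HD, @PG and alignment lines
        # Header fields are TAG:VALUE pairs separated by whitespace.
        fields = line.rstrip("\n").split()[1:]
        tags = dict(f.split(":", 1) for f in fields if ":" in f)
        name = tags.get("SN")
        if name is None:
            continue
        match = re.match(r"[A-Za-z_]+", name)
        counts[match.group(0) if match else "(numeric)"] += 1
    return counts


# The header lines quoted above, used as sample input.
header = [
    "@HD VN:1.0 SO:unsorted",
    "@SQ SN:GENSCAN00000015589 LN:298",
    "@SQ SN:GENSCAN00000001573 LN:74",
    "@SQ SN:GENSCAN00000001572 LN:260",
    "@SQ SN:GENSCAN00000026402 LN:489",
    "@SQ SN:ENSMUST00000146092 LN:216",
    "@SQ SN:ENSMUST00000120435 LN:630",
    "@SQ SN:ENSMUST00000118023 LN:1647",
]

print(reference_prefixes(header))
```

If every name falls under a transcript-style prefix and none looks like a chromosome, that would fit the idea that the import tool is looking for chromosome names it cannot find.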
After the header come the alignments; a record looks like this:
Quote:
HWI-ST933:54:C01BFACXX:3:1101:10433:5230 99 ENSMUST00000082408 34 99M = 76 169 CGAAAATCTATTTGCCTCATTCATTACCCCAACAATAATAGGATTCCCAATCGTTGTAGCCATCATTATATTTCCTTCAATCCTATTCCCATCCTCAAA CCCFFFFFHHHHHJJJJJJJJJJJJJJJJJJJJJIJJJJJIJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJAHHHHHFFFFFFEEEDEEDDDDDD AS:i:198 XS:i:98 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:99 YS:i:198 YT:Z:CP
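For anyone reading the record above, the second field (99) is the SAM bitwise FLAG. A small sketch for decoding it, assuming the bit meanings from the SAM specification (the names `SAM_FLAG_BITS` and `decode_flag` are made up for this example):

```python
# Bit values and meanings of the SAM FLAG field, per the SAM specification.
SAM_FLAG_BITS = {
    0x1: "read paired",
    0x2: "mapped in proper pair",
    0x4: "read unmapped",
    0x8: "mate unmapped",
    0x10: "read on reverse strand",
    0x20: "mate on reverse strand",
    0x40: "first in pair",
    0x80: "second in pair",
    0x100: "secondary alignment",
    0x200: "fails quality checks",
    0x400: "PCR or optical duplicate",
    0x800: "supplementary alignment",
}


def decode_flag(flag):
    """Return the descriptions of all bits set in a SAM FLAG value."""
    return [desc for bit, desc in SAM_FLAG_BITS.items() if flag & bit]


# 99 = 1 + 2 + 32 + 64
print(decode_flag(99))
```

So the read itself is aligned as the first mate of a proper pair, which suggests the alignment step worked and the problem lies in the reference names rather than in the records.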

Tags
bowtie2, sam file, seqmonk, transcriptome
