SEQanswers

08-03-2016, 10:56 PM   #1
dovah
Member
 
Location: Russia

Join Date: Jul 2014
Posts: 18
RSII mapper benchmarking

Hi all,

I have RNA sequencing data (D. melanogaster) from 3 size-selected libraries (1-2 kb, 2-3 kb, 3-7 kb). Which mapper would you suggest? I'm mostly interested in benchmarking isoform detection against Illumina HiSeq 2500 data. I'd like to benchmark the "established" tools (since I realized this comparison is missing), but suggestions for newer ones are welcome too.

Thanks in advance for suggestions.
08-08-2016, 09:21 AM   #2
Magdoll
Member
 
Location: Bay Area

Join Date: Aug 2011
Posts: 30

Do you mean how to map PacBio transcriptome (Iso-Seq) reads back to the reference genome, or back to the reference transcripts?

It also depends on what you have already done with the Iso-Seq data. If you have the results of running the classify + cluster pipeline of Iso-Seq, you will have something called "high-quality, quiver-polished, full-length sequences". These are expected to be at least 99% accurate. You can align them to the genome using GMAP/STAR, or to the reference transcripts using BLAST or BLASR.

If you have something called reads_of_insert.fasta (CCS reads) or isoseq_flnc.fasta (CCS reads, but specifically the full-length ones), they are of variable quality, ranging from 85% to 99%+ accuracy. They can be mapped with GMAP/STAR (but be careful with low-quality alignments; they may need filtering) and with BLAST/BLASR too (again, be careful with quality).
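
To make "may need filtering" a bit more concrete, here is a minimal Python sketch of a post-alignment filter on SAM records. Everything in it is an assumption for illustration: the function names are made up, coverage is taken as the fraction of the read inside the alignment (from the CIGAR string), identity is taken as matching columns over aligned columns (from the NM tag), and the 0.90 / 0.80 defaults are only example thresholds. Aligners define coverage and identity in slightly different ways, so treat this as a rough approximation and not as what GMAP or any PacBio tool computes internally.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def coverage_and_identity(cigar, nm):
    """Return (read coverage, identity) for one alignment.

    cigar: SAM CIGAR string; nm: integer value of the NM tag (edit distance).
    """
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    aligned_query = sum(n for n, op in ops if op in "MI=X")  # read bases inside the alignment
    clipped = sum(n for n, op in ops if op in "SH")          # read bases clipped off the ends
    columns = sum(n for n, op in ops if op in "MID=X")       # alignment columns; introns (N) are skipped
    read_len = aligned_query + clipped
    coverage = aligned_query / read_len if read_len else 0.0
    identity = (columns - nm) / columns if columns else 0.0
    return coverage, identity


def keep_sam_line(line, min_coverage=0.90, min_identity=0.80):
    """Return True if a SAM line passes both (assumed) thresholds."""
    if line.startswith("@"):
        return True  # always keep header lines
    fields = line.rstrip("\n").split("\t")
    flag, cigar = int(fields[1]), fields[5]
    if flag & 4 or cigar == "*":
        return False  # unmapped read
    nm = None
    for tag in fields[11:]:
        if tag.startswith("NM:i:"):
            nm = int(tag[5:])
    if nm is None:
        return False  # assumption: drop records without an NM tag
    coverage, identity = coverage_and_identity(cigar, nm)
    return coverage >= min_coverage and identity >= min_identity


if __name__ == "__main__":
    import sys
    # Usage (hypothetical file names): python filter_sam.py < in.sam > out.sam
    for sam_line in sys.stdin:
        if keep_sam_line(sam_line):
            sys.stdout.write(sam_line)
```

The sketch reads SAM text on standard input, so a BAM file would first need converting to SAM (for example with samtools view -h).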


Please refer to this tutorial for details on how to use each aligner:
https://github.com/PacificBioscience...T%2C-and-BLASR

Additionally, the wiki contains a lot of useful information for analyzing Iso-Seq data downstream.

https://github.com/PacificBiosciences/cDNA_primer/wiki/

Also look into Iso-Aux for combining short read + long read data and doing comparisons:
https://github.com/bowhan/isoaux

--Liz
09-06-2016, 01:29 AM   #3
lingling huang
Member
 
Location: changsha

Join Date: Mar 2016
Posts: 46

How can I filter out all GMAP alignments that have less than 90% of the read aligned or less than 80% identity, using "filterBAM"?
09-06-2016, 10:09 AM   #4
Magdoll
Member
 
Location: Bay Area

Join Date: Aug 2011
Posts: 30

GMAP itself has two parameters for filtering by coverage and identity, so the filtering can be done at alignment time. Run gmap --help for more details.

The two parameters are --min-trimmed-coverage and --min-identity.
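
As a rough sketch of how the two options could be put on a command line, the Python snippet below only builds and prints a GMAP invocation; it does not run it. The helper name, the genome directory and index name, and the input/output file names are hypothetical placeholders. It also assumes the thresholds are given as fractions between 0 and 1 (0.90 and 0.80 for the cutoffs asked about above); please confirm that against gmap --help for your installed version.

```python
import shlex


def build_gmap_command(genome_dir, genome_name, reads_fasta,
                       min_coverage=0.90, min_identity=0.80, threads=4):
    """Build a GMAP argument list that filters by coverage and identity.

    Thresholds are assumed to be fractions in [0, 1]; all paths and names
    passed in are placeholders chosen by the caller.
    """
    for name, value in (("min_coverage", min_coverage),
                        ("min_identity", min_identity)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return [
        "gmap",
        "-D", genome_dir,          # directory holding the GMAP genome index
        "-d", genome_name,         # name of the genome index
        "-f", "samse",             # SAM output for single-end reads
        "-t", str(threads),        # number of worker threads
        f"--min-trimmed-coverage={min_coverage}",
        f"--min-identity={min_identity}",
        reads_fasta,
    ]


if __name__ == "__main__":
    # Hypothetical index location and input file, for illustration only.
    cmd = build_gmap_command("/path/to/gmap_db", "dm6", "isoseq_flnc.fasta")
    # GMAP writes alignments to stdout, so redirect to a file when running it.
    print(shlex.join(cmd) + " > isoseq_flnc.filtered.sam")
```

Printing the command first makes it easy to check the thresholds before launching a long alignment job.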

Tags
isoform, mapper, rna sequencing
