SEQanswers

Old 07-03-2015, 02:36 AM   #1
dan
wiki wiki
 
Location: Cambridge, England

Join Date: Jul 2008
Posts: 266
FASTQ alignment metrics (RNA-Seq)?

Hello,

How do people judge the quality of a FASTQ (short read) alignment? In particular, I'm interested in evaluating RNA-Seq alignments, typically (but not exclusively) from Illumina instruments.

What comes to mind is:
* Fraction of reads mapped
* Fraction of reads mapped uniquely
* Fraction of 'good' pairs (right orientation, right distance)

and for RNA-Seq specifically
* Fraction of reads mapping within a gene

Anything based on read mapping quality?

What other metrics can we think of?
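
The fractions listed above, plus one based on mapping quality, can be tallied directly from SAM records. This is a minimal sketch, assuming plain-text SAM lines; the function name `sam_metrics`, the helper `_record`, and the MAPQ threshold of 30 are illustrative assumptions, and what a given MAPQ value means differs between aligners.

```python
def sam_metrics(sam_lines, min_mapq=30):
    """Tally basic alignment fractions from an iterable of SAM text lines."""
    total = mapped = high_mapq = paired = proper = 0
    for line in sam_lines:
        if line.startswith("@"):          # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        flag, mapq = int(fields[1]), int(fields[4])
        if flag & 0x900:                  # skip secondary (0x100) and supplementary (0x800) records
            continue
        total += 1
        is_mapped = not flag & 0x4        # 0x4 = read unmapped
        if is_mapped:
            mapped += 1
            if mapq >= min_mapq:          # assumed cutoff, not a standard
                high_mapq += 1
        if flag & 0x1:                    # 0x1 = read is paired
            paired += 1
            if is_mapped and flag & 0x2:  # 0x2 = properly paired
                proper += 1

    def frac(n, d):
        return n / d if d else float("nan")

    return {
        "mapped": frac(mapped, total),
        "high_mapq": frac(high_mapq, total),
        "properly_paired": frac(proper, paired),
    }


def _record(name, flag, mapq):
    # Hypothetical minimal SAM record; only FLAG and MAPQ matter for this sketch.
    return "\t".join([name, str(flag), "chr1", "100", str(mapq),
                      "50M", "=", "200", "150", "*", "*"])


example = [
    "@HD\tVN:1.6",
    _record("r1", 99, 60),    # read1, properly paired
    _record("r1", 147, 60),   # read2, properly paired
    _record("r2", 73, 3),     # read1 mapped, mate unmapped
    _record("r2", 133, 0),    # read2 unmapped
    _record("r1", 355, 0),    # secondary alignment, ignored
]
metrics = sam_metrics(example)
print(metrics)
```

Counting only primary records keeps each read from being counted more than once when the aligner reports multiple hits.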
__________________
Homepage: Dan Bolser
MetaBase the database of biological databases.
Old 07-03-2015, 02:56 AM   #2
annaprotasio
Junior Member
 
Location: UK

Join Date: Feb 2008
Posts: 6

hi Dan,

Have a look at "samtools flagstat"

The output will look something like this, and I think it contains all the info you requested.

Code:
7276199 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
7276199 + 0 mapped (100.00%:-nan%)
7276199 + 0 paired in sequencing
3787000 + 0 read1
3489199 + 0 read2
6195536 + 0 properly paired (85.15%:-nan%)
6795026 + 0 with itself and mate mapped
481173 + 0 singletons (6.61%:-nan%)
480036 + 0 with mate mapped to a different chr
480036 + 0 with mate mapped to a different chr (mapQ>=5)
good luck
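
To compare many samples, the counts above can be pulled out of the flagstat text and turned into fractions. This is a minimal sketch, assuming the output layout shown in this post (other samtools versions print additional lines); `parse_flagstat` and `flagstat_fractions` are hypothetical helper names, and only the QC-passed counts are used.

```python
import re

# "<passed> + <failed> <label>", as in the flagstat output above
COUNT_LINE = re.compile(r"^(\d+) \+ (\d+) (.+)$")

def parse_flagstat(text):
    """Return {label: QC-passed count} for each line of flagstat output."""
    counts = {}
    for line in text.splitlines():
        m = COUNT_LINE.match(line.strip())
        if not m:
            continue
        label = m.group(3)
        # Drop a trailing "(85.15%:-nan%)" style note, but keep "(mapQ>=5)".
        label = re.sub(r"\s*\([^()]*%[^()]*\)$", "", label)
        counts[label] = int(m.group(1))
    return counts

def flagstat_fractions(counts):
    """Derive a few fractions from the parsed counts."""
    total = next(v for k, v in counts.items() if k.startswith("in total"))
    return {
        "mapped": counts["mapped"] / total,
        "properly_paired": counts["properly paired"] / counts["paired in sequencing"],
        "singletons": counts["singletons"] / total,
    }

sample = """\
7276199 + 0 in total (QC-passed reads + QC-failed reads)
7276199 + 0 mapped (100.00%:-nan%)
7276199 + 0 paired in sequencing
6195536 + 0 properly paired (85.15%:-nan%)
481173 + 0 singletons (6.61%:-nan%)
480036 + 0 with mate mapped to a different chr
480036 + 0 with mate mapped to a different chr (mapQ>=5)
"""
fractions = flagstat_fractions(parse_flagstat(sample))
print(fractions)
```

Note that flagstat does not report uniquely mapped reads or reads within genes, so those two metrics need another tool.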
Old 07-03-2015, 03:37 AM   #3
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,054

Also take a look at RSeQC: http://rseqc.sourceforge.net/

Most aligners also produce their own alignment statistics, e.g. BBMap, TopHat, and probably STAR as well.
Old 07-03-2015, 05:25 AM   #4
maxsalm
Member
 
Location: London

Join Date: Feb 2015
Posts: 18

FastQC may also be of general use: http://www.bioinformatics.babraham.a...ojects/fastqc/
Old 07-03-2015, 07:48 AM   #5
dan
wiki wiki
 
Location: Cambridge, England

Join Date: Jul 2008
Posts: 266

Quote:
Originally Posted by maxsalm

I agree FastQC is useful, but it's not what I want here.
Old 07-03-2015, 11:49 AM   #6
jwfoley
Senior Member
 
Location: Stanford

Join Date: Jun 2009
Posts: 181

How about proportion of duplicate fragments? This will depend on whether you've done single- or paired-end reads, though, since with single RNA-seq reads you do expect a certain amount of duplication by chance (with paired reads it's a much smaller chance).
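
A rough way to see the single- versus paired-end difference is to count fragments that share the same coordinates. This is a minimal sketch, assuming each fragment has already been reduced to a tuple key; the key layout, the example coordinates, and the name `duplicate_fraction` are illustrative assumptions and do not reproduce what any particular duplicate-marking tool does.

```python
from collections import Counter

def duplicate_fraction(fragment_keys):
    """Fraction of fragments that repeat an already-seen key."""
    counts = Counter(fragment_keys)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Every fragment beyond the first at a given key counts as a duplicate.
    return (total - len(counts)) / total

# Hypothetical paired-end keys: (chrom, strand, fragment start, fragment end)
paired_keys = [
    ("chr1", "+", 100, 250),
    ("chr1", "+", 100, 310),
    ("chr1", "+", 100, 250),
    ("chr2", "-", 500, 640),
]
# With single-end reads only the first three fields are known.
single_keys = [key[:3] for key in paired_keys]

paired_dup = duplicate_fraction(paired_keys)
single_dup = duplicate_fraction(single_keys)
print(paired_dup, single_dup)
```

The same fragments look more duplicated when only the read start is known, which is why a higher apparent duplication rate is expected for single-end data.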
Old 07-05-2015, 02:53 PM   #7
bjackson
Junior Member
 
Location: Denver, CO

Join Date: May 2015
Posts: 6

I mostly work with single-end reads, but for alignment quality I look primarily at
1) percentage of reads mapped
2) percentage of reads uniquely mapped

It sounds like you are also asking about post-alignment QC in general, so I would add
3) read duplication (i.e. how many reads align to the identical location): most locations should have only one or a few reads.
4) read biotype distribution (most reads should map to protein-coding regions)
5) cumulative percentage measures: I sort genes by count or FPKM and plot the number of genes against the cumulative percentage of reads. That shows whether you are sinking a lot of reads into a few very abundant transcripts, in which case you might need more depth to see less abundant transcripts.
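
The cumulative percentage measure in item 5 can be sketched as follows. This is a minimal example, assuming per-gene read counts are already available as a dictionary; the gene names, the counts, and the function name `cumulative_read_percent` are made up for illustration.

```python
def cumulative_read_percent(gene_counts):
    """Rank genes by count and return (rank, gene, cumulative % of reads)."""
    total = sum(gene_counts.values())
    if total == 0:
        return []
    # Most abundant genes first
    ranked = sorted(gene_counts.items(), key=lambda kv: kv[1], reverse=True)
    curve = []
    running = 0
    for rank, (gene, count) in enumerate(ranked, start=1):
        running += count
        curve.append((rank, gene, 100.0 * running / total))
    return curve

# Hypothetical counts: one gene soaks up most of the reads.
gene_counts = {"geneA": 700, "geneB": 200, "geneC": 50, "geneD": 50}
curve = cumulative_read_percent(gene_counts)
for rank, gene, pct in curve:
    print(rank, gene, pct)
```

Plotting rank against the cumulative percentage gives the curve described above; a curve that saturates after a handful of genes suggests that most of the depth is going to a few abundant transcripts.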