**Michael.Ante** · 08-23-2016, 12:14 AM

Hi Sebastian,

You should have a look at the rRNA content. Download just the sheep rRNA sequences from NCBI as FASTA, build a bowtie2 index from them, and map your reads against that index with bowtie2. The overall mapping rate is then an estimate of the rRNA content in your sample. The reads that do not map to rRNA can be used for your STAR mapping.

Cheers,

Michael
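The rRNA check described above could be scripted roughly as follows. This is only a sketch: the file names, the index prefix `sheep_rrna`, and the helper names are hypothetical, and it assumes single-end reads (paired-end data would use `-1`/`-2` and `--un-conc` instead of `-U` and `--un`). It builds the two command lines without running them, and parses the "overall alignment rate" line that bowtie2 prints to stderr.

```python
import re
import shlex


def rrna_commands(rrna_fasta, reads_fastq, index_prefix="sheep_rrna",
                  unaligned_out="non_rrna.fastq", threads=4):
    """Return the index-building and mapping commands as argument lists.

    All paths and the index prefix are placeholders for illustration.
    """
    build = ["bowtie2-build", rrna_fasta, index_prefix]
    align = [
        "bowtie2", "-p", str(threads),
        "-x", index_prefix,
        "-U", reads_fastq,       # single-end reads (assumption)
        "--un", unaligned_out,   # reads that did NOT map to rRNA, for STAR
        "-S", "/dev/null",       # the rRNA alignments themselves are discarded
    ]
    return build, align


def overall_alignment_rate(bowtie2_stderr):
    """Extract the overall alignment rate (0..1) from bowtie2's summary text."""
    match = re.search(r"([\d.]+)% overall alignment rate", bowtie2_stderr)
    return float(match.group(1)) / 100 if match else None


if __name__ == "__main__":
    for cmd in rrna_commands("sheep_rrna.fasta", "sample.fastq"):
        print(shlex.join(cmd))
```

The commands can be passed to `subprocess.run(cmd, capture_output=True, text=True)`; the `stderr` of the mapping step is what `overall_alignment_rate` expects.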

**shi** · 08-23-2016, 03:18 PM

featureCounts counts the number of reads that overlap features included in the annotation. Reads that the aligner reported as mapped but that do not overlap any feature are not counted by featureCounts. For an RNA-seq dataset, you will typically see around 30-50 percent of reads that were mapped but not counted.

featureCounts outputs a counting summary file (called "GvS_counts_full_ensembl.txt.summary" for your data), which should be helpful for you to understand why some reads were not counted.

**Sebastian_Quezada_R** · 08-23-2016, 04:55 PM

Originally posted by Michael.Ante

Hi Sebastian,

You should have a look at the rRNA content. Download just the sheep rRNA sequences from NCBI as FASTA, build a bowtie2 index from them, and map your reads against that index with bowtie2. The overall mapping rate is then an estimate of the rRNA content in your sample. The reads that do not map to rRNA can be used for your STAR mapping.

Cheers,

Michael

Thanks Michael, I'll try that and see what I can get.

Originally posted by shi

featureCounts counts the number of reads that overlap features included in the annotation. Reads that the aligner reported as mapped but that do not overlap any feature are not counted by featureCounts. For an RNA-seq dataset, you will typically see around 30-50 percent of reads that were mapped but not counted.

featureCounts outputs a counting summary file (called "GvS_counts_full_ensembl.txt.summary" for your data), which should be helpful for you to understand why some reads were not counted.

I just checked that file and it tells me the following stats:

| Status | 1251.1G | 1251.1S | 1251.2G | 1251.2S | 658.1G | 658.1S |
|---|---|---|---|---|---|---|
| Assigned | 21298779 | 19203813 | 23927215 | 20277649 | 21917507 | 27799505 |
| Unassigned_Ambiguity | 231677 | 212367 | 246618 | 210001 | 292053 | 256487 |
| Unassigned_MultiMapping | 146004332 | 162822044 | 159017518 | 147923500 | 116264627 | 154408557 |
| Unassigned_NoFeatures | 33888112 | 33895039 | 35942035 | 29565235 | 24630110 | 39141330 |
| Unassigned_Unmapped | 6568710 | 4145532 | 5041164 | 5343282 | 37451751 | 4849700 |
| Unassigned_MappingQuality | 0 | 0 | 0 | 0 | 0 | 0 |
| Unassigned_FragmentLength | 0 | 0 | 0 | 0 | 0 | 0 |
| Unassigned_Chimera | 0 | 0 | 0 | 0 | 0 | 0 |
| Unassigned_Secondary | 0 | 0 | 0 | 0 | 0 | 0 |
| Unassigned_Nonjunction | 0 | 0 | 0 | 0 | 0 | 0 |
| Unassigned_Duplicate | 0 | 0 | 0 | 0 | 0 | 0 |

So most of my reads were assigned as multimappers, whereas STAR did not count them as such =/.
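To put numbers on that, the share of each category per sample can be computed from the summary. The sketch below assumes the `.summary` file is tab-delimited with a `Status` header row; the function name `category_shares` is made up for illustration, and the example text holds only two of the sample columns from the table above, with the all-zero rows left out.

```python
import csv
import io


def category_shares(summary_text):
    """Map each sample to {status: share of that sample's column total}."""
    rows = [r for r in csv.reader(io.StringIO(summary_text), delimiter="\t") if r]
    samples = rows[0][1:]
    counts = {row[0]: [int(v) for v in row[1:]] for row in rows[1:]}
    # Column totals: every entry in the summary for one sample.
    totals = [sum(column) for column in zip(*counts.values())]
    return {
        sample: {status: values[i] / totals[i] for status, values in counts.items()}
        for i, sample in enumerate(samples)
    }


# Two columns copied from the summary above (zero-only rows omitted).
EXAMPLE = "\n".join([
    "Status\t1251.1G\t658.1G",
    "Assigned\t21298779\t21917507",
    "Unassigned_Ambiguity\t231677\t292053",
    "Unassigned_MultiMapping\t146004332\t116264627",
    "Unassigned_NoFeatures\t33888112\t24630110",
    "Unassigned_Unmapped\t6568710\t37451751",
])

if __name__ == "__main__":
    for sample, shares in category_shares(EXAMPLE).items():
        for status, share in shares.items():
            print(f"{sample}\t{status}\t{share:.1%}")
```

For 1251.1G this puts Unassigned_MultiMapping at roughly 70% of the entries and Assigned at roughly 10%.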

Thanks a lot for your help.

Do you think a de novo assembly pipeline would be useful as a backup plan to work around this problem?

**shi** · 08-25-2016, 03:30 PM

So most of my reads were assigned as multimappers, whereas STAR did not count them as such =/.

featureCounts uses the 'NH' tag to identify multi-mapping reads. So your mapping results contain a lot of reads for which this tag has a value greater than 1.
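One way to check this directly is to tally the NH values in the aligner's SAM output. The following is a minimal sketch that works on plain SAM text lines (for a BAM file, the lines could come from `samtools view`); the function names and the example records are invented for illustration, and it counts alignment records, not distinct reads.

```python
def nh_value(sam_line):
    """Return the NH tag of one SAM alignment line, or None if it is absent."""
    fields = sam_line.rstrip("\n").split("\t")
    for field in fields[11:]:  # optional tags start at the 12th column
        if field.startswith("NH:i:"):
            return int(field[len("NH:i:"):])
    return None


def tally_nh(sam_lines):
    """Count alignment records with NH == 1, NH > 1, or no NH tag at all."""
    tally = {"unique": 0, "multi": 0, "no_NH_tag": 0}
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):  # skip blanks and header
            continue
        nh = nh_value(line)
        if nh is None:
            tally["no_NH_tag"] += 1
        elif nh > 1:
            tally["multi"] += 1
        else:
            tally["unique"] += 1
    return tally


# Made-up records: 11 mandatory SAM fields followed by optional tags.
EXAMPLE_SAM = [
    "@HD\tVN:1.4\tSO:coordinate",
    "r1\t0\tchr1\t100\t255\t5M\t*\t0\t0\tACGTA\tIIIII\tNH:i:1\tHI:i:1",
    "r2\t0\tchr1\t200\t3\t5M\t*\t0\t0\tACGTA\tIIIII\tNH:i:2\tHI:i:1",
    "r2\t256\tchr2\t300\t3\t5M\t*\t0\t0\tACGTA\tIIIII\tNH:i:2\tHI:i:2",
    "r3\t0\tchr1\t400\t255\t5M\t*\t0\t0\tACGTA\tIIIII",
]

if __name__ == "__main__":
    print(tally_nh(EXAMPLE_SAM))
```

A large "multi" count relative to "unique" would be consistent with the Unassigned_MultiMapping numbers in the summary.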

**Sebastian_Quezada_R** · 08-25-2016, 06:03 PM

Originally posted by shi

featureCounts uses the 'NH' tag to identify multi-mapping reads. So your mapping results contain a lot of reads for which this tag has a value greater than 1.

Thanks, Shi. I finally understand it properly. I'm going to start another thread about the issues I'm having with my data, to see if I can get some insight that my bioinformatician colleagues might have missed.

Thanks again =)

**Michael.Ante** · 08-25-2016, 10:41 PM

Hi Sebastian,

The latest versions of STAR report the NH field. In older versions you should be able to report the number of hits with the appropriate parameter setting.

Inconsistent results between mapping and read summarization