**GenoMax** · 04-27-2018, 07:48 AM

You made a "new" reference by appending the fasta ERCC sequences to end of human genome and then created the STAR indexes from this hybrid file?

**alekzs** · 04-27-2018, 09:07 AM

Originally posted by GenoMax:

> You made a "new" reference by appending the fasta ERCC sequences to end of human genome and then created the STAR indexes from this hybrid file?

Yes, I added both the FASTA sequences and the GTF annotations and used the hybrid!
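For anyone building the GTF half of such a hybrid reference by hand, here is a minimal Python sketch that derives annotation lines from the spike-in FASTA. It is not what was used in this thread. The function names and the `source` label are made up for illustration, and it emits an `exon` line alongside each `gene` line on the assumption that the counting tool builds its gene list from exon records (as far as I know that is the default for STAR's gene counts and for RSEM).

```python
def fasta_lengths(lines):
    """Yield (name, length) for each record in FASTA-formatted lines."""
    name, length = None, 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, length
            # The sequence name is the first whitespace-delimited token.
            name, length = line[1:].split()[0], 0
        else:
            length += len(line)
    if name is not None:
        yield name, length


def spikein_gtf_lines(fasta_lines, source="ercc"):
    """Build one gene line and one exon line per spike-in sequence.

    Hypothetical helper: each spike-in is treated as a single-exon gene
    on the + strand spanning the whole sequence.
    """
    out = []
    for name, length in fasta_lengths(fasta_lines):
        gene_attrs = f'gene_id "{name}"; gene_name "{name}";'
        exon_attrs = f'gene_id "{name}"; transcript_id "{name}"; gene_name "{name}";'
        for feature, attrs in (("gene", gene_attrs), ("exon", exon_attrs)):
            fields = [name, source, feature, "1", str(length), ".", "+", ".", attrs]
            out.append("\t".join(fields))
    return out
```

The resulting lines can be appended to the genome GTF, provided the first column matches the FASTA header names exactly.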

**GenoMax** · 04-27-2018, 09:12 AM

Then I am inclined to speculate that someone forgot to spike the ERCC aliquots. Unless alignments are not being reported since they fail STAR's multi-mapping threshold. Look into that as well.

Did you make the libraries (and add ERCC)?

**alekzs** · 04-27-2018, 09:25 AM

Originally posted by GenoMax:

> Then I am inclined to speculate that someone forgot to spike the ERCC aliquots. Unless alignments are not being reported since they fail STAR's multi-mapping threshold. Look into that as well.
>
> Did you make the libraries (and add ERCC)?

I did everything myself so chances are 50-50 I guess.

Anyhow, even if I didn't add the spike ins, shouldn't the gene names from the reference appear in a gene count file? Like, normal genes get 0 alignments/counts but they're still in the list, right?
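A quick way to test that expectation is to look for the spike-in IDs in the count table itself. The sketch below assumes the four-column layout of STAR's `ReadsPerGene.out.tab` (gene ID, then unstranded and two stranded counts, with `N_` summary rows at the top). The function name and the default prefix are illustrative only; the IDs in the table come from the `gene_id` attribute of the GTF, so the prefix may need adjusting.

```python
def genes_with_prefix(count_lines, prefix="ERCC-"):
    """Return {gene_id: unstranded_count} for rows whose ID starts with prefix."""
    hits = {}
    for line in count_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2 or fields[0].startswith("N_"):
            continue  # skip blank lines and summary rows such as N_unmapped
        if fields[0].startswith(prefix):
            hits[fields[0]] = int(fields[1])
    return hits
```

Usage would be something like `genes_with_prefix(open("sample_ReadsPerGene.out.tab"))` (file name hypothetical). An empty result means the genes are missing from the annotation the counter used, which is a different problem from genes being listed with zero counts.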

**GenoMax** · 04-27-2018, 10:44 AM

When you added them to the GTF file they were in the correct format?

Are you able to see alignments for them in the BAM file?

**alekzs** · 04-27-2018, 11:30 AM

Originally posted by GenoMax:

> When you added them to the GTF file they were in the correct format?
>
> Are you able to see alignments for them in the BAM file?

```
Tail of FASTA file:
>ERCC-00171 DQ854994 Ac03459967_a1 Ac03460063_a1
CTGGAGATTGTCTCGTACGGTTAAGAGCCTCCGCCCGTCTCTGGGACTATGGACGGGCACGCTCATATCAGGCTATATTTGGTCCGGGTTATTATCGTCGCGGTTACCGTAATACTTCAGATCAGTTAAGTAGGGCCATATGCCTCGGGAATAAGCTGACGGTGACAAGGTTTCCCCCTAATCGAGACGCTGCAATAACACAGGGGCATACAGTAACCAGGCAAGAGTTCAATCGCTTAGTTTCGTGGCGGGATTTGAGGAAAACTGCGACTGTTCTTTAACCAAACATCCGTGCGATTCGTGCCACTCGTAGACGGCATCTCACAGTCACTGAAGGCTATTAAAGAGTTAGCACCCACCATTGGATGAAGCCCAGGATAAGTGACCCCCCCGGACCTTGGAGTTTCATGCTAATCAAAGAAGAGCTAATCCGACGTAAAGTTGCGGCGTTGATTACGCAGGATTGCGACCAAAGAACGAGAAAAAAAAAAAAAAAAAAAAAAAA

Tail of GTF file
>ERCC-00171	ercc	gene	1	506	.	+	.	gene_id "GERCC-00171"; gene_version "1"; gene_name "ERCC-00171"; gene_source "ercc"; gene_biotype "ercc";

samtools view -h 10BTreg02_S290_L003Aligned.sortedByCoord.out.bam ERCC-00171
>NS500597:113:HH5HKBGX5:3:11406:4418:20117	83	ERCC-00171	441	255	9S29M	=	60	-410	ACGACGTAGGTTGCGGCGTTGATTACGCAGGATTGCGA	EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAAAAA	NH:i:1	HI:i:1	AS:i:65	nM:i:0
NS500597:113:HH5HKBGX5:3:21612:3414:12240	89	ERCC-00171	442	255	36M	*	0	0	TTGCGGCGTTGATTACGCAGGATTGCTACCAAAGAA	EAEEEEEEAEAEEEEE/EEEEEEEAEEEEEEAAAAA	NH:i:1	HI:i:1	AS:i:33	nM:i:1
(there's many more lines, finds other ERCC numbers as well)

tail -n 20 10BTreg02_S290_L003ReadsPerGene.out.tab
ENSG00000224240	0	0	0
ENSG00000227629	0	0	0
ENSG00000237917	0	0	0
ENSG00000231514	0	0	0
ENSG00000235857	0	0	0
```

That's all I have to offer.

**r.rosati** · 04-27-2018, 11:31 AM

Here I am with makeshift solutions, but if you make the BAM into a SAM, you can `grep` it to see if the sequences are there.

**alekzs** · 04-27-2018, 11:43 AM

Originally posted by r.rosati:

> Here I am with makeshift solutions, but if you make the BAM into a SAM, you can `grep` it to see if the sequences are there.

ha, that approach was far easier...

grep "ERCC-" 10B02_3.sam -c
6770

So, yes... they are there, just don't end up in any count file.
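Note that a plain `grep -c "ERCC-"` counts every matching line, which can include `@SQ` header lines if the SAM file has a header. A slightly stricter tally, sketched in Python below, counts alignment records per reference name and separately those tagged `NH:i:1` (STAR's marker for a uniquely mapped read). The function name and default prefix are made up for illustration, and the numbers are alignment records, not read pairs.

```python
from collections import Counter


def count_by_reference(sam_lines, prefix="ERCC-"):
    """Tally SAM alignment records whose RNAME starts with prefix.

    Returns (total, unique): two Counters keyed by reference name, where
    `unique` only includes records carrying the NH:i:1 tag.
    """
    total, unique = Counter(), Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line (@HD, @SQ, @PG, ...)
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        rname = fields[2]
        if not rname.startswith(prefix):
            continue
        total[rname] += 1
        if "NH:i:1" in fields[11:]:  # optional tags start at column 12
            unique[rname] += 1
    return total, unique
```

If `unique` is well populated, as the two example records earlier in the thread suggest, the multi-mapping threshold is unlikely to be what keeps the spike-ins out of the count file.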

**r.rosati** · 04-27-2018, 11:53 AM

I meant like grepping for
CTGGAGATTGTCTCGTACGGTTAAGAGCCTCCGCCC
(or any other fragment in the ERCC controls, I copy-pasted the one you wrote in a previous post)

**GenoMax** · 04-27-2018, 11:58 AM

Can you try featureCounts to do the counts? It will not count multi-mapping reads by default.

**arnollito** · 07-30-2018, 04:20 AM

Hi alekzs, how did you solve this issue in the end? Greetings from Switzerland.

**alekzs** · 07-30-2018, 06:10 AM

Originally posted by arnollito:

> Hi alekzs, how did you solve this issue in the end? Greetings from Switzerland.

Yes... I used RSEM for the counting. The index generation with the ERCC-appended hg19 file had failed because the chr labels weren't compatible, so RSEM used an old index without the ERCC genes.
I edited my fused ERCC-hg19 file, re-ran the index step, and then it worked. Hope that helps!
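A check along these lines can catch this kind of label mismatch before building an index. This is a hedged Python sketch, not the fix that was actually applied: it assumes the incompatibility is between sequence names in the FASTA headers and the first column of the GTF (for example `chr1` versus `1`), which the post above does not spell out, and all function names are hypothetical.

```python
def fasta_names(fasta_lines):
    """Collect sequence names (first token after '>') from FASTA lines."""
    names = set()
    for line in fasta_lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                names.add(tokens[0])
    return names


def gtf_seqnames(gtf_lines):
    """Collect the first-column sequence names from GTF lines."""
    names = set()
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        names.add(line.split("\t")[0])
    return names


def label_mismatches(fasta_lines, gtf_lines):
    """Return (names only in the GTF, names only in the FASTA), both sorted."""
    fa, gtf = fasta_names(fasta_lines), gtf_seqnames(gtf_lines)
    return sorted(gtf - fa), sorted(fa - gtf)
```

Names that appear only in the GTF are the ones to worry about, since their annotations point at sequences the index does not contain. Names that appear only in the FASTA are often just unannotated contigs.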

ERCC - no gene counts