I aligned RNA-Seq reads to the reference genome. However, reads that span splice junctions will not align. What is the best way to handle this? I've read that aligning to a reference transcriptome is not the answer. Any thoughts? Thanks!
-
You might want to try TopHat/Bowtie.
Alternatively, you can download all mRNA sequences from UCSC and align against that.
Sam
-
So I used TopHat2 to align the data. I used multiple cores but otherwise ran it with the default settings.
I have tumor and normal samples (no replicates, single end).
My accepted_hits.bam for the normal samples is 479 MB, while the unmapped.bam file is 3.4 GB. What can I do about this? I am re-aligning the unmapped reads with BWA, but I'm not sure this is the best approach. Any thoughts? There should not be so few mapped reads for this sample. The tumor samples had an unmapped.bam file of 796 MB and an accepted_hits.bam file of 479 MB. Although this is better, it still seems low.
The normals should actually be better than the tumors due to sample quality.
Thanks in advance for your help!
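A note on comparing the two BAM files: file size is only a rough proxy for read counts, since compression varies and accepted_hits.bam can hold several alignments for a single multi-mapped read. Counting records by SAM flag gives a more direct mapping rate. Below is a minimal Python sketch, assuming the records arrive as SAM text (for example piped from `samtools view -h`). The function name `mapping_summary` is made up for illustration, and it assumes the aligner marks extra alignments with the secondary (0x100) or supplementary (0x800) flag.

```python
def mapping_summary(sam_lines):
    """Count primary mapped and unmapped reads from SAM-format text lines.

    Returns (mapped, unmapped, mapping_rate). Hypothetical helper, not part
    of TopHat or samtools.
    """
    mapped = 0
    unmapped = 0
    for line in sam_lines:
        # Skip blank lines and header lines (headers start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        # Ignore secondary (0x100) and supplementary (0x800) records so each
        # read is counted once (assumes the aligner sets these flags).
        if flag & 0x900:
            continue
        if flag & 0x4:  # 0x4 = read unmapped
            unmapped += 1
        else:
            mapped += 1
    total = mapped + unmapped
    rate = mapped / total if total else 0.0
    return mapped, unmapped, rate


if __name__ == "__main__":
    import sys
    m, u, r = mapping_summary(sys.stdin)
    print(f"mapped={m} unmapped={u} rate={r:.3f}")
```

With TopHat2 output, the unmapped reads sit in a separate file, so the records from both accepted_hits.bam and unmapped.bam would need to be fed in (or the two counts added) to get an overall rate.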
-
Did you quality trim before running the alignment? If not, and many of your reads deteriorate in quality toward the end, that could be the issue. You might also try BLASTing a couple of the unmapped reads just to make sure there's nothing weird going on.
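To illustrate what quality trimming does, here is a minimal Python sketch that strips trailing low-quality bases from a single read. It assumes Phred+33 encoded quality strings and a cutoff of Q20, both of which are assumptions to adjust for your data. The function name `trim_3prime` is made up for illustration, and this simple trailing trim is not the algorithm any particular trimming tool uses.

```python
def trim_3prime(seq, qual, min_q=20, offset=33):
    """Remove trailing bases whose Phred quality is below min_q.

    seq and qual are the sequence and quality strings of one FASTQ record.
    offset=33 assumes Phred+33 encoding. Hypothetical helper for illustration.
    """
    end = len(seq)
    # Walk back from the last base while the quality is under the cutoff.
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


if __name__ == "__main__":
    # 'I' is Q40, '5' is Q20, '#' is Q2 under Phred+33.
    print(trim_3prime("ACGTACGT", "IIIII5##"))
```

Reads that end up very short after trimming are usually dropped before alignment, since they tend to map ambiguously; the length cutoff is a judgment call.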