I have a BWA alignment of reads to a small (50 kb) reference sequence. It is a transgenic sequence inserted into a host cell genome, and I want to locate the insert's position in that genome. There are reads at the ends of the reference (pointing outwards) whose mates are unmapped. These mates presumably lie in the flanking genomic sequence that I want to identify. Is there an easy way to get the unmapped mates? I suppose I could make a list of the read names and write a script to pull them out of the original FASTQ files, but I am hoping there is a tool already available for this (seemingly common) purpose. Any help would be greatly appreciated.
You could parse the BAM file for unmapped reads whose mates mapped close to your boundaries in the correct orientation.
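A minimal sketch of that filter, assuming the alignment has been dumped to SAM text (for example with `samtools view -h`), that the library is a standard inward-facing paired-end one, and that the aligner records each unmapped mate at its mapped partner's coordinates (PNEXT), as BWA normally does. The function name, the 500 bp window, and the tuple layout are illustrative choices, not part of any tool.

```python
# SAM FLAG bits used below (from the SAM specification)
UNMAPPED = 0x4        # this read is unmapped
MATE_UNMAPPED = 0x8   # its mate is unmapped
MATE_REVERSE = 0x20   # its mate aligned to the reverse strand


def unmapped_mates_near_ends(sam_lines, ref_len, window=500):
    """Return (read name, end, sequence) for unmapped reads whose mapped
    mate sits within `window` bp of either end of the reference and
    points outwards. `window` is an assumed value; tune it to the
    library's insert size."""
    hits = []
    for line in sam_lines:
        if line.startswith("@"):          # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:              # not a complete SAM record
            continue
        flag = int(fields[1])
        # keep only unmapped reads whose mate is mapped
        if not flag & UNMAPPED or flag & MATE_UNMAPPED:
            continue
        mate_pos = int(fields[7])         # PNEXT: leftmost position of the mate
        mate_reverse = bool(flag & MATE_REVERSE)
        # outward-facing at the left end means the mate is on the reverse strand;
        # outward-facing at the right end means it is on the forward strand
        at_left = mate_reverse and mate_pos <= window
        at_right = (not mate_reverse) and mate_pos > ref_len - window
        if at_left or at_right:
            hits.append((fields[0], "left" if at_left else "right", fields[9]))
    return hits
```

The returned sequences can then be aligned to the host genome. If only the strand-agnostic set is needed, `samtools view -f 4 -F 8` selects the same "unmapped read, mapped mate" records without the position and orientation checks.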
You could also align the FASTQ files to both the 50 kb sequence and the host genome, then filter for reads that aligned to the host whose mates aligned to the insert. That's probably the best solution: you'd want the mapping positions of the reads that aligned to the host anyway, and this way you'd already have them.
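As a rough sketch of that second filter, assuming the reads were aligned to a single combined reference in which the 50 kb sequence is a contig with a known name (the name `insert` here is hypothetical) and the result is available as SAM text. The function name and output layout are illustrative only.

```python
def host_reads_with_insert_mates(sam_lines, insert_name="insert"):
    """Return (read name, host contig, position) for reads mapped to a
    host contig whose mate mapped to the insert contig.
    `insert_name` is an assumed contig name; use whatever the combined
    reference actually calls the transgenic sequence."""
    # skip unmapped reads, reads with unmapped mates, and
    # secondary (0x100) or supplementary (0x800) alignments
    skip = 0x4 | 0x8 | 0x100 | 0x800
    hits = []
    for line in sam_lines:
        if line.startswith("@"):          # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:              # not a complete SAM record
            continue
        flag = int(fields[1])
        if flag & skip:
            continue
        rname, rnext = fields[2], fields[6]
        if rnext == "=":                  # "=" means the mate is on the same contig
            rnext = rname
        if rname != insert_name and rnext == insert_name:
            hits.append((fields[0], rname, int(fields[3])))
    return hits
```

Host positions that cluster on one contig, with mates at the two ends of the insert, would then point to the likely integration site.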
Covid-19
I am confused about the following issue:
Our service provider offers AmpliSeq for Illumina On-Demand, Custom, and Community Panels. For COVID diagnostics, the library was prepared, and the problem started at the sample sheet. The COVID manifest file was added successfully, but the genome was not: we tried our best to integrate the genome file without success. After creating a multipath, the genome was integrated into the sample sheet and the run started. The output was: Q score 93.8, clusters passing 96.7%, cluster density 774K. However, the analysis failed (Sunday), and since then we have tried every method we could with Illumina support, without success. Initially "RNA amplicon" was downloaded and added to the sample sheet, then the sample sheet was headed with "DNA amplicon" (no use), and now "PCR amplicon" has been added to the sample sheet, but we get the same error. Please guide.