Is anyone aware of a tool that can convert a pileup file to a BAM file?
-
There must be an easier way to get what you want than resurrecting a BAM from a pileup. You could look at the files upstream of the pileup and see if they can be translated to BAM format, or, on the other hand, go forward to create a VCF (or some other BED/GFF file) from the pileup and annotate that.
-
I find myself in this situation using legacy data (e.g., original sequencing files from an old project were lost but pileups are still available). I wanted to bump this thread to see if anyone has come up with a good solution to this problem. Are there any utilities out there that can convert a pileup to a BAM or something like it?
-
You need to be more specific about the pileup format that is used. A text-based format like the output of 'samtools mpileup' would be easier to convert to SAM than a graphical image, for example. But even given the huge amount of information that could be extracted from 'samtools mpileup', it doesn't store everything. For example, the read names are not preserved, and no pairing information is retained.
Any pileup output which is nothing more than coverage values without linking between different bases will not be sufficient for generating a source SAM file.
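To make concrete what the text format does keep per read, here is a minimal sketch that splits the read-bases column (the fifth column) of one 'samtools mpileup' line into one entry per read. It relies on the markers described in the samtools documentation: '^' plus one mapping-quality character at a read start, '$' at a read end, and '+N<seq>' / '-N<seq>' for indels. The names PileupBase and parse_read_bases are made up for this illustration and are not part of samtools.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PileupBase:
    symbol: str                   # '.', ',', a base letter, '*', '>' or '<'
    starts_read: bool = False     # True if preceded by '^' + mapping quality
    mapq: Optional[int] = None    # only known where the read starts
    ends_read: bool = False       # True if followed by '$'
    indel: Optional[str] = None   # e.g. '+2AG' or '-1T', as written in the file


def parse_read_bases(column: str) -> List[PileupBase]:
    """Split the read-bases column of one pileup line into per-read entries."""
    entries: List[PileupBase] = []
    i, n = 0, len(column)
    while i < n:
        starts, mapq = False, None
        if column[i] == "^":
            # The character right after '^' encodes mapping quality (ASCII - 33).
            starts, mapq = True, ord(column[i + 1]) - 33
            i += 2
        entry = PileupBase(symbol=column[i], starts_read=starts, mapq=mapq)
        i += 1
        # Optional suffixes that belong to the same read: an indel and/or '$'.
        while i < n and column[i] in "+-$":
            if column[i] == "$":
                entry.ends_read = True
                i += 1
            else:
                j = i + 1
                while j < n and column[j].isdigit():
                    j += 1
                # The number gives the indel length, so read exactly that many bases.
                end = j + int(column[i + 1:j])
                entry.indel = column[i:end]
                i = end
        entries.append(entry)
    return entries
```

For example, parse_read_bases("^I.,$A+2AGg") gives four entries: a forward-strand match that starts a read with mapping quality 40, a reverse-strand match that ends a read, a mismatch 'A' followed by a 2-base insertion, and a reverse-strand mismatch 'g'. Strand, mismatches, indels, and read start/end are recoverable this way; read names and mate information are not.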
-
That's a good point. I'm talking about the text-based format, and the experiment was run single-end, so pairing isn't important. It's true that I wouldn't have the original read names, but naively that doesn't seem very important: is there software that actually uses those, other than for read pairing?
That said, I suppose that I hadn't considered whether the pileup format explicitly maintains the individual bases within reads. I had always assumed that the base calls at each position were ordered readwise, but it occurs to me that this may not be the case.
Anyway, it sounds like a tool to do this conversion probably doesn't exist. Thank you for your input.
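If the base calls do turn out to be ordered readwise, the linking could be sketched roughly as below. This is only an illustration under assumptions that would need checking against the actual files: reads still open from the previous line appear first and in the same relative order, newly started reads ('^') are appended after them, and every open read contributes exactly one entry on every line it spans (base-quality or depth filtering at pileup time can break that). Insertions are skipped and deleted or skipped reference bases are written as '-', so the output is a gapped sequence in reference orientation, not a full SAM record. The function names are made up for this example.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def tokens(column: str) -> Iterator[Tuple[bool, int, str, bool]]:
    """Yield (starts_read, mapq, symbol, ends_read) per read; indels are skipped."""
    i, n = 0, len(column)
    while i < n:
        starts, mapq = False, -1
        if column[i] == "^":                  # '^' + one mapping-quality character
            starts, mapq = True, ord(column[i + 1]) - 33
            i += 2
        symbol = column[i]
        i += 1
        ends = False
        while i < n and column[i] in "+-$":
            if column[i] == "$":
                ends = True
                i += 1
            else:                             # '+N<seq>' or '-N<seq>': skip N bases
                j = i + 1
                while j < n and column[j].isdigit():
                    j += 1
                i = j + int(column[i + 1:j])
        yield starts, mapq, symbol, ends


def rebuild_reads(lines: Iterable[str]) -> List[Dict]:
    """Follow each read across consecutive pileup lines (ordering is ASSUMED stable)."""
    open_reads: List[Dict] = []
    done: List[Dict] = []
    for line in lines:
        chrom, pos, ref, depth, bases = line.rstrip("\n").split("\t")[:5]
        if int(depth) == 0:                   # zero-depth lines carry no reads
            continue
        still_open: List[Dict] = []
        slot = 0                              # index into reads open from the previous line
        for starts, mapq, symbol, ends in tokens(bases):
            if starts:
                read = {"chrom": chrom, "start": int(pos), "mapq": mapq,
                        "reverse": symbol == "," or symbol.islower(), "seq": []}
            else:
                read = open_reads[slot]
                slot += 1
            if symbol in ".,":
                read["seq"].append(ref.upper())     # match: take the reference base
            elif symbol in "*#<>":
                read["seq"].append("-")             # deletion or reference skip
            else:
                read["seq"].append(symbol.upper())  # mismatch: base is given explicitly
            (done if ends else still_open).append(read)
        open_reads = still_open
    done.extend(open_reads)                   # reads that never saw a '$'
    return [{**r, "seq": "".join(r["seq"])} for r in done]
```

On a four-line toy pileup where one forward read covers positions 1 to 3 and one reverse read covers positions 2 to 4, this returns two records, each with a start position, mapping quality, strand, and gapped sequence. The quality column could be carried along the same way. Writing SAM from these records would still require inventing read names and building CIGAR strings from the indel tokens.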