**deKoch13** · 03-11-2019, 01:19 AM

more details

Maybe I should add some information:
We took one sample and generated the BAM files using three different pipelines.
At the moment, we are only interested in the read names (first column of the BAM files) and want to find out which reads are present in all BAM files, which are present in file 1, file 2, file 3...

**GenoMax** · 03-11-2019, 06:06 AM

If your aim is just to find which reads are present in all three files, you could simply get the names (field 1, as you already note), sort | uniq them in bash, and do a "comm" comparison of the three results.

**deKoch13** · 03-11-2019, 06:22 AM

progress

Thank you for the answer!

I already extracted the read names from all files separately using:

> samtools sort -n bam_filename | samtools view | awk -F "\t" '{print $1}' > output_filename

Now, my supervisor suggested using Python to do the rest of the task...
Or can you recommend another possibility?

Greetings

**deKoch13** · 03-11-2019, 06:28 AM

I looked the "comm" command up. Sounds promising, but I am not sure if this works for such big data files with > 1 Million reads. Do you have an idea for a smart python-based solution?

Nevertheless, I will also try it using comm.

Greetings

**GenoMax** · 03-11-2019, 06:37 AM

If this is an assignment, then use what you have to, but comm should work (as long as you have enough RAM available), since you are working with only read names (if you are not, then you should be).
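
For the Python route asked about above, one possible approach is to load each read-name file into a set and group names by the combination of files they appear in. This is only a minimal sketch, not a tested pipeline: the function names `load_names` and `classify_reads` and the labels and file names in the usage note are made up for illustration, and it assumes one read name per line (as produced by the awk step) and that all names fit in memory. Because both mates of a paired-end read share the same name, a set collapses them into one entry.

```python
from collections import defaultdict


def load_names(path):
    """Read one read name per line into a set; duplicate names collapse."""
    with open(path) as handle:
        return {line.rstrip("\n") for line in handle if line.strip()}


def classify_reads(name_sets):
    """Group read names by the exact combination of files containing them.

    name_sets maps a label (hypothetical, e.g. "pipeline1") to a set of
    read names. The result maps a frozenset of labels to the names that
    occur in exactly those files and in no others.
    """
    membership = defaultdict(set)
    for label, names in name_sets.items():
        for name in names:
            membership[name].add(label)

    groups = defaultdict(set)
    for name, labels in membership.items():
        groups[frozenset(labels)].add(name)
    return dict(groups)
```

Usage could look like the following, with placeholder file names standing in for the three outputs of the extraction command:

```python
name_sets = {
    "pipeline1": load_names("names_1.txt"),
    "pipeline2": load_names("names_2.txt"),
    "pipeline3": load_names("names_3.txt"),
}
for labels, names in classify_reads(name_sets).items():
    print(sorted(labels), len(names))
```

The group keyed by all three labels holds the reads shared by every pipeline; the single-label groups hold the reads unique to one file.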


Working with BAM files
