**westerman** · 03-10-2011, 09:44 AM

The BAM file should have the insert ranges. But for a quick overview, my understanding is that the lower and upper ranges in the pairing.dat.freq file (not the full file) give the insert range. On the other hand, the recent LifeTech 'pairing_stats_n_clean_bam' program, which is supposedly generating 'official' statistics, gives a different (and smaller) range than the pairing.dat.freq file. Since the new program is looking through the BAM file (and taking forever to do so!) I'd trust it more.
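
For anyone who wants to check the range directly from the BAM, here is a rough sketch rather than anything official. It assumes the BAM has been dumped to SAM text (for example with `samtools view`), that the template length sits in the standard ninth SAM column (TLEN), and that the "properly paired" flag bit (0x2) is a reasonable stand-in for the pairs of interest. Whether Bioscope sets that bit for exactly the pairs it counts in its own statistics is an assumption, and the function name `insert_size_range` is made up for illustration.

```python
def insert_size_range(sam_lines):
    """Return (min, max, count) of positive TLEN values for properly
    paired records, or None if no such record is found.

    sam_lines: any iterable of SAM text lines (header lines are skipped).
    """
    sizes = []
    for line in sam_lines:
        if line.startswith("@"):
            continue  # SAM header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        tlen = int(fields[8])
        # 0x2 = "properly paired"; keep only the positive TLEN so each
        # pair is counted once (its mate carries the negative value).
        if flag & 0x2 and tlen > 0:
            sizes.append(tlen)
    if not sizes:
        return None
    return min(sizes), max(sizes), len(sizes)
```

One way to use it is to write the SAM text to a file and pass the open file handle to `insert_size_range`. If the result does not match the header of the stats file, the pair definition assumed here probably differs from the one the pipeline uses.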

**jbeck** · 03-10-2011, 10:59 AM

Thank you, westerman. Yes, that was exactly my confusion. The pairing.stats file gives "Insert range 62-207" in the header, while the pairing.dat.freq file gives values from 35-207. If I want to take the 'official' numbers for AAA pairs from the pairing.stats, which range do you think is used?

What do you mean by 'new program'?

best regards

Julia

**westerman** · 03-10-2011, 11:16 AM

Ah, you have a 'pairing.stats' file. This indicates that you ran your analysis with Bioscope v1.2, the version before LifeTech took away the stats file. In v1.3 they did away with the stats file but, within the last couple of weeks, they issued a program called 'pairing_stats_n_clean_bam' which restores the stats file and also cleans up the mapped-reads BAM file (which, erroneously, has unmapped reads in it). You should not run the 'pairing_stats_n_clean_bam' program on files from v1.2 and earlier.

In your case just take the range from 'pairing.stats'.

**jbeck** · 03-10-2011, 09:25 PM

Ah, most interesting! This might also explain another observation I made:

I used Picard to remove duplicate reads in the BAM file. Picard reported a number of 'records' that did not match any of the numbers reported in the pairing.stats. I was scratching my head about this, too. Maybe it's best to switch to v1.3; there is too much muddle here.

THX J
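
A record-count mismatch like this can sometimes be narrowed down by tallying the SAM flag bits yourself. The following is only a sketch under assumptions: the input is SAM text (for example from `samtools view`), the flag bits are the standard SAM ones (0x4 unmapped, 0x100 secondary, 0x400 duplicate), and the helper name `tally_records` is hypothetical. It does not reproduce how Picard or Bioscope count internally.

```python
from collections import Counter


def tally_records(sam_lines):
    """Count alignment records by a few standard SAM flag bits.

    sam_lines: any iterable of SAM text lines (header lines are skipped).
    Returns a Counter with keys 'total', 'mapped', 'unmapped',
    'secondary' and 'duplicate' (a key is absent if its count is zero).
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # SAM header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        counts["total"] += 1
        if flag & 0x4:
            counts["unmapped"] += 1
        else:
            counts["mapped"] += 1
        if flag & 0x100:
            counts["secondary"] += 1
        if flag & 0x400:
            counts["duplicate"] += 1
    return counts
```

Comparing `total`, `mapped` and `unmapped` against the figures in the stats file may show which category a tool's 'records' number corresponds to, for instance whether unmapped reads left in the BAM are being counted.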

**jbeck** · 03-11-2011, 09:00 AM

Hey, it's me again.

In the meantime things cleared up. The faulty BAM files were introduced by v1.3. So it's better to stick with v1.2 right now. Right?

I'll take the size range from the pairing.stats! BTW, it matches the values in the pairing.dat.freq. It did not before because I took the wrong file from another library :-( Sorry for the confusion.

The problem with the 'records' count according to Picard still remains. But I will try to figure this out next week.

I'm off for the weekend now.

THX J

**westerman** · 03-11-2011, 01:14 PM

Originally posted by jbeck:

... The faulty BAM files were introduced by v1.3. So it's better to stick with v1.2 right now. Right?

You can use v.1.3 but should run the 'pairing_stats_n_clean_bam' program if you want a clean BAM file. Be aware that said program takes a long time to run. I hesitate to tell someone to not use the latest and greatest version since there should be bug fixes and speed-ups between v1.2 and v1.3.


Mysteries of the Bioscope pairing pipeline
