**viral2143** · 02-11-2015, 12:27 PM

Did you ever find a solution to this? I am having the same problem for both 454 and Illumina MiSeq data. Thank you!

**GenoMax** · 02-11-2015, 12:32 PM

Have you written to the author(s) directly? You probably have a better chance of getting a resolution that way.

**Richard Finney** · 02-11-2015, 12:59 PM

What is the command you are using?

**viral2143** · 02-11-2015, 01:04 PM

Yes, I have contacted the authors and am waiting to hear back.

The command runs the PredictHaplo executable on the config file:
PredictHaplo-Paired config.txt

To give context, I align my reads using the Mosaik aligner. For my 454 data I know the problem can be resolved by using the bwa aligner instead. However, I would like to know how to run PredictHaplo on the SAM files produced by Mosaik.

Thanks for your help!

**NeoneX** · 02-11-2015, 03:19 PM

Hi,

I've emailed the authors directly before, but I have never gotten a reply from them. I emailed them first before I posted the question in this forum.

In the end, no. Unfortunately I still have no idea why it doesn't work on my data set. So instead of PredictHaplo, I switched to QuasiRecomb (https://github.com/armintoepfer/QuasiRecomb/).

I wasn't able to understand why the figures reported in the output looked the way they did. Recalling from before:

After parsing the reads in file /home/DataFiles/PredictHaplo_Files/087.sam: average read length= -nan
First read considered in the analysis starts at position 100000. Last read ends at position 0
There are 0 reads

Apologies for not solving the problem, but I decided I had to move on to something else otherwise I could be stuck for a long time hahahaha

If anyone does understand what's happening, I am still very interested in finding out. >_<
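
One possible reading of those figures, offered as a guess and not taken from the PredictHaplo source: they are what a summary loop prints when no read survives filtering. The minimum start position stays at a large initial value, the maximum end stays at 0, and the mean read length is 0 divided by 0, which shows up as nan. The Python sketch below reproduces that pattern; the function name and the sentinel defaults (chosen to match the log above) are assumptions for illustration only.

```python
import math

def summarize(reads, start_sentinel=100000, end_sentinel=0):
    """Summarize (start, end) positions of reads that passed filtering.

    The sentinel defaults mirror the numbers in the log above; whether
    PredictHaplo really initializes its counters this way is an assumption.
    """
    first_start = start_sentinel
    last_end = end_sentinel
    total_length = 0
    for start, end in reads:
        first_start = min(first_start, start)
        last_end = max(last_end, end)
        total_length += end - start + 1
    # With zero reads the mean is undefined, hence "nan" in the output.
    mean_length = total_length / len(reads) if reads else math.nan
    return mean_length, first_start, last_end, len(reads)

if __name__ == "__main__":
    print(summarize([]))                      # nothing passed the filters
    print(summarize([(9, 108), (20, 124)]))   # two reads passed
```

If that reading is right, the question becomes why every read was rejected, not why the statistics look odd.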

**viral2143** · 02-18-2015, 11:54 AM

Thank you. I am now using QuasiRecomb as well and am having an issue detecting paired reads.

I run:

java -jar QuasiRecomb.jar -i alignment.sorted.bam

and get the following:

00:01:42 Parsing done
00:01:42 Start pairing
00:01:56 End pairing
00:01:56 Begin sorting
00:01:57 Finished sorting
00:01:57 Modifying reads 100%
00:01:59 Computing entropy 100%
00:02:00 Allel frequencies 100%
00:02:00 Alignment entropy 0.082
00:02:00 Unique reads 330664
00:02:00 Paired reads 0
00:02:00 Insert size 146 (±220)
00:02:00 Merged reads 305158

When I check my alignment with samtools, I do find properly paired mates:

samtools flagstat alignment.sorted.bam
2642674 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
1743911 + 0 mapped (65.99%:-nan%)
2642674 + 0 paired in sequencing
1321337 + 0 read1
1321337 + 0 read2
143036 + 0 properly paired (5.41%:-nan%)
1679356 + 0 with itself and mate mapped
64555 + 0 singletons (2.44%:-nan%)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)

Are you able to use QuasiRecomb to detect paired mates?
Thanks again.
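
To see what a tool reading the records directly would find, the pairing bits can be tallied straight from the SAM FLAG column. The sketch below uses the flag bits defined in the SAM specification; it is not how QuasiRecomb decides pairing (that logic is not described in this thread), and the function name and the `no_rnext` counter are illustrative choices. It also does not exclude secondary or supplementary records, so the totals may differ slightly from `samtools flagstat`.

```python
from collections import Counter

# FLAG bits as defined in the SAM specification.
FLAGS = {
    "paired": 0x1,
    "proper_pair": 0x2,
    "unmapped": 0x4,
    "mate_unmapped": 0x8,
    "read1": 0x40,
    "read2": 0x80,
}

def tally_flags(sam_lines):
    """Count pairing-related FLAG bits over the alignment lines of a SAM file."""
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        counts["total"] += 1
        for name, bit in FLAGS.items():
            if flag & bit:
                counts[name] += 1
        # Paired read where both the read and its mate are mapped.
        if flag & 0x1 and not flag & 0x4 and not flag & 0x8:
            counts["both_mapped"] += 1
        # Records carrying no mate reference name at all.
        if fields[6] == "*":
            counts["no_rnext"] += 1
    return counts

if __name__ == "__main__":
    # Convert first with: samtools view -h alignment.sorted.bam > alignment.sam
    with open("alignment.sam") as handle:
        for name, value in sorted(tally_flags(handle).items()):
            print(name, value)
```

A large `no_rnext` count next to a non-zero `proper_pair` count would point at inconsistent mate fields in the aligner output.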

**Luiky** · 03-20-2015, 03:38 AM

Hello everyone,

I solved that issue by changing "%min_readlength" in the configuration file. It is set to 220 by default, but my Illumina HiSeq reads are only 100 nt long, so every read was being filtered out.

After changing this parameter, PredictHaplo worked perfectly.
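
A quick way to check whether this applies to your data is to compare the read lengths in the SAM file against the configured minimum. The sketch below is a minimal Python example that takes the read length to be the length of the SEQ column; PredictHaplo may measure length differently (for instance after clipping), and the function name and the file name in the usage part are hypothetical.

```python
import statistics

def read_length_report(sam_lines, min_readlength):
    """Report how many mapped reads are at least min_readlength long."""
    lengths = []
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag, seq = int(fields[1]), fields[9]
        if flag & 0x4 or seq == "*":
            continue  # unmapped read or no stored sequence
        lengths.append(len(seq))
    return {
        "mapped": len(lengths),
        "median_length": statistics.median(lengths) if lengths else None,
        "passing": sum(1 for n in lengths if n >= min_readlength),
    }

if __name__ == "__main__":
    with open("alignment.sam") as handle:  # hypothetical file name
        print(read_length_report(handle, min_readlength=220))
```

If `passing` is 0 while `mapped` is large, lowering `%min_readlength` below the median read length is the first thing to try.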

**rjorton** · 02-11-2016, 05:22 AM

We had the same issue, and it looks like it was down to the SAM file format. Our reads were originally aligned with bowtie2, which gave the PredictHaplo error, but using bwa instead resolved it.

**bede** · 03-01-2016, 07:48 AM

Hi everyone,
I found this thread after testing Bowtie2 and PredictHaplo.

Using BWA I am having similar issues – only a tiny proportion of reads are being recognised by PredictHaplo. In this test case of 2x150 NextSeq viral sequences, only 154 of the 70k mapped reads in this subsampled SAM are detected according to the output (see below). In Tablet everything looks fine with the SAM and the pairings are recognised.

I have even tried an older build of BWA to see if an update might have caused the issue. I don't have any strange characters or line endings in my reference sequence, and am at a loss as to what could be causing this issue.

Does anyone have any ideas? Has anyone had responses from the authors?

bede@ubuntu:~/ph/PredictHaplo-Paired-0.4$ ./PredictHaplo-Paired config_test
config_test
0 hrv_21_sub_
0 % filename of reference sequence (FASTA)
1 /home/bede/hrv_21/hrv1b.cns.fa
1 % do_visualize (1 = true, 0 = false)
2 1
2 % filname of the aligned reads (sam format)
3 /home/bede/hrv_21/SM_21A_S14.1pc.bwa_old.sam
3 % have_true_haplotypes (1 = true, 0 = false)
4 1
4 % filname of the true haplotypes (MSA in FASTA format) (fill in any dummy filename if there is no "true" haplotypes)
5 truehaps.fasta
5 % do_local_analysis (1 = true, 0 = false) (must be 1 in the first run)
6 1
6 % max_reads_in_window;
7 10000
7 % entropy_threshold
8 4e-2
8 %reconstruction_start
9 9
9 %reconstruction_stop
10 6950
10 %min_mapping_qual
11 20
11 %min_readlength
12 50
12 %max_gap_fraction (relative to alignment length)
13 0.05
13 %min_align_score_fraction (relative to read length)
14 0.35
14 %alpha_MN_local (prior parameter for multinomial tables over the nucleotides)
15 25
15 %min_overlap_factor (reads must have an overlap with the local reconstruction window of at least this factor times the window size)
16 0.85
16 %local_window_size_factor (size of local reconstruction window relative to the median of the read lengths)
17 0.7
17 % max number of clusters (in the truncated Dirichlet process)
18 25
18 % MCMC iterations
19 501
19 % include deletions (0 = no, 1 = yes)
20 1
20
rm: cannot remove ‘hrv_21_sub_*.fas’: No such file or directory
rm: cannot remove ‘hrv_21_sub_*.lab’: No such file or directory
rm: cannot remove ‘hrv_21_sub_*.reads’: No such file or directory
rm: cannot remove ‘hrv_21_sub_*.html’: No such file or directory
rm: cannot remove ‘hrv_21_sub_*.pgm’: No such file or directory
After parsing the reads in file /home/bede/hrv_21/SM_21A_S14.1pc.bwa_old.sam: average read length= 104.409 154
First read considered in the analysis starts at position 9. Last read ends at position 6950
There are 154 reads
Median of read lengths: 104.500
Local window size: 73
Minimum overlap of reads to local analysis windows: 62
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
Aborted (core dumped)
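
Since the config above applies several read filters before the analysis starts, one way to narrow this down is to tally which threshold each read would fail first. The sketch below approximates three of them (`%min_mapping_qual`, `%min_readlength`, `%max_gap_fraction`) from the SAM columns. How PredictHaplo actually computes them is not documented in this thread, so the gap-fraction formula (inserted plus deleted bases over alignment length) and all function names are assumptions; `%min_align_score_fraction` and the reconstruction window are not modelled.

```python
import re
from collections import Counter

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def first_failed_filter(fields, min_mapq=20, min_len=50, max_gap_fraction=0.05):
    """Return the name of the first filter a SAM record fails, or 'pass'."""
    flag, mapq = int(fields[1]), int(fields[4])
    cigar, seq = fields[5], fields[9]
    if flag & 0x4 or cigar == "*":
        return "unmapped"
    if mapq < min_mapq:
        return "mapq"
    if len(seq) < min_len:
        return "length"
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Assumed definition: alignment length counts matches plus indels.
    aligned = sum(n for n, op in ops if op in "M=XID")
    gaps = sum(n for n, op in ops if op in "ID")
    if aligned and gaps / aligned > max_gap_fraction:
        return "gaps"
    return "pass"

def tally_filters(sam_lines, **thresholds):
    """Count, per filter, how many alignment records it rejects first."""
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 11:
            counts[first_failed_filter(fields, **thresholds)] += 1
    return counts

if __name__ == "__main__":
    with open("/home/bede/hrv_21/SM_21A_S14.1pc.bwa_old.sam") as handle:
        print(tally_filters(handle, min_mapq=20, min_len=50, max_gap_fraction=0.05))
```

If nearly all reads come out as `pass` here while PredictHaplo still keeps only 154, the loss is more likely due to the unmodelled filters or to how the SAM file is parsed than to these thresholds.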

Predict Haplo 1.0 Issues