#1
Member
Location: Valencia Join Date: Nov 2011
Posts: 44
Hi,
I am trying to test BWA with 454 read data longer than 200 nt, using the bwasw option and hg19 as the indexed reference. BWA generates an output .sai file, which I then try to convert to SAM format, and here is the problem. bwa gives the following message:
[bns_restore_core] fail to open file 'hg19.nt.ann'. Abort!
Aborted
I have no idea why bwa asks for hg19.nt.ann, or what that file is. It is not generated with the other index files when I run the index function, so I am confused. I searched the forum for similar messages and, surprisingly, found very little (almost nothing that clears up my doubt). Can anyone tell me whether this xxx.nt.ann file is a normal output of bwa, and how I can create it so that I can convert a sai file to bam? Thank you in advance.
#2
Senior Member
Location: bethesda Join Date: Feb 2009
Posts: 700
Please provide the commands you used for indexing, generating the sai, and generating the sam with bwa.
The "nt" indicates a color space index. I don't think 454 is color space. You might be using somebody's color space script as a template for your work, in which case you need to modify it.
Last edited by Richard Finney; 01-18-2012 at 07:59 AM.
#3
Member
Location: Valencia Join Date: Nov 2011
Posts: 44
Hi Richard, thank you for your answer.
Indeed, 454 is not color space. Maybe I am doing something wrong, I do not know. I used the extract_sff script to convert sff to fastq and then prinseq to process the fastq:
@GCFF90V02JNZWW
CATTTGTTCACTCATAATAAGAAAGTAGGGAGAGGAGAATGTTAACATACCTATAGATAATACATGCACTGTTCCTGCATGT
+GCFF90V02JNZWW
AB===B>>:::<<<=<<311/,,,242,,,/.89<?=889::ADA===AADFDDAAADDD??????ABBBABB==9:::=BB
@GCFF90V02G5MHK
ATATATGCTTTCATGAGAATGAGAGAGTCCTTCGAGCTGTAG
+GCFF90V02G5MHK
IIIIIIIIHHHIIIIIIFFFFFFFFFFFFFFF===@FFFFDD
Then I used BWA to create the hg19 index (so I did not use -c):
./bwa index -a bwtsw -p hg19 hg19.fa
For the alignment I first used ./bwa aln, and bwa worked, although it aligned only the shortest reads, as might be expected. Then I converted this sai output to bam and had no problems doing that. Next, and here come my troubles, I used bwasw to test bwa with longer reads:
./bwa bwasw -t 4 -f out.sai hg19 454reads7.fastq
bwa generated out.sai, and then I went to samse again to convert this sai, as I had previously done with the one from the shortest reads:
./bwa samse -f out.sam hg19 input.sai input.fastq
That is exactly what I did with the short reads. Any suggestions?
Carlos
#4
Senior Member
Location: bethesda Join Date: Feb 2009
Posts: 700
Where's the aln step for the .sai generation before the bwasw command? The fastq must be the same.
#5
Member
Location: Valencia Join Date: Nov 2011
Posts: 44
And it was. This is what I used:
./bwa aln -t 4 -f out.sai hg19 454reads7.fastq
In fact, I have just repeated the steps and I get the same result. I copy the commands here.
Using aln:
cllorens@biotechvana:~/assembling/tools/bwa/bwa-0.5.9> ./bwa aln -t 4 -f destruye.sai hg19 454reads7.fastq
[bwa_aln] 17bp reads: max_diff = 2
[bwa_aln] 38bp reads: max_diff = 3
[bwa_aln] 64bp reads: max_diff = 4
[bwa_aln] 93bp reads: max_diff = 5
[bwa_aln] 124bp reads: max_diff = 6
[bwa_aln] 157bp reads: max_diff = 7
[bwa_aln] 190bp reads: max_diff = 8
[bwa_aln] 225bp reads: max_diff = 9
[bwa_aln_core] calculate SA coordinate... 1048.13 sec
[bwa_aln_core] write to the disk... 1039.00 sec
[bwa_aln_core] 218634 sequences have been processed.
Then the sai to bam conversion:
cllorens@biotechvana:~/assembling/tools/bwa/bwa-0.5.9> ./bwa samse -f destruyeme.sam hg19 destruye.sai 454reads7.fastq
[bwa_aln_core] convert to sequence coordinate... 4.05 sec
[bwa_aln_core] refine gapped alignments... 17.34 sec
[bwa_aln_core] print alignments... 1.11 sec
[bwa_aln_core] 218634 sequences have been processed.
Now, if I use bwasw with the same fastq:
cllorens@biotechvana:~/assembling/tools/bwa/bwa-0.5.9> ./bwa bwasw -t 4 -f destruye2.sai hg19 454reads7.fastq
[bsw2_aln] read 29176 sequences (10000406 bp)...
[bsw2_aln] read 28182 sequences (10000061 bp)...
[bsw2_aln] read 29264 sequences (10000170 bp)...
[bsw2_aln] read 30374 sequences (10000003 bp)...
[bsw2_aln] read 31893 sequences (10000054 bp)...
[bsw2_aln] read 33994 sequences (10000276 bp)...
[bsw2_aln] read 35751 sequences (9642318 bp)...
cllorens@biotechvana:~/assembling/tools/bwa/bwa-0.5.9> ls
And now, using the sai generated in this case:
cllorens@biotechvana:~/assembling/tools/bwa/bwa-0.5.9> ./bwa samse -f destruye2.sam hg19 destruye2.sai 454reads7.fastq
[bns_restore_core] fail to open file 'hg19.nt.ann'. Abort!
Aborted
Any idea?
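Editor's note: one thing the two runs above make easy to test. As far as I know, bwa bwasw writes its alignments directly as SAM text, whereas samse expects the binary .sai that bwa aln produces. If that holds for this version (treat it as an assumption to verify against the bwa manual), destruye2.sai may already be a SAM file despite its name. The sketch below is a hypothetical helper, not part of bwa; the name looks_like_sam and the first-line heuristic are my own.

```sh
# Hypothetical helper, not part of bwa: guess whether a file is SAM text.
# Heuristic (an assumption): the first line is either a SAM header line
# (@HD, @SQ, @RG, @PG, @CO) or an alignment record with at least the
# 11 mandatory tab-separated SAM columns.
looks_like_sam() {
    # $1 = file to inspect; exit status 0 means "looks like SAM"
    LC_ALL=C awk -F '\t' '
        NR == 1 {
            if ($0 ~ /^@(HD|SQ|RG|PG|CO)(\t|$)/ || NF >= 11) found = 1
            exit
        }
        END { exit(found ? 0 : 1) }
    ' "$1"
}
```

For instance, looks_like_sam destruye2.sai && echo "already SAM text". If the message is printed, the samse step would not apply to the bwasw output, and giving the file a .sam name in the bwasw call (same -f option as above) would make that explicit.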
#6
Senior Member
Location: bethesda Join Date: Feb 2009
Posts: 700
Check your read lengths.
This does not explain the error message, but please check the notes by Heng Li (author of BWA) here: http://bio-bwa.sourceforge.net/
Does BWA align 454 reads?
Yes and no. The BWA-SW component of BWA works well on 454 reads about 200bp or longer. It achieves similar alignment accuracy to SSAHA2 while much faster. BWA-SW also works for shorter reads, but the sensitivity is lower. In addition, BWA-SW does not support paired-end alignment.
What is maximum query sequence length in alignment?
It is recommended to only use bwa-short on reads shorter than 200bp.
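Editor's note: a quick way to act on "check your read lengths" is to count how many reads fall below and at or above the 200 bp mark quoted above. This is a minimal sketch; the function name count_by_length is made up for illustration, and it assumes a plain FASTQ with exactly four lines per record (no wrapped sequence lines).

```sh
# Hypothetical helper: count reads below and at/above a length cutoff.
# Assumes a plain 4-line-per-record FASTQ (no wrapped sequence lines).
count_by_length() {
    # $1 = FASTQ file, $2 = cutoff in nt (defaults to 200)
    awk -v cutoff="${2:-200}" '
        NR % 4 == 2 {
            # Line 2 of each record is the sequence.
            if (length($0) < cutoff) n_short++; else n_long++
        }
        END { printf "short=%d long=%d\n", n_short, n_long }
    ' "$1"
}
```

For the file in this thread that would be count_by_length 454reads7.fastq 200, which prints one line with both counts.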
#7
Member
Location: Valencia Join Date: Nov 2011
Posts: 44
There are several sizes, Richard, including 500 nucleotides or even longer (750).
Perhaps the problem is that reads shorter and longer than 200 nt are collected in the same input file. I am going to try separating them into two independent files (shorter and longer than 200 nt) to see what happens. It is just an idea, but let me see whether anything changes.
Carlos
#8
Member
Location: Valencia Join Date: Nov 2011
Posts: 44
Hi,
I did the test: I separated the reads longer and shorter than 200 nt into two different fastq files and then ran bwasw on the fastq with sequences > 200 nt. After doing this I again attempted to convert the output from sai to bam, and again bwa aborted, asking for the indexfile.nt.ann index file. So in my humble opinion this might be a bug in the bwasw algorithm. In fact, the aln option for short reads gives a message like this at the end of the alignment process:
[bwa_aln_core] calculate SA coordinate... 1048.13 sec
[bwa_aln_core] write to the disk... 1039.00 sec
[bwa_aln_core] 218634 sequences have been processed.
The bwasw option does not give such output.
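Editor's note: the post does not say how the reads were separated, so the following is only one possible way to do it, not the poster's method. It is a sketch with a hypothetical function name, split_by_length, and it assumes a plain FASTQ with four lines per record. Reads longer than the cutoff go to one file, everything else to the other.

```sh
# Hypothetical helper: split a FASTQ into two files by read length.
# Assumes a plain 4-line-per-record FASTQ (no wrapped sequence lines).
split_by_length() {
    # $1 = input FASTQ, $2 = cutoff in nt,
    # $3 = output for reads of length <= cutoff,
    # $4 = output for reads of length > cutoff
    awk -v cutoff="$2" -v short_out="$3" -v long_out="$4" '
        NR % 4 == 1 { header = $0 }
        NR % 4 == 2 { seq = $0 }
        NR % 4 == 3 { plus = $0 }
        NR % 4 == 0 {
            # Fourth line (qualities) completes the record: write it out.
            out = (length(seq) > cutoff) ? long_out : short_out
            print header > out
            print seq > out
            print plus > out
            print $0 > out
        }
    ' "$1"
}
```

A call such as split_by_length 454reads7.fastq 200 short.fastq long.fastq would do the split described in the post (the two output names are placeholders). An output file is only created if at least one read lands in it.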
#9
Member
Location: one does not simply approximate location Join Date: Dec 2011
Posts: 10
I had that problem, and it was solved by this method.
When making the index, use -p and -c. For example, if your fasta file is seq.fa, and both the fasta file and the bwa program are located in ~/Desktop/BWA, make sure you use the full path for everything:
~/Desktop/BWA/bwa index -a bwtsw -p ~/Desktop/BWA/seq.fa -c ~/Desktop/BWA/seq.fa
Last edited by mitochy; 01-23-2012 at 02:38 AM.
#10
Member
Location: Valencia Join Date: Nov 2011
Posts: 44
Hi Mitochy,
Thank you for your comment. Perhaps I am wrong, but I think it is not the same problem. -c is for creating color space indexes, and certainly the indexfile.nt.ann file is for color space. The point is that I am using fastq files generated by 454 (i.e. not color space); when I use the bwasw option to create the sai file, it creates it, but it fails later when I try to convert from sai to sam. In my last post I wrote "from sai to bam", but I meant sam.
#11
Junior Member
Location: US Join Date: Apr 2011
Posts: 6
Could you check the definition lines in your reference fasta file (i.e. the one that you are aligning your reads to), and remove any descriptions in these lines?
E.g. if you have lines that look like:
>contig3223 hg19.ann
change them to:
>contig3223
I had the same problem, and doing so should fix it.
Hao
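Editor's note: the header clean-up suggested here can be sketched as a one-line awk filter. The function name strip_fasta_descriptions is hypothetical, and the sketch assumes the sequence name follows the ">" directly, with no space in between.

```sh
# Hypothetical helper: drop everything after the first whitespace on
# FASTA definition lines, leaving sequence lines untouched.
strip_fasta_descriptions() {
    # $1 = input FASTA; the cleaned FASTA goes to standard output
    awk '/^>/ { print $1; next } { print }' "$1"
}
```

Used as strip_fasta_descriptions ref.fa > ref.clean.fa (both file names are placeholders), it turns ">contig3223 hg19.ann" into ">contig3223". The reference would then need to be indexed again from the cleaned file.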
#12
Member
Location: Valencia Join Date: Nov 2011
Posts: 44
Hi Hao,
The reference is the human genome, and the sequences are the individual chromosomes in karyotypic order (i.e. 1, 2, ... 22, X, Y, M), labeled only as >chr1 etc. That is not the problem. Thank you anyway.
#13
Junior Member
Location: San Francisco Join Date: Dec 2012
Posts: 1
Hi cllorens,
Were you ever able to resolve this? I am seeing the same behavior with bwasw. I am using simulated 454 reads. The alignment works properly, but the conversion from sai to sam tries to load a colorspace index. Thanks.
Last edited by 9taylors; 12-31-2012 at 05:43 AM. Reason: removed name
#14
Junior Member
Location: Bielefeld, Germany Join Date: Nov 2010
Posts: 3
Hi,
I had the same problem. Reason: the fasta file was indexed with bwa version 0.6.2, while I tried to run aln and sampe with bwa version 0.5.8. After using the same version for both, the problem disappeared.
Cheers, David
#15
Junior Member
Location: Dallas Join Date: Dec 2011
Posts: 8
I encountered the same problem, and it turned out that there were two versions of bwa on the server and I had used the lower version to generate the index.
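Editor's note: to see whether more than one bwa is installed, one option is to walk PATH and ask each copy for its version. This is a sketch only; list_bwa_versions is a made-up name, and it relies on the assumption that bwa, run without arguments, prints a "Version:" line in its usage text on standard error.

```sh
# Hypothetical helper: list every bwa executable on PATH with its version.
list_bwa_versions() {
    # Walk PATH entry by entry (POSIX sh, no bash-only features).
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        IFS=$old_ifs
        [ -n "$dir" ] || dir=.    # an empty PATH entry means the current directory
        [ -f "$dir/bwa" ] && [ -x "$dir/bwa" ] || continue
        # Assumption: the usage text on standard error has a "Version:" line.
        version=$("$dir/bwa" 2>&1 | awk '/^Version:/ { print $2; exit }')
        printf '%s\t%s\n' "$dir/bwa" "${version:-unknown}"
    done
    IFS=$old_ifs
}
```

Running list_bwa_versions prints one tab-separated line per copy found. A copy run by relative path, like ./bwa in the earlier posts, is not on PATH and has to be checked separately. Whichever copy built the index is the one to use for aln, samse and sampe.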
#16
Member
Location: Pacific Northwest Join Date: Sep 2010
Posts: 13
I'm having the same problem with samse using bwa version 0.5.7:
[bns_restore_core] fail to open file 'FRA_genome/Fvesca.ann'. Abort!
I'm using Illumina reads, so the length should not be an issue. And the .ann file is actually present. Any ideas?
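Editor's note: when the file exists but bwa still cannot open it, it may help to test readability with exactly the prefix passed to bwa, from the directory where bwa is run, since a relative prefix like FRA_genome/Fvesca is resolved against the current directory. The sketch below uses a hypothetical function name, check_bwa_index. The suffix list (amb, ann, bwt, pac, sa) is my assumption of the core files bwa index writes; some releases write additional files that this sketch does not check.

```sh
# Hypothetical helper: check that the core index files for a prefix
# exist and are readable from the current directory.
check_bwa_index() {
    # $1 = index prefix exactly as passed to bwa, e.g. FRA_genome/Fvesca
    rc=0
    # Assumed core suffixes; other files may exist depending on the version.
    for ext in amb ann bwt pac sa; do
        if [ -r "$1.$ext" ]; then
            echo "ok       $1.$ext"
        else
            echo "missing  $1.$ext (not found or not readable from $(pwd))"
            rc=1
        fi
    done
    return $rc
}
```

For the case in this post the call would be check_bwa_index FRA_genome/Fvesca. A non-zero exit status means at least one file could not be read.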
#17
Junior Member
Location: UK Join Date: Feb 2012
Posts: 5
Hi,
How does BWA handle coordinate-sorted fastq files (obtained from a coordinate-sorted bam file)? Do I need to shuffle the bam files before converting them to fastq?