Thread | Thread Starter | Forum | Replies | Last Post |
Mapping SOLiD colorspace paired end reads | NestorNotabilis | SOLiD | 10 | 12-12-2012 07:14 PM |
how do I output the CS tag for BWA align of SOLID reads? | KevinLam | Bioinformatics | 16 | 07-23-2011 11:06 PM |
BWA mapping colorspace reads | Todd Scheetz | SOLiD | 2 | 08-25-2010 07:16 PM |
sam output from bwa colorspace alignment | Mr Mutundes | Bioinformatics | 0 | 12-15-2009 04:02 AM |
#1
Junior Member
Location: US Join Date: Jun 2009
Posts: 5
Hi all,
I'm using bwa to map SOLiD paired reads to the reference genome. After going through the bwa aln and bwa samse/sampe stages I get output in the SAM format. Is this SAM output in colorspace? If so, are there tools to convert the SAM format to nucleotide space so that I can generate pileups in nucleotide space?

E.g., the output I'm getting for a mapped single-end read is as follows:

./fastq_files/Part_0:6_32_1000 0 chr6 112832228 37 49M = 112832228 0 TNTAGGTAGTGTATTAAATGGCGACAGGACTGGGGGACCCCAGCGCCAA @!9:79,676=*+98:&2(>;5&315+(9:41+8>58-5<18745;0)+ XT:A:U NM:i:3 X0:i:1 X1:i:0 XM:i:3 XO:i:0 XG:i:0 MD:Z:1T35A5T5

Here columns 10 and 11, which report the query sequence and the qualities, are shown in color space.

Any help would be appreciated.
Thanks
N
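For readers unsure what "color space" means in the question above, here is a minimal sketch. It assumes the standard SOLiD two-base code, in which each color is the XOR of two adjacent bases with A=0, C=1, G=2, T=3. The function names are illustrative helpers and are not part of bwa or samtools.

```python
# Illustrative sketch of SOLiD di-base (color-space) coding.
# Assumption: standard two-base code, color = code(prev) XOR code(next),
# with A=0, C=1, G=2, T=3. Function names are hypothetical helpers.

BASES = "ACGT"


def encode_colors(seq):
    """Return the color string for the transitions between adjacent bases."""
    codes = [BASES.index(b) for b in seq]
    return "".join(str(a ^ b) for a, b in zip(codes, codes[1:]))


def decode_colors(first_base, colors):
    """Decode colors to nucleotides, starting from a known first base."""
    prev = BASES.index(first_base)
    decoded = []
    for color in colors:
        prev ^= int(color)  # each color selects the next base given the previous one
        decoded.append(BASES[prev])
    return "".join(decoded)


if __name__ == "__main__":
    print(encode_colors("ATGGC"))      # colors for the 4 base transitions
    print(decode_colors("A", "3103"))  # bases that follow the leading A
    print(decode_colors("A", "3003"))  # one changed color alters every later base
```

Because each decoded base depends on all the colors before it, a single miscalled color changes every base after it when the read is decoded naively. That is the usual reason for aligning such reads in color space and converting to nucleotides afterwards.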
#2
Nils Homer
Location: Boston, MA, USA Join Date: Nov 2008
Posts: 1,285
#3
Junior Member
Location: Chicago Join Date: May 2008
Posts: 9
I have the same question here. Does anyone know the answer?
#4
Senior Member
Location: Sweden Join Date: Mar 2008
Posts: 324
#5
Nils Homer
Location: Boston, MA, USA Join Date: Nov 2008
Posts: 1,285
#6
Junior Member
Location: Chicago Join Date: May 2008
Posts: 9
Does it? I am attaching a screenshot of the alignment (using tview). It just does not make sense to me. And the pileup file I got from the "samtools pileup" command shows that the consensus differs from the reference sequence at almost every position.
#7
Senior Member
Location: Sweden Join Date: Mar 2008
Posts: 324
I can't see the screenshot, but have you checked that you are using the correct FASTQ format and an index in cs-format, and are using aln -c? I think I have made all these mistakes at some point, with strange results...
#8
Junior Member
Location: Chicago Join Date: May 2008
Posts: 9
Thanks, Chipper. I have not been able to attach it for some reason.

Regarding the NGS exercise, I might have done something wrong at some step then. There wasn't any error or warning along the way, so there was no clue. I tried to post the same question on the samtools-help list. I am copying it below to see if it helps you understand my question better. Thanks in advance.

Can someone provide me some pointers regarding SAM format in color space and the correct ways to use samtools for processing such SAM files, especially for SNP and indel calling? I looked everywhere but could not find any documentation.

Specifically, as an exercise, here is what I did:
- Simulated some SOLiD reads using wgsim (-c option) from a reference sequence.
- Generated the bwa index with the following command:
bwa index -c ref.fa -a is
- Aligned the reads (in fastq format) back to the reference sequence using bwa:
bwa aln -c ref.fa r1.fq > r1.sai
bwa samse ref.fa r1.sai r1.fq > r1.sam

Then I ran the usual faidx, import, sort, index, and pileup commands of samtools, and they went smoothly with no errors or warnings. I can view the result with samtools tview.

Nonetheless, the pileup file just does not make sense to me, as the consensus sequence differs from the reference sequence at almost every position. And tview seems to be showing the reads still in color space (double encoded?), which is hard or impossible for me to interpret.
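On the "double encoded?" remark above, a minimal sketch of what double encoding usually refers to. It assumes the common convention of writing colors 0, 1, 2, 3 as A, C, G, T (and a missing call "." as N) so that nucleotide-only tools accept the reads. The function names are illustrative and do not come from bwa or samtools.

```python
# Illustrative sketch of "double encoding" for color-space reads.
# Assumption: colors 0,1,2,3 are written as A,C,G,T and a missing call "."
# as N, a convention commonly used so nucleotide-only tools accept the reads.

_TO_LETTERS = str.maketrans("0123.", "ACGTN")
_TO_COLORS = str.maketrans("ACGTN", "0123.")


def double_encode(colors):
    """Rewrite a color string (digits 0-3, '.') with nucleotide letters."""
    return colors.translate(_TO_LETTERS)


def double_decode(letters):
    """Recover the color string from a double-encoded read."""
    return letters.translate(_TO_COLORS)


if __name__ == "__main__":
    encoded = double_encode("3103.2")
    print(encoded)
    print(double_decode(encoded))
```

A double-encoded read looks like ordinary DNA, but its letters stand for colors, so comparing it letter by letter with a nucleotide reference is not meaningful.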
#9
Nils Homer
Location: Boston, MA, USA Join Date: Nov 2008
Posts: 1,285
Here are the files I have for hg18. Instead of the ref.fa above, I would use ref.cs.fa for both the aln and samse commands!
Code:
[bash$] ls -1
hg18.cs.fa.amb
hg18.cs.fa.ann
hg18.cs.fa.bwt
hg18.cs.fa.nt.amb
hg18.cs.fa.nt.ann
hg18.cs.fa.nt.pac
hg18.cs.fa.pac
hg18.cs.fa.rbwt
hg18.cs.fa.rpac
hg18.cs.fa.rsa
hg18.cs.fa.sa
hg18.fa
#10
Junior Member
Location: Chicago Join Date: May 2008
Posts: 9
Thanks, Nils.
Did you have to do something first to generate the .cs.fa file? I ran the command:
> bwa index -a is -c ref.fa
And I got the following files:
ref.fa.amb ref.fa.bwt ref.fa.nt.ann ref.fa.pac ref.fa.rpac ref.fa.sa
ref.fa.ann ref.fa.nt.amb ref.fa.nt.pac ref.fa.rbwt ref.fa.rsa
And there is no ref.cs.fa to be found anywhere.

Btw, I did manage to compile bfast a couple of hours ago on my MacBook Pro. I might have some questions for you if you don't mind.
#11
Nils Homer
Location: Boston, MA, USA Join Date: Nov 2008
Posts: 1,285
Code:
/share/apps/bwa-0.4.9/bwa index -a bwtsw -p hg18.cs.fa -c hg18.fa
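Putting the advice in this thread together, the three steps can be sketched as follows. This only assembles the command lines and does not run bwa. It assumes the flags exactly as quoted in the posts above (index -a bwtsw -p PREFIX -c, aln -c, samse), which should be checked against the installed bwa version. The function name and its return shape are hypothetical.

```python
# Illustrative sketch: assemble the color-space bwa commands discussed in
# this thread without running them. Assumptions: the flags are exactly those
# quoted above (index -a bwtsw -p PREFIX -c, aln -c, samse); the function
# name and the (argv, stdout_path) return shape are hypothetical.


def colorspace_bwa_commands(ref_fasta, index_prefix, reads_fastq, out_stem):
    """Return (argv, stdout_path) pairs; stdout_path is None when no redirect."""
    sai = out_stem + ".sai"
    sam = out_stem + ".sam"
    return [
        # Build the color-space index under an explicit prefix.
        (["bwa", "index", "-a", "bwtsw", "-p", index_prefix, "-c", ref_fasta], None),
        # Align in color space against that same prefix.
        (["bwa", "aln", "-c", index_prefix, reads_fastq], sai),
        # Convert the alignments to SAM, again using the same prefix.
        (["bwa", "samse", index_prefix, sai, reads_fastq], sam),
    ]


if __name__ == "__main__":
    for argv, stdout_path in colorspace_bwa_commands(
        "hg18.fa", "hg18.cs.fa", "r1.fq", "r1"
    ):
        line = " ".join(argv)
        if stdout_path:
            line += " > " + stdout_path
        print(line)
```

The point of the sketch is that the same index prefix is passed to index, aln, and samse, so the index and alignment steps cannot drift apart.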
#12
Junior Member
Location: Chicago Join Date: May 2008
Posts: 9
The -p option was indeed the reason. You have to specify it, although it seems to be optional (the default is said to be the fasta name). It fixed my problem, although I am still puzzled by the alignment result that I got previously. I wish I could figure out how to attach the file here, so you could see what I meant. Thanks, Nils.
#13
Nils Homer
Location: Boston, MA, USA Join Date: Nov 2008
Posts: 1,285
#14
Member
Location: Lausanne Join Date: Dec 2009
Posts: 19
I've posted somewhere else, more appropriate (in the bioinformatics section) because it's not about solid reads.
Hi, I did the indexing with bwtsw and no -p and I got the following files:
Mouse_genome.fa.amb
Mouse_genome.fa.ann
Mouse_genome.fa.bwt
Mouse_genome.fa.pac
Mouse_genome.fa.rbwt
Mouse_genome.fa.rpac
Mouse_genome.fa.rsa
Mouse_genome.fa.sa
I managed to get the .sai file from the aln command, but now I'm stuck because the samse command gives me the error:
fail to open file '../Mouse_genome.fa.nt.ann'. Abort!
But I never get the .nt.ann file with indexing. I'm confused.
Last edited by ikrier; 01-07-2010 at 05:06 AM.
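A quick way to see which index files a given prefix is missing before running samse, as a minimal sketch. It assumes the expected extensions are the ones in the hg18 listing earlier in this thread (bwa 0.4.9), where the .nt.* files appear only for indexes built with -c; other bwa versions may differ. The function name is a hypothetical helper, not part of bwa.

```python
# Illustrative sketch: report which bwa index files are missing for a prefix.
# Assumption: the expected extensions are taken from the hg18 listing earlier
# in this thread (bwa 0.4.9) and may differ in other versions. The function
# name is a hypothetical helper, not part of bwa.
import os

BASE_EXTS = [".amb", ".ann", ".bwt", ".pac", ".rbwt", ".rpac", ".rsa", ".sa"]
COLOR_EXTS = [".nt.amb", ".nt.ann", ".nt.pac"]  # in this thread, only in -c indexes


def missing_index_files(prefix, colorspace=False):
    """Return the expected index paths for `prefix` that do not exist."""
    exts = BASE_EXTS + (COLOR_EXTS if colorspace else [])
    return [prefix + ext for ext in exts if not os.path.exists(prefix + ext)]


if __name__ == "__main__":
    for path in missing_index_files("../Mouse_genome.fa", colorspace=True):
        print("missing:", path)
```

With the file listing in the post above, a check with colorspace=True would presumably report the three .nt.* files, including the one named in the samse error.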
#15
Nils Homer
Location: Boston, MA, USA Join Date: Nov 2008
Posts: 1,285
#16
Member
Location: Lausanne Join Date: Dec 2009
Posts: 19
bwa samse ../Mouse_genome.fa tags_all.sai tags_all.fastq > tags_all.sam
#17
Nils Homer
Location: Boston, MA, USA Join Date: Nov 2008
Posts: 1,285
#18
Member
Location: Lausanne Join Date: Dec 2009
Posts: 19
bwa aln ../Mouse_genome.fa tags_all.fastq > tags_all.sai
#19
Junior Member
Location: china Join Date: Jan 2010
Posts: 2
What is your version of BWA?
BWA works well for Solexa reads, but some versions have a bug with SOLiD reads: reads on the minus strand come out with a complementary sequence. Both single-end and paired-end reads have this problem. Try again with 4.9; it gives stable results for me.
#20
Member
Location: Lausanne Join Date: Dec 2009
Posts: 19
I'm removing my post here because it's not about Solid reads...