SEQanswers

Old 01-31-2017, 01:12 PM   #1
Alexis1
Junior Member
 
Location: Chile

Join Date: Feb 2014
Posts: 2
SAM file from alignment of SOLiD color-space reads using Bowtie

Hi everyone, I am mapping color-space fasta and QV files (sequenced on ABI SOLiD) from a human exome against the reference provided in the GATK resources (karyotype-sorted). I am using Bowtie for the mapping (both indexing and mapping with the -C flag). My command line was:

Code:
bowtie -p 40 --best --strata -a --mapq 60 --chunkmbs 1000 human_ref38 -f -C my_file.csfasta -Q my_file.qual -S align.sam
My main goal is to do a variant calling analysis using GATK.

The results look good (although I have seen that, in general, the number of mapped reads for color-space data is very low):

Code:
# reads processed: 9540575
# reads with at least one reported alignment: 7585206 (79.50%)
# reads that failed to align: 1955369 (20.50%)
Reported 188945896 alignments to 1 output stream(s)
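The summary above can be sanity-checked with a little arithmetic. The sketch below only re-derives ratios from the four counts Bowtie printed; the function name is made up for illustration. Because the run used -a (report all valid alignments, here restricted to the best stratum by --best --strata), the number of reported alignments is expected to be much larger than the number of reads.

```python
def summarize_bowtie_counts(processed, aligned, failed, alignments):
    """Derive simple ratios from Bowtie's end-of-run summary counts.

    Hypothetical helper: it does not parse Bowtie output, it only does the
    arithmetic on numbers copied from the summary by hand.
    """
    if aligned + failed != processed:
        raise ValueError("aligned + failed should equal reads processed")
    return {
        "pct_aligned": 100.0 * aligned / processed,
        "pct_failed": 100.0 * failed / processed,
        "alignments_per_aligned_read": alignments / aligned,
    }


# Counts copied from the Bowtie summary in this post.
stats = summarize_bowtie_counts(9540575, 7585206, 1955369, 188945896)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

With these counts, each aligned read contributes roughly 25 alignment records on average, so the SAM file should contain far more alignment lines than header lines.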
My question is about the SAM output: it has no alignment section below the @PG header line (which is the end of my SAM file), so the mandatory fields such as QNAME, FLAG, RNAME, etc. are missing. Most importantly for me, it lacks the mapping quality (normally the fifth column).
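One way to check whether a SAM file really has no alignment section is to stream through it and count header lines (starting with "@") versus alignment records (at least 11 tab-separated mandatory fields). This is a minimal sketch, not part of the original post; scan_sam is a hypothetical helper name and the file is assumed to be uncompressed text.

```python
def scan_sam(path):
    """Count header and alignment lines in a SAM file, one line at a time.

    Returns (n_header, n_alignments, first_alignment_line), where the last
    value is the 1-based line number of the first alignment record, or None
    if the file contains no alignment records at all.
    """
    n_header = 0
    n_alignments = 0
    first_alignment_line = None
    with open(path, "r") as handle:
        for line_number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # ignore blank lines
            if line.startswith("@"):
                n_header += 1  # header lines: @HD, @SQ, @RG, @PG, @CO
                continue
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 11:
                raise ValueError(
                    f"line {line_number}: expected >= 11 fields, got {len(fields)}"
                )
            n_alignments += 1
            if first_alignment_line is None:
                first_alignment_line = line_number
    return n_header, n_alignments, first_alignment_line
```

Calling scan_sam("align.sam") on the file from this post would show directly whether any records follow the @PG line, without opening the whole file in an editor.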

This is the beginning and end of my SAM file:

Code:
@HD	VN:1.0	SO:unsorted
@SQ	SN:chr1	LN:248956422
@SQ	SN:chr2	LN:242193529
@SQ	SN:chr3	LN:198295559
@SQ	SN:chr4	LN:190214555
@SQ	SN:chr5	LN:181538259
@SQ	SN:chr6	LN:170805979
@SQ	SN:chr7	LN:159345973
@SQ	SN:chr8	LN:145138636
@SQ	SN:chr9	LN:138394717
@SQ	SN:chr10	LN:133797422
@SQ	SN:chr11	LN:135086622
@SQ	SN:chr12	LN:133275309
@SQ	SN:chr13	LN:114364328
@SQ	SN:chr14	LN:107043718
@SQ	SN:chr15	LN:101991189
@SQ	SN:chr16	LN:90338345
@SQ	SN:chr17	LN:83257441
@SQ	SN:chr18	LN:80373285
@SQ	SN:chr19	LN:58617616
@SQ	SN:chr20	LN:64444167
@SQ	SN:chr21	LN:46709983
@SQ	SN:chr22	LN:50818468
@SQ	SN:chrX	LN:156040895
@SQ	SN:chrY	LN:57227415
@SQ	SN:chrM	LN:16569
@SQ	SN:chr1_KI270706v1_random	LN:175055
End:
Code:
@SQ	SN:HLA-DRB1*13:01:01	LN:13935
@SQ	SN:HLA-DRB1*13:02:01	LN:13941
@SQ	SN:HLA-DRB1*14:05:01	LN:13933
@SQ	SN:HLA-DRB1*14:54:01	LN:13936
@SQ	SN:HLA-DRB1*15:01:01:01	LN:11080
@SQ	SN:HLA-DRB1*15:01:01:02	LN:11571
@SQ	SN:HLA-DRB1*15:01:01:03	LN:11056
@SQ	SN:HLA-DRB1*15:01:01:04	LN:11056
@SQ	SN:HLA-DRB1*15:02:01	LN:10313
@SQ	SN:HLA-DRB1*15:03:01:01	LN:11567
@SQ	SN:HLA-DRB1*15:03:01:02	LN:11569
@SQ	SN:HLA-DRB1*16:02:01	LN:11005
@PG	ID:Bowtie	VN:0.12.5	CL:"/home/alsalas/.linuxbrew/bin/bowtie/bowtie -p 40 --best --strata -a --mapq 60 --chunkmbs 1000 human_ref38 -f -C my_file.csfasta -Q my_file.qual -S align.sam"
So why can't I get these required fields in the SAM file? I even used the option --mapq 60 (normally Bowtie should report MAPQ as 0 or 255), but no mapping quality is reported.

I need the MAPQ values for base quality score recalibration with GATK and for other analyses. At the moment, GATK discards all my mapped reads and reports that all of them (100.00% of total) fail MappingQualityUnavailableFilter.
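The SAM specification reserves MAPQ 255 to mean "mapping quality not available". Assuming the GATK filter named above rejects exactly those records (an assumption, not something confirmed in this thread), the fraction of affected alignments can be estimated by tallying the fifth column. The helper names below are hypothetical.

```python
from collections import Counter

# SAM specification: a MAPQ of 255 means the mapping quality is not available.
MAPQ_UNAVAILABLE = 255


def mapq_counts(sam_lines):
    """Tally MAPQ values (5th column) over an iterable of SAM text lines."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        counts[int(fields[4])] += 1
    return counts


def fraction_unavailable(counts):
    """Fraction of alignment records whose MAPQ equals 255."""
    total = sum(counts.values())
    return counts[MAPQ_UNAVAILABLE] / total if total else 0.0
```

For a real file, something like `with open("align.sam") as fh: counts = mapq_counts(fh)` streams the records; a fraction of 1.0 would be consistent with every read being filtered.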

Previously I converted the color-space fasta and QV files to fastq, but in some cases this is not possible (the conversion script fails with errors), and in other cases the resulting fastq is very unreliable.

Maybe it is not possible to get the alignment fields in the Bowtie SAM file with this kind of data?

Thanks in advance; any suggestion or comment is very welcome.


Alexis.
Old 01-31-2017, 05:00 PM   #2
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

Quote:
The results look good (although I have seen that, in general, the number of mapped reads for color-space data is very low):
No surprise there; the Solid platform had terrible quality, which is why it is now extinct.

Can you describe what you are trying to do, and why you are using Solid data to do it? I highly recommend not using Solid. In most cases, I think it's much more cost effective to sequence on Illumina and throw away Solid data, than analyze Solid data.
Old 02-01-2017, 04:51 AM   #3
Alexis1
Junior Member
 
Location: Chile

Join Date: Feb 2014
Posts: 2

Quote:
Originally Posted by Brian Bushnell View Post
No surprise there; the Solid platform had terrible quality, which is why it is now extinct.

Can you describe what you are trying to do, and why you are using Solid data to do it? I highly recommend not using Solid. In most cases, I think it's much more cost effective to sequence on Illumina and throw away Solid data, than analyze Solid data.
Hi Brian. I agree. Honestly, I think that color-space Solid data is not a very good option. In fact, there is not much software that can deal with it, and the options for converting it to fastq (base space) are not very reliable.

Well, I'm trying to discover variants (SNPs and indels, as VCF) associated with cancer in a human exome. My pipeline for this is:

Code:
Map to reference (Bowtie) --> Sam to Bam - statistics (Samtools - Picard) --> Post-alignment processing ( Remove duplicates - InDel realignment -  Base quality score recalibration) using Picard and GATK --> Variant calling (GATK) --> Annotate variants (ANNOVAR ?)
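The "Sam to Bam - statistics" step of this pipeline could look roughly like the sketch below. It only builds the samtools command lines (view, sort, index, flagstat, using samtools 1.x option syntax) and prints them; the file names and the helper name are hypothetical, and the Picard and GATK steps are deliberately left out because their options depend on the version in use.

```python
def sam_to_sorted_bam_commands(sam_path, prefix):
    """Build samtools commands for SAM -> BAM -> sorted BAM -> index -> stats.

    Returns a list of argument lists; nothing is executed here.
    """
    bam = f"{prefix}.bam"
    sorted_bam = f"{prefix}.sorted.bam"
    return [
        ["samtools", "view", "-b", "-o", bam, sam_path],   # SAM to BAM
        ["samtools", "sort", "-o", sorted_bam, bam],       # coordinate sort
        ["samtools", "index", sorted_bam],                 # write BAM index
        ["samtools", "flagstat", sorted_bam],              # mapping statistics
    ]


# Hypothetical file names, matching the align.sam used earlier in the thread.
for command in sam_to_sorted_bam_commands("align.sam", "align"):
    print(" ".join(command))
```

Each list can be passed to subprocess.run(command, check=True) if samtools is installed and on the PATH.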
The problem is that this requires the mapping quality. But the SAM file obtained after mapping with Bowtie lacks all alignment fields such as MAPQ or QNAME. So I wonder whether using color-space fasta and QV files as input to Bowtie prevents fields like MAPQ from being written (I have no experience with Solid), or whether I'm doing something wrong.

This data was given to me, and I do not have the opportunity to sequence again using another platform like Illumina (if I could, I would do it without thinking).
Thank you for your response, and any suggestions will be welcome.

Alexis

Last edited by Alexis1; 02-01-2017 at 05:38 AM.
Tags
bowtie, color-space, mapping, sam file, solid 5500

All times are GMT -8.