SEQanswers

09-18-2015, 01:50 PM   #1
cmccabe
Senior Member
 
Location: chicago

Join Date: Jul 2012
Posts: 354
Strip part of the SAM header from a BAM file

I am trying to extract the SAM header from a BAM file, keep only the @HD and @SQ lines, and prepend them to a BED file. With the command below I get:

Code:
 samtools view -H IonXpress_009_150603.bam > header.sam | cat header.sam bedfile.bed > new_file.bed
current output:
Code:
@HD	VN:1.4	GO:none	SO:coordinate
@SQ	SN:chr1	LN:249250621
@SQ	SN:chr2	LN:243199373
@SQ	SN:chr3	LN:198022430
@SQ	SN:chr4	LN:191154276
@SQ	SN:chr5	LN:180915260
@SQ	SN:chr6	LN:171115067
@SQ	SN:chr7	LN:159138663
@SQ	SN:chr8	LN:146364022
@SQ	SN:chr9	LN:141213431
@SQ	SN:chr10	LN:135534747
@SQ	SN:chr11	LN:135006516
@SQ	SN:chr12	LN:133851895
@SQ	SN:chr13	LN:115169878
@SQ	SN:chr14	LN:107349540
@SQ	SN:chr15	LN:102531392
@SQ	SN:chr16	LN:90354753
@SQ	SN:chr17	LN:81195210
@SQ	SN:chr18	LN:78077248
@SQ	SN:chr19	LN:59128983
@SQ	SN:chr20	LN:63025520
@SQ	SN:chr21	LN:48129895
@SQ	SN:chr22	LN:51304566
@SQ	SN:chrX	LN:155270560
@SQ	SN:chrY	LN:59373566
@SQ	SN:chrM	LN:16569
@RG	ID:X28LU.IonXpress_008	PL:IONTORRENT	PU:proton/P1.1.17/IonXpress_008	FO:TACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACG	DT:2015-09-02T17:20:14-0700	SM:BC8 pool I	PG:tmap	KS:TCAGTTCCGATAACGAT	CN:TorrentServer/Proton
@RG	ID:X28LU.IonXpress_008.1	PL:IONTORRENT	PU:proton/P1.1.17/IonXpress_008	FO:TACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACG	DT:2015-09-02T16:29:13-0700	SM:BC8 pool I	PG:tmap	KS:TCAGTTCCGATAACGAT	CN:TorrentServer/Proton
@RG	ID:X28LU.IonXpress_008.10	PL:IONTORRENT	PU:proton/P1.1.17/IonXpress_008	FO:TACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACG	DT:2015-09-02T14:03:04-0700	SM:BC8 pool I	PG:tmap	KS:TCAGTTCCGATAACGAT	CN:TorrentServer/Proton
@RG	ID:X28LU.IonXpress_008.11	PL:IONTORRENT	PU:proton/P1.1.17/IonXpress_008	FO:TACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACG	DT:2015-09-02T13:33:05-0700	SM:BC8 pool I	PG:tmap	KS:TCAGTTCCGATAACGAT	CN:TorrentServer/Proton
@RG	ID:X28LU.IonXpress_008.12	PL:IONTORRENT	PU:proton/P1.1.17/IonXpress_008	FO:TACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACGTCTGAGCATCGATCGATGTACAGCTACGTACG	DT:2015-09-02T13:32:53-0700	SM:BC8 pool I	PG:tmap	KS:TCAGTTCCGATAACGAT	CN:TorrentServer/Proton
@RG	ID:X28LU.IonXpress_008.13	PL:IONTORRENT	PU:proton/P1.1.17/IonXpress_008
Desired output:

Code:
@HD	VN:1.4	GO:none	SO:coordinate
@SQ	SN:chr1	LN:249250621
@SQ	SN:chr2	LN:243199373
@SQ	SN:chr3	LN:198022430
@SQ	SN:chr4	LN:191154276
@SQ	SN:chr5	LN:180915260
@SQ	SN:chr6	LN:171115067
@SQ	SN:chr7	LN:159138663
@SQ	SN:chr8	LN:146364022
@SQ	SN:chr9	LN:141213431
@SQ	SN:chr10	LN:135534747
@SQ	SN:chr11	LN:135006516
@SQ	SN:chr12	LN:133851895
@SQ	SN:chr13	LN:115169878
@SQ	SN:chr14	LN:107349540
@SQ	SN:chr15	LN:102531392
@SQ	SN:chr16	LN:90354753
@SQ	SN:chr17	LN:81195210
@SQ	SN:chr18	LN:78077248
@SQ	SN:chr19	LN:59128983
@SQ	SN:chr20	LN:63025520
@SQ	SN:chr21	LN:48129895
@SQ	SN:chr22	LN:51304566
@SQ	SN:chrX	LN:155270560
@SQ	SN:chrY	LN:59373566
@SQ	SN:chrM	LN:16569   (the below lines are the contents of bedfile)
chr1	955542	955763	chr1:955542-955763	+	AGRN:exon.1
chr1	957570	957852	chr1:957570-957852	+	AGRN:exon.2
chr1	976034	976270	chr1:976034-976270	+	AGRN:exon.2;AGRN:exon.3;AGRN:exon.4
Thank you.
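
A note on the command in this post: `> header.sam` sends the samtools output to the file, so nothing travels through the `|`, and the shell starts both sides of a pipeline at the same time, so `cat` may open header.sam before it has been completely written. A minimal sketch of a sequential form follows. It assumes a `print_header` function as a hypothetical stand-in for `samtools view -H IonXpress_009_150603.bam`, so that it runs without samtools or the original BAM file; the header and BED lines are copied from this post.

```shell
#!/bin/sh
# Work in a scratch directory so nothing in the current directory is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical stand-in for `samtools view -H IonXpress_009_150603.bam`.
print_header() {
    printf '@HD\tVN:1.4\tGO:none\tSO:coordinate\n'
    printf '@SQ\tSN:chrM\tLN:16569\n'
}

# One BED record copied from the post.
printf 'chr1\t955542\t955763\tchr1:955542-955763\t+\tAGRN:exon.1\n' > bedfile.bed

# Write the header first; `&&` runs cat only after that step has finished
# successfully, so header.sam is complete by the time it is read.
print_header > header.sam && cat header.sam bedfile.bed > new_file.bed

cat new_file.bed
```

With a real BAM file, `print_header` would be replaced by the `samtools view -H` call from the post.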
09-18-2015, 04:06 PM   #2
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,061
This should do what you want.

Code:
$ samtools view -H IonXpress_009_150603.bam | cat - bedfile.bed > new_file.bed
Edit: This won't strip the @RG lines (though that can be done easily). You should consider Peter's point below.

Last edited by GenoMax; 09-19-2015 at 01:05 AM.
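
For readers following along: in `cat - bedfile.bed`, the `-` tells `cat` to read standard input (here, the piped header) before it reads bedfile.bed. One way the @RG lines could be dropped is a `grep -v` filter in the middle of the pipe. The sketch below is only an illustration; `print_header` is a hypothetical stand-in for `samtools view -H IonXpress_009_150603.bam`, and its @RG line is shortened from the one in the first post.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# Hypothetical stand-in for `samtools view -H IonXpress_009_150603.bam`.
# The @RG line is a shortened version of the one shown in the first post.
print_header() {
    printf '@HD\tVN:1.4\tGO:none\tSO:coordinate\n'
    printf '@SQ\tSN:chr1\tLN:249250621\n'
    printf '@RG\tID:X28LU.IonXpress_008\tPL:IONTORRENT\n'
}

# One BED record copied from the first post.
printf 'chr1\t955542\t955763\tchr1:955542-955763\t+\tAGRN:exon.1\n' > "$workdir/bedfile.bed"

# Drop every header line that starts with @RG, then let `cat -` emit the
# filtered header from the pipe followed by the BED records.
print_header | grep -v '^@RG' | cat - "$workdir/bedfile.bed" > "$workdir/new_file.bed"

cat "$workdir/new_file.bed"
```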
09-19-2015, 12:42 AM   #3
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,543
If you remove the read groups (@RG) from the header, you should also remove them from the reads...

Last edited by maubp; 09-19-2015 at 12:42 AM. Reason: typo
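
To make the point about reads concrete: an alignment record can carry an `RG:Z:` tag that refers back to an `@RG` header line, so removing the header lines alone leaves those references dangling. The sketch below strips the tag from a single SAM text record with `sed`. The record is made up for illustration (only the read-group ID is taken from the first post), and this is not necessarily how one would process a whole BAM file. Recent samtools manuals also describe a `-x` option to `samtools view` for excluding a tag, which is worth checking against your installed version.

```shell
#!/bin/sh
# A literal tab, so the sed expression does not rely on \t support.
tab=$(printf '\t')

# Hypothetical alignment record (11 mandatory fields plus RG and NM tags).
record=$(printf 'read1\t0\tchr1\t955600\t60\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\tRG:Z:X28LU.IonXpress_008\tNM:i:0')

# Delete the RG:Z:... field together with the tab that precedes it.
stripped=$(printf '%s\n' "$record" | sed "s/${tab}RG:Z:[^${tab}]*//")

printf '%s\n' "$stripped"
```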
09-19-2015, 05:01 AM   #4
cmccabe
Senior Member
 
Location: chicago

Join Date: Jul 2012
Posts: 354
So something like (hopefully it's close):

Code:
 samtools view -H IonXpress_009_150603.bam | sed '/^@RG/d'| cat - bedfile.bed > new_file.bed
Thank you.

EDIT: This command is close (probably not the best way), but why isn't the BED file being included in the output as it was before?

Code:
samtools view -H /media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215/IonXpress_008_150902.bam | sed '/^@RG/d' | sed 's/{"flow.*//' | sed 's/--.*//' | sed 's/@PG.*//' | cat - /home/cmccabe/Desktop/NGS/bed/modified_sorted_IDT.bed > /home/cmccabe/Desktop/NGS/bed/new_file.bed
new_file.bed output:
Code:
@HD	VN:1.4	GO:none	SO:coordinate
@SQ	SN:chr1	LN:249250621
@SQ	SN:chr2	LN:243199373
@SQ	SN:chr3	LN:198022430
@SQ	SN:chr4	LN:191154276
@SQ	SN:chr5	LN:180915260
@SQ	SN:chr6	LN:171115067
@SQ	SN:chr7	LN:159138663
@SQ	SN:chr8	LN:146364022
@SQ	SN:chr9	LN:141213431
@SQ	SN:chr10	LN:135534747
@SQ	SN:chr11	LN:135006516
@SQ	SN:chr12	LN:133851895
@SQ	SN:chr13	LN:115169878
@SQ	SN:chr14	LN:107349540
@SQ	SN:chr15	LN:102531392
@SQ	SN:chr16	LN:90354753
@SQ	SN:chr17	LN:81195210
@SQ	SN:chr18	LN:78077248
@SQ	SN:chr19	LN:59128983
@SQ	SN:chr20	LN:63025520
@SQ	SN:chr21	LN:48129895
@SQ	SN:chr22	LN:51304566
@SQ	SN:chrX	LN:155270560
@SQ	SN:chrY	LN:59373566
@SQ	SN:chrM	LN:16569

Last edited by cmccabe; 09-19-2015 at 06:03 AM. Reason: added new command
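
One observation on the sed chain in this post: `sed 's/@PG.*//'` empties a matching line but does not delete it, so a blank line is left behind. A possibly simpler alternative is to keep only the line types that are wanted. The sketch below does that with a single `grep -E`, and first checks that the BED file exists and is non-empty. It does not diagnose why the BED lines are absent from the output shown; `print_header` is a hypothetical stand-in for the `samtools view -H` call, and its @RG and @PG lines are invented for illustration.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical stand-in for `samtools view -H ...IonXpress_008_150902.bam`.
# The @RG and @PG lines are invented for illustration.
print_header() {
    printf '@HD\tVN:1.4\tGO:none\tSO:coordinate\n'
    printf '@SQ\tSN:chrM\tLN:16569\n'
    printf '@RG\tID:X28LU.IonXpress_008\tPL:IONTORRENT\n'
    printf '@PG\tID:tmap\n'
}

# One BED record copied from the first post.
printf 'chr1\t955542\t955763\tchr1:955542-955763\t+\tAGRN:exon.1\n' > modified_sorted_IDT.bed

# Stop early if the BED file is missing or empty.
[ -s modified_sorted_IDT.bed ] || { echo "BED file is missing or empty" >&2; exit 1; }

# Keep only @HD and @SQ header lines, then append the BED records.
print_header | grep -E '^@(HD|SQ)' | cat - modified_sorted_IDT.bed > new_file.bed

cat new_file.bed
```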
All times are GMT -8.