SEQanswers

Old 05-18-2011, 04:10 AM   #21
jyli
Member
 
Location: North Carolina

Join Date: Nov 2008
Posts: 21
Default

Quote:
Originally Posted by lh3 View Post
You may try "samtools merge", using options -r and -h. You write your @RG header lines in a file provided to -h; -r will add an RG:Z: tag to each alignment, based on file names.

EDIT: for an example:

http://sourceforge.net/apps/mediawik...rged_alignment

I posted this on another thread, but

-r STR read group header line such as `@RG\tID:foo\tSM:bar'[null]

gave me error: " malformated @RG line"

Can you please help?
Old 07-01-2011, 10:33 PM   #22
Michael.James.Clark
Senior Member
 
Location: Palo Alto

Join Date: Apr 2009
Posts: 213
Default

Quote:
Originally Posted by jyli View Post
I posted this on another thread, but

-r STR read group header line such as `@RG\tID:foo\tSM:bar'[null]

gave me error: " malformated @RG line"

Can you please help?
It's probably the "\t". I've had trouble with that before.

Probably best to use the Picard tool for it.
__________________
Mendelian Disorder: A blogshare of random useful information for general public consumption. [Blog]
Breakway: A Program to Identify Structural Variations in Genomic Data [Website] [Forum Post]
Projects: U87MG whole genome sequence [Website] [Paper]
Old 07-02-2011, 11:17 PM   #23
Seq84
Member
 
Location: Italy

Join Date: Feb 2011
Posts: 19
Default

Quote:
`@RG\tID:foo\tSM:bar'
Maybe it is the ` quote (a backtick instead of an apostrophe). Try copying and pasting this string:

'@RG\tID:foo\tSM:bar'

Let me know.
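
If the quoting or the "\t" is the culprit, one way to take the guesswork out is to write the @RG line to a file with printf, which turns \t into real tab characters, and check it before giving it to any tool. This is only a sketch: rg.txt is a made-up file name, foo and bar come from the example string above, and whether a particular tool wants real tabs or the two-character \t escape is something to confirm in that tool's help text.

```shell
# Write an @RG header line with real tab characters (printf expands \t).
printf '@RG\tID:%s\tSM:%s\n' foo bar > rg.txt

# Sanity check: first field must be @RG, the rest tab-separated TAG:VALUE pairs.
awk -F'\t' '
  $1 != "@RG" { print "not an @RG line: " $0; bad = 1 }
  { for (i = 2; i <= NF; i++)
      if ($i !~ /^[A-Za-z][A-Za-z0-9]:.+/) { print "bad field: " $i; bad = 1 } }
  END { exit bad }
' rg.txt && echo "rg.txt looks well-formed"
```

A file like this is what lh3's suggestion passes to "samtools merge -h".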
Old 09-04-2012, 08:37 PM   #24
aforntacc
Member
 
Location: italy

Join Date: Jun 2011
Posts: 48
Default

Quote:
Originally Posted by freeseek View Post
@Michael.James.Clark the following two lines of bash code:
Code:
echo -e "@RG\tID:ga\tSM:hs\tLB:ga\tPL:Illumina" > rg.txt
samtools view -h ga.bam | cat rg.txt - | awk '{ if (substr($1,1,1)=="@") print; else printf "%s\tRG:Z:ga\n",$0; }' | samtools view -uS - | samtools rmdup - - | samtools rmdup -s - aln.bam
should add to the bam file the read group information in the same way samtools merge adds the read group information to the two bam files as described by javijevi. The idea is to unpack the bam file, add the read group header, add the read group information to every read, repack the file, and remove duplicates. Again, remove duplicates only if the coverage is not too deep.
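
The header-plus-tag step described above can be tried on plain SAM text without touching a BAM file. The sketch below is not freeseek's exact pipeline: it uses a three-line toy SAM file (in.sam and out.sam are made-up names), keeps existing header lines first, adds the @RG line once after them, and appends RG:Z:ga to every read. It assumes the reads do not already carry an RG tag.

```shell
# Tiny SAM example (tab-separated); real input would come from `samtools view -h`.
printf '@SQ\tSN:chr1\tLN:1000\n'                                  >  in.sam
printf 'r1\t0\tchr1\t10\t60\t4M\t*\t0\t0\tACGT\tIIII\n'           >> in.sam
printf 'r2\t16\tchr1\t20\t60\t4M\t*\t0\t0\tTTGA\tIIII\tNM:i:0\n'  >> in.sam

# Keep existing header lines, emit the new @RG line once, then tag every read.
awk -v id=ga -v sm=hs 'BEGIN { FS = OFS = "\t" }
  /^@/  { print; next }                                # existing header lines
  !done { print "@RG", "ID:" id, "SM:" sm; done = 1 }  # new read group header
        { print $0, "RG:Z:" id }                       # tag appended to each read
  END   { if (!done) print "@RG", "ID:" id, "SM:" sm } # file had no reads at all
' in.sam > out.sam
cat out.sam
```

On real data the awk step would sit between "samtools view -h" and "samtools view -uS -", as in the quoted command.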

Hi, I got the error below; how can I resolve it? I added read groups to the BAM files and merged them with samtools, but when I run the SomaticIndelDetector from GATK it gives me the error below.
Here are the commands I used to add the read group and merge the BAM files:
-rh rgmt.txt - genome_110506_SN13.bam genome_110506_SN132.bam genome_110506_SN132_A.bam > newmut.bam
And here is the GATK command I used for the SomaticIndelDetector:
elendin@elendin-HP-Pavilion-dv6700-Notebook-PC:~/analysis of rnaseq bamfiles$ java -jar GenomeAnalysisTK.jar -R VitisVinifera.fasta -T SomaticIndelDetector -o indels.vcf -verbose indels.txt -I:normal wt.bam -I:tumor newmut.bam
And here is the error:
MESSAGE: SAM/BAM file SAMFileReader{/home/elendin/analysis of rnaseq bamfiles/newmut.bam} is malformed: Read HWI-ST132_0461:3:2201:1211:140854#GTCCTA is either missing the read group or its read group is not defined in the BAM header, both of which are required by the GATK. Please use http://www.broadinstitute.org/gsa/wi...laceReadGroups to fix this problem
##### ERROR ------------------------------------------------------------------------------------------

Please help me.
Thanks a lot.
Old 02-15-2013, 12:08 PM   #25
cfrias
Junior Member
 
Location: barcelona

Join Date: Jan 2011
Posts: 5
Add read groups to bam files using bwa-0.6.2

Hi everybody,


I tried to add read groups to my bam files, without success.

I'm using bwa (bwasw, 454 reads) and I tried the merge command, but it doesn't work.

I have read the other post http://seqanswers.com/forums/showthread.php?t=4180
and http://sites.duke.edu/rainbowblog/20...p-information/

But I still couldn't add the read groups.

Can someone please help me?
Thanks in advance

Cris

Last edited by cfrias; 02-15-2013 at 12:24 PM. Reason: I tried the solution!!!
Old 02-15-2013, 12:19 PM   #26
wjeck
Member
 
Location: Chapel Hill, NC

Join Date: Mar 2009
Posts: 39
Default

I think I switched to using Picard tools AddOrReplaceReadGroups function.

Try looking here:

http://seqanswers.com/forums/showthread.php?t=11887

Haven't had to do this in a while, though, since I started putting read groups in at the beginning, during alignment. I believe BWA now does that with proper use of the alignment option. The best way to solve this problem is to make sure it doesn't happen in the first place.
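
A rough sketch of what setting the read group at alignment time can look like, assuming a BWA release whose "mem" subcommand has the -R option (the bwasw in 0.6.2 may not offer an equivalent; check its usage message). The sample name, library name and file names are all placeholders. The command is only printed, not run.

```shell
# Build the read group string; here \t is kept as a two-character escape,
# the form shown in the BWA usage text for this option.
sample=A1_M1   # hypothetical sample name
rg="@RG\\tID:${sample}.lane1\\tSM:${sample}\\tLB:lib1\\tPL:ILLUMINA"

# Dry run: print the command. Run it by hand once the values are right.
printf '%s\n' "bwa mem -R '$rg' ref.fa reads_1.fq reads_2.fq > aln.sam"
```

With the read group set this way, the aligner writes both the @RG header line and the per-read RG:Z tag, so nothing needs to be patched afterwards.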
Old 02-15-2013, 12:26 PM   #27
cfrias
Junior Member
 
Location: barcelona

Join Date: Jan 2011
Posts: 5
Default

OK, it's Friday night... but I tried the solution

with Brugger's script!!
THANK YOU VERY MUCH!!!
Old 02-15-2013, 12:28 PM   #28
Richard Finney
Senior Member
 
Location: bethesda

Join Date: Feb 2009
Posts: 700
Default

I wonder if the easiest thing would be for software that insists on read groups to instead provide a parameter to ignore them. This read groups thing has turned out to be more of a hassle than a benefit. Software should be more robust.
Old 02-15-2013, 12:45 PM   #29
cfrias
Junior Member
 
Location: barcelona

Join Date: Jan 2011
Posts: 5
Default

Thank you very much, Wjeck and Richard!!!

Yes, Richard, I think the same.
I always read the manuals, but the useful information is in this forum because people
have similar problems.

I only want to add the read groups so I can use GATK to improve the alignments from bwa bwasw.
I need to remove duplicates, and I have a pool of reads.
For this I need the read groups, in order to avoid removing similar reads from different individuals.
(A duplicate read is a read that has the same mapping coordinates and the same CIGAR string,
isn't it?)

Cris
Old 04-20-2013, 05:41 AM   #30
Clare S
Junior Member
 
Location: Melbourne, Australia

Join Date: Jan 2010
Posts: 5
Default

Quote:
Originally Posted by wjeck View Post
I think, though this is not with any authority, that

ID = id name for the readgroup
SM = sample name
LB = label? dunno about this one
PL = platform

These are not currently standardized (I think) but ARE used by the Broad GATK, which means getting them right may be important for your pipeline
LB is library.

For many tools the really important field is ID, which must be unique to the read group. Reads are considered to be from different experimental conditions if they have different read groups. So we usually form our readgroup IDs by concatenating all the relevant information that uniquely identifies the experimental conditions (run/flowcell, lane, etc).
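
As a sketch of that concatenation idea: the snippet below builds an ID from the flowcell and lane found in a read name. It assumes the colon-separated layout instrument:run:flowcell:lane:tile:x:y, which is an assumption about how the reads were named and not something every data set follows; the sample name is a placeholder.

```shell
# Assumed read-name layout: instrument:run:flowcell:lane:tile:x:y
read_name='8LSZMS1:104:C4KJNACXX:1:1203:8209:39039'
sample='A1_M1'   # placeholder sample name

# Flowcell + lane identify the sequencing unit; the sample prefix keeps IDs unique.
rg_id=$(printf '%s\n' "$read_name" | awk -F: -v s="$sample" '{ print s "." $3 "." $4 }')

# Print a matching header line; PU reuses the flowcell.lane part of the ID.
printf '@RG\tID:%s\tSM:%s\tPU:%s\n' "$rg_id" "$sample" "${rg_id#*.}"
```

For this read name the ID comes out as A1_M1.C4KJNACXX.1.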
Old 06-26-2014, 03:20 PM   #31
pliang
Junior Member
 
Location: Canada

Join Date: Aug 2009
Posts: 9
Default

Hello, Wjeck and others on this blog:

Picard AddOrReplaceReadGroups does not seem to work for me. Below is the command I used and the first part of bam file it generated. No @RG was added.

Am I missing anything? Your help is greatly appreciated.


java -Xmx4g -XX:ParallelGCThreads=12 -jar /work/nrap1100/bin/picard-tools-1.78/AddOrReplaceReadGroups.jar I=Sample_RS-01812720/merged.bam O=A1_M1.bam RGID=null RGLB=$d RGPL=Illumina RGSM=A1_M1 RGPU=TAAGGCG

samtools view A1_M1.bam |head
8LSZMS1:104:C4KJNACXX:1:1203:8209:39039 145 chr10 3100000 37 101M chr12 59597178 0 AGAATTCTCACCTGAGAAATACCGAATGGCAGAGAAACACCTGAATAAAATGTTCAACATCCTTAATCATCAGGGAAATGCAAATCAAAACAACACTGAGA EDDEECEEEDFEFFFHHHHHHIJJJJJJJJJJIJIIJIHJJIJJIGIJJIIJIIGIJJIJJJJJJJIJJIJJJJJJJJJJJJJJIJJJHHHHHFEFDF@BB X0:i:1 X1:i:0 MD:Z:101 XG:i:0 AM:i:0 NM:i:0 SM:i:37 XM:i:0 XN:i:1 XO:i:0 XT:A:U
8LSZMS1:104:C4KJNACXX:1:2105:3168:69401 99 chr10 3100097 57 101M = 3100236 240 GAGATTCCACTTCACTCCAGTTAGAATGGCTAAGATCAAAAACTCAGGTGACAACAGATGTTGGCGAGGATGTGGAGAAAGGGGAACACTCCTCCATTGTT CC@FFFFFHHHHHJJJJJJJIIJGIIJJJJJJJJIIIIJJJJJGGIGHBGHIJJJJJJJIIGHIJJJJHFFFDEFEDECCCBDDDDDDDACDDDDDDDDED X0:i:1 X1:i:2 XA:Z:chr4,+72093156,101M,1;chr7,+144726933,101M,1; MD:Z:101 XG:i:0 AM:i:20 NM:i:0 SM:i:20 XM:i:0 XO:i:0 XT:A:U
8LSZMS1:104:C4KJNACXX:1:1212:1749:88430 99 chr10 3100168 29 101M = 3100238 171 GTGGAGAAAGGGGAACACTCCTCCATTGTTGGTGGGATTGCAAGCTTGTACAACCACTCTGGAAATCAGTCTGGCGGTTCCTCAGAAAATTGGACATAGTA CCCFFFFFHGGHHJJIJJJJIJJJJJJJIJJJGGIIGIIJIIGIJJJJGHIJJJJJJJJJJHHHHHFFFFFFEEEBD@BDDDDDDDDDDDDDDDDDDDDEE X0:i:452 MD:Z:101 XG:i:0 AM:i:0 NM:i:0 SM:i:0 XM:i:0 XO:i:0 XT:A:R
Old 06-26-2014, 03:44 PM   #32
blancha
Senior Member
 
Location: Montreal

Join Date: May 2013
Posts: 367
Default

Why in the world would you use 12 threads to add or replace read groups?

-XX:ParallelGCThreads=12

More importantly, you need to specify the parameter -H (or -h) to view the header.

samtools view -H A1_M1.bam
or
samtools view -h A1_M1.bam | more

Last edited by blancha; 06-26-2014 at 04:30 PM.
Old 06-26-2014, 05:38 PM   #33
pliang
Junior Member
 
Location: Canada

Join Date: Aug 2009
Posts: 9
Default

Thanks for your quick reply, Blancha! The use of 12 threads was unintended and really unnecessary, and I know about -H/-h in samtools view, which just displays the header. But I was talking about the actual RG tags added to each alignment line, which were missing from my bam file generated by Picard AddOrReplaceReadGroups. I am looking into the perl script posted by Brugge.
Old 06-26-2014, 06:20 PM   #34
blancha
Senior Member
 
Location: Montreal

Join Date: May 2013
Posts: 367
Default

I'm not an expert on the SAM format, and it's not a particularly interesting subject, to be honest.
Some of the RG tags only appear in the header, I believe.
In my BAM files, only the RGID appears on each alignment line, but you set it to null.
RGLB, RGPL, RGSM and RGPU only appear in the header.

If you have trouble falling asleep, you can read the SAM format specification.
http://samtools.github.io/hts-specs/SAMv1.pdf
Only ID is specified as being used in the RG tags of alignment records.

If I'm wrong, feel free to correct me.
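
To make the header/record split concrete, here is a small sketch on toy SAM text (rg_demo.sam, the read and the values are invented). Each read carries only RG:Z:<ID>; the sample name is recovered by looking that ID up in the @RG header line. rg_lookup is a hypothetical helper name.

```shell
# Print read name, its RG ID, and the SM value found for that ID in the header.
rg_lookup() {
  awk -F'\t' '
    /^@RG/ { id = ""; sm = ""
             for (i = 2; i <= NF; i++) {
               if ($i ~ /^ID:/) id = substr($i, 4)
               if ($i ~ /^SM:/) sm = substr($i, 4)
             }
             sample[id] = sm; next }
    /^@/   { next }
           { rg = "(none)"
             for (i = 12; i <= NF; i++) if ($i ~ /^RG:Z:/) rg = substr($i, 6)
             print $1, rg, (rg in sample ? sample[rg] : "(not in header)") }
  ' "${1:--}"
}

printf '@RG\tID:A1\tSM:A1_M1\tLB:lib1\tPL:Illumina\n'              >  rg_demo.sam
printf 'r1\t0\tchr10\t100\t37\t4M\t*\t0\t0\tACGT\tIIII\tRG:Z:A1\n' >> rg_demo.sam
rg_lookup rg_demo.sam
```

On a real file the same function can read from a pipe: samtools view -h A1_M1.bam | rg_lookup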

Last edited by blancha; 06-26-2014 at 06:28 PM.
Old 02-23-2015, 02:30 AM   #35
kapr0007
Junior Member
 
Location: sweden

Join Date: Oct 2014
Posts: 5
Default

Hello all,

Thank you all for your posts,
First, I would like to tell you all that I am trying to add RG tags and sample IDs to a list of sorted bam files, and I have tried all the scripts mentioned above. I was able to add the RG tags to all my files, but when I later call SNPs with freebayes, my output vcf file has only the header part:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT L6-6

The rest is empty. Could someone help me with this? All I need is that, after I call SNPs, all the genotypes are labeled with the specific sample IDs.

Thank you all
Old 02-23-2015, 02:47 AM   #36
sarvidsson
Senior Member
 
Location: Berlin, Germany

Join Date: Jan 2015
Posts: 137
Default

Did you merge into one bam file (and if so, with which tool) or supply freebayes with several BAM files?

Could you post the output from

Code:
samtools view -H file.bam | grep "^@RG"
for either your merged file (if you have one) or a few of your single sample files and the exact freebayes call you used?
Old 02-23-2015, 03:51 AM   #37
kapr0007
Junior Member
 
Location: sweden

Join Date: Oct 2014
Posts: 5
Default

Thank you for your reply,

I merged into one bam file before I supplied it to freebayes. The above grep on my bam file comes back empty, but when I view it, it has RG tags: RG:Z:f11-12

thanks
Old 02-23-2015, 04:14 AM   #38
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480
Default

What tool did you use to merge the BAM files and what do you get if you perform the grep command that sarvidsson posted on one of the unmerged BAM files? The problem you're running into is due to not having read group information in the BAM header. Most tools will read this to create a dictionary of valid read groups that alignments can have.
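
A quick way to check for that mismatch is to list every read group ID used on alignment lines that has no matching @RG ID in the header. The sketch below works on SAM text; undefined_rgs and check.sam are made-up names and the toy data is only for illustration. Reads with no RG tag at all are not reported by this check.

```shell
# Print "ID<TAB>count" for each RG:Z value that the header does not define.
undefined_rgs() {
  awk -F'\t' '
    /^@RG/ { for (i = 2; i <= NF; i++) if ($i ~ /^ID:/) defined[substr($i, 4)] = 1; next }
    /^@/   { next }
           { for (i = 12; i <= NF; i++) if ($i ~ /^RG:Z:/) used[substr($i, 6)]++ }
    END    { for (id in used) if (!(id in defined)) print id "\t" used[id] }
  ' "${1:--}"
}

# Demo: the header defines FP, but the read is tagged f11-12.
printf '@RG\tID:FP\tSM:F11-12\n'                                     >  check.sam
printf 'r1\t0\tchr1\t10\t60\t4M\t*\t0\t0\tACGT\tIIII\tRG:Z:f11-12\n' >> check.sam
undefined_rgs check.sam
```

On a BAM file: samtools view -h merged.bam | undefined_rgs. Empty output means every tagged read points at a defined read group.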
Old 02-23-2015, 04:14 AM   #39
sarvidsson
Senior Member
 
Location: Berlin, Germany

Join Date: Jan 2015
Posts: 137
Default

Quote:
Originally Posted by kapr0007 View Post
Thank you for your reply,

I merged into one bam file before I supplied it to freebayes. The above grep on my bam file comes back empty, but when I view it, it has RG tags: RG:Z:f11-12

thanks
Then something went wrong in merging those BAMs (what did you use for this step?); your header got truncated. If the individual BAM files have intact headers - with the read groups in there, you can simply feed Freebayes with several BAM files.
Old 02-23-2015, 04:36 AM   #40
kapr0007
Junior Member
 
Location: sweden

Join Date: Oct 2014
Posts: 5
Default

Hi, this is the command I used:

perl -e 'print "@RG\tID:FP\tSM:F11-12\tLB:FP\tPL:Illumina\n@RG\tID:FP\tSM:L6-6\tLB:FP\tPL:Illumina\n"' > rg1.txt
samtools merge -rh rg1.txt merged_f11-l6 l6-6.bam f11-12.bam
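
Two things in this command may explain the empty grep, though neither is confirmed here. First, Perl interpolates @RG inside double quotes as an (empty) array, so the lines written to rg1.txt may not start with "@RG" at all. Second, both header lines use ID:FP, while "-r" tags reads from the file names (RG:Z:f11-12 as reported above), so the tags would not match any header ID. A sketch that avoids both, assuming the tag is the file name without ".bam" and using the file-derived name as a placeholder for SM:

```shell
# One @RG line per input BAM; ID is the file name minus ".bam".
: > rg1.txt
for bam in l6-6.bam f11-12.bam; do
  id=$(basename "$bam" .bam)
  printf '@RG\tID:%s\tSM:%s\tLB:FP\tPL:Illumina\n' "$id" "$id" >> rg1.txt
done
cat rg1.txt

# Then (not run here): samtools merge -rh rg1.txt merged.bam l6-6.bam f11-12.bam
```

After merging, samtools view -H merged.bam | grep "^@RG" should list both IDs if the header file was accepted.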
All times are GMT -8.

