
**dpryan** · 05-21-2013, 12:59 AM

That's odd. You might run the following on your reference FASTA file to see whether this is expected:

Code:

grep ">" reference_genome.fa

If ">15" pops up, then this is normal, though it'd be odd to have that and chr9 in the same FASTA file. Bismark does play around a bit with contig names, but a bug in the code dealing with that should result in different behaviour.

**fkrueger** · 05-21-2013, 01:42 AM

Bismark takes whatever the FASTA files had in the header up to the first whitespace. If you get '15' and 'chr9' in the output, I would assume that these entries looked like '>15' and '>chr9' in the FASTA files you used for the genome indexing process. I think it does replace '|' characters with underscores, but it would certainly not add or remove 'chr'.
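
To check which names would be picked up under that rule, one could list each header's text up to the first whitespace. The following is only a minimal sketch and not part of Bismark; `fasta_names` is a hypothetical helper name, and `reference_genome.fa` below stands in for your own file.

```shell
# Hypothetical helper: print the sequence name of every FASTA record,
# i.e. the header text after ">" and before the first whitespace.
fasta_names() {
    grep '^>' "$1" | sed 's/^>//' | awk '{ print $1 }'
}
```

Running `fasta_names reference_genome.fa | sort | uniq -c` would then show each distinct name and how often it occurs, which should make a mix of '15'-style and 'chr9'-style names easy to spot.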

**serenaliao** · 05-21-2013, 09:18 AM

Thanks fkrueger,

You are right. This is what happened with my FASTA file (some entries are chr<number> and some are just <number>). Is there a convenient way to add "chr" before the chromosome number in the SAM file (third column) when it is missing? Thanks!

**serenaliao** · 05-21-2013, 09:42 AM

Just to follow up, I used awk '{if($3!~/^chr/){$3="chr"$3} print($0)}' filename. Does this sound reasonable?

**fkrueger** · 05-21-2013, 11:51 AM

I am no expert with awk, but it looks OK, and it should be easy enough to test (maybe on a few lines first). Any idea why your FASTA files have mixed chromosome names?
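
One thing to watch for when testing the awk one-liner above: assigning to $3 makes awk rebuild the line with its default output separator (a single space), so the tab-delimited SAM layout is lost unless OFS is set to a tab, and header lines starting with '@' get modified as well. Below is a minimal sketch that tries to avoid both problems. `add_chr_prefix` is a hypothetical helper name, and renaming the @SQ SN: entries and the mate reference in column 7 is an assumption about what downstream tools would expect, so check the result against your own files.

```shell
# Hypothetical helper: add a "chr" prefix to reference names in a SAM file
# while keeping the file tab-delimited.
add_chr_prefix() {
    awk 'BEGIN { FS = OFS = "\t" }
    /^@/ {
        # Header line: only @SQ lines carry a reference name (SN: tag).
        if ($1 == "@SQ") {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^SN:/ && $i !~ /^SN:chr/) { sub(/^SN:/, "SN:chr", $i) }
            }
        }
        print
        next
    }
    # Alignment line: column 3 is RNAME; "*" means unmapped.
    NF >= 3 && $3 != "*" && $3 !~ /^chr/ { $3 = "chr" $3 }
    # Column 7 is RNEXT (mate reference); "=" and "*" are left alone.
    NF >= 7 && $7 != "=" && $7 != "*" && $7 !~ /^chr/ { $7 = "chr" $7 }
    { print }' "$1"
}
```

Usage would look like `add_chr_prefix input.sam > output_chr.sam`, where both file names are placeholders.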

Weird SAM format chromosome column