**ETHANol** · 04-02-2012, 07:11 AM

I would try converting your SAM files to BAM, sorting them, and converting them back to SAM (all with samtools), which should add a header, and then see if you still have the problem.

Not saying it would work, but probably worth a try since it would be very easy.
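
For anyone following along, the round trip described above could be scripted roughly as follows. This is only a sketch: it assumes samtools 1.x command-line syntax (releases from around 2012 took an output prefix for `sort` rather than `-o`), and the file names and the `roundtrip_commands` and `run_roundtrip` helpers are made up for illustration.

```python
import subprocess
from pathlib import Path


def roundtrip_commands(sam_in, workdir="."):
    """Build the three samtools calls: SAM -> BAM, sort, BAM -> SAM.

    Assumes samtools 1.x syntax; output file names are illustrative only.
    """
    work = Path(workdir)
    stem = Path(sam_in).stem
    unsorted_bam = str(work / f"{stem}.unsorted.bam")
    sorted_bam = str(work / f"{stem}.sorted.bam")
    sam_out = str(work / f"{stem}.sorted.sam")
    return [
        # Convert SAM to BAM; samtools reports an error if it cannot parse the SAM.
        ["samtools", "view", "-b", "-o", unsorted_bam, str(sam_in)],
        # Coordinate-sort the BAM.
        ["samtools", "sort", "-o", sorted_bam, unsorted_bam],
        # Convert back to SAM; -h includes the header in the output.
        ["samtools", "view", "-h", "-o", sam_out, sorted_bam],
    ]


def run_roundtrip(sam_in, workdir="."):
    """Run each step in turn, stopping at the first one that fails."""
    for cmd in roundtrip_commands(sam_in, workdir):
        subprocess.run(cmd, check=True)
```

If samtools rejects the input at the first step, its error message is usually the useful part, since it points at what is wrong with the SAM file.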

**nir** · 04-02-2012, 07:22 AM

Thank you so much for the quick reply.

I'm wondering: the SAM files are already lexicographically sorted, so why should re-sorting make a difference? (Is there some other technical issue I'm overlooking?)
I'm not sure this is a header issue, since I tried running both with and without a header (maybe there's something wrong with my header format?).

Thanks again!

**ETHANol** · 04-02-2012, 08:45 AM

1. Because do you really know your file is correctly sorted for sure? That's a lot of lines to read through.
2. Because if samtools has a problem reading your sam file it might give you a better idea of what is wrong with it.
3. Because Tophat told you your file wasn't sorted, maybe it's telling the truth.
4. Because there is a small chance it might solve your problem.
5. When stuff isn't working, I just throw everything at it I can think of.

**nir** · 04-02-2012, 11:06 AM

Thanks again for the quick reply. I really appreciate it.

I went through the file (with a simple script) and verified the chromosome order (it's properly sorted).
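
A check of this kind might look like the sketch below. It is not the script used in this post; `check_sam_order` is a hypothetical helper that only verifies that each reference name forms one contiguous block and that positions never decrease within a block. It says nothing about which order of blocks a downstream tool expects.

```python
def check_sam_order(lines):
    """Return (ok, message) for an iterable of SAM text lines.

    Checks that every reference name forms one contiguous block and that
    positions never decrease within a block. It does NOT check which order
    the blocks themselves come in.
    """
    seen = []       # reference names in order of first appearance
    last_pos = 0
    for n, line in enumerate(lines, start=1):
        if line.startswith("@") or not line.strip():
            continue  # header or blank line
        fields = line.rstrip("\n").split("\t")
        rname, pos = fields[2], int(fields[3])
        if rname == "*":
            continue  # unmapped read, nothing to compare
        if not seen or rname != seen[-1]:
            if rname in seen:
                return False, f"line {n}: {rname} reappears after {seen[-1]}"
            seen.append(rname)
            last_pos = 0
        if pos < last_pos:
            return False, f"line {n}: {rname}:{pos} comes after {rname}:{last_pos}"
        last_pos = pos
    return True, "blocks in order: " + ", ".join(seen)
```

Calling it as `check_sam_order(open("accepted_hits.sam"))` (file name assumed) streams the file line by line, so it works on large files too.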

From the error message one can see that Cufflinks complains because it sees "chrX" after "chrM" (which is a bit weird). I then found that things work just fine if I remove the chrM lines.

I'm not crazy about this workaround, though. Any input?

**Jon_Keats** · 04-02-2012, 11:56 AM

The source of the issue is that you added "chr" to the NCBI reference file. The chr addition is the expected format for UCSC files, which use chrMT not chrM. Ultimately, there is no reason to add chr to the reference file in the majority of situations.

**ETHANol** · 04-02-2012, 12:01 PM

I filter out the chrM reads as well, because HTSeq-count doesn't like it that the GTF file I use (iGenomes) doesn't contain any chrM genes. It seems that chrM messes up more than one RNA-seq workflow. I don't see it as an issue. It's just one line of code that runs pretty quickly to get rid of the chrM reads. Does your GTF file contain any chrM genes? If it doesn't, then it is no loss.
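
As an illustration of the kind of filter meant here (not the actual one-liner, which isn't shown in the thread), a minimal Python sketch follows. `drop_reference` is a hypothetical helper; it keeps all header lines and drops alignment records whose reference name (third SAM column) matches. It does not touch reads whose mate maps to chrM.

```python
def drop_reference(lines, name="chrM"):
    """Yield SAM lines, skipping alignment records mapped to `name`.

    Header lines (starting with '@') are passed through unchanged.
    Reads whose mate maps to `name` are NOT removed by this sketch.
    """
    for line in lines:
        if line.startswith("@"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) > 2 and fields[2] == name:
            continue  # RNAME column matches, drop this record
        yield line


# Example use (file names are placeholders):
# with open("accepted_hits.sam") as src, open("no_chrM.sam", "w") as dst:
#     dst.writelines(drop_reference(src))
```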

Glad to be of absolutely no help and to have suggested you spend some of your time on something that didn't work. Ha ha.

**ETHANol** · 04-02-2012, 12:03 PM

Originally posted by Jon_Keats

The source of the issue is that you added "chr" to the NCBI reference file. The chr addition is the expected format for UCSC files, which use chrMT not chrM. Ultimately, there is no reason to add chr to the reference file in the majority of situations.

Wow, that is just confusing. How hard would it be for NCBI, Ensembl and UCSC to just agree on chromosome names?

**Jon_Keats** · 04-02-2012, 12:10 PM

NCBI and Ensembl do, while UCSC is fixated on messing people up with "chr", "MT versus M", and the whole 1-based versus 0-based annotation issue (at least in my opinion). One good rule of thumb is to stick with exactly one source of annotation and, whatever you do, don't mix and match UCSC stuff with other sources... Sorry, this is just my pet peeve after finding Illumina mis-mapped SNPs back in the day with iGenomes, because someone used UCSC dbSNP entries, which shift the mapping by 1 bp, and after seeing many of these issues since.
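
To make the 1-based versus 0-based point concrete: BED-style intervals, as found in UCSC table downloads, count from 0 and exclude the end coordinate, while GTF/GFF coordinates count from 1 and include it. Mixing the two without converting gives exactly the kind of 1 bp shift described above. A minimal sketch, with hypothetical function names:

```python
def bed_to_one_based(start, end):
    """Convert a 0-based, half-open interval (BED) to 1-based, closed (GTF/GFF)."""
    return start + 1, end


def one_based_to_bed(start, end):
    """Convert a 1-based, closed interval (GTF/GFF) to 0-based, half-open (BED)."""
    return start - 1, end


def bed_length(start, end):
    """Number of bases covered by a 0-based, half-open interval."""
    return end - start


def one_based_length(start, end):
    """Number of bases covered by a 1-based, closed interval."""
    return end - start + 1
```

For a single-base feature such as a SNP, the BED interval `(99, 100)` and the 1-based interval `(100, 100)` describe the same base.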

Ethan, on a separate note, do you track PF rates on your ChIP-seq runs? We have some odd correlations with IP method.

**ETHANol** · 04-02-2012, 12:24 PM

I can say I take a glance but that is about it. Usually, I just look and say that seems reasonable and move on. There's always something more to look at, I suppose.

**vyellapa** · 04-02-2012, 01:22 PM

If you believe sort order is the real issue, running Picard's ReorderSam.jar after SortSam.jar would work.

**kesner** · 06-26-2012, 11:03 PM

cuffdiff problems with chrM and chrX not in sorted order

I made a three-record SAM file with one record each for chr1, chrX, and chrM. When chrM comes before chrX, I get the same sort order problem. Changing chrM to chrMT does not fix the problem.

**kesner** · 06-26-2012, 11:07 PM

Also, if I change chrM to chrZ and place it after chrX, there is no problem.
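
A tiny test file like the one described in these two posts could be generated along the following lines. This is a sketch, not the file actually used: `make_sam` is a hypothetical helper, and the reference length, read names, positions, sequences and qualities are all placeholders.

```python
def make_sam(ref_order, ref_len=1000):
    """Return SAM text with one @SQ line and one alignment per reference.

    Everything except the reference names is placeholder data.
    """
    lines = ["@HD\tVN:1.0\tSO:coordinate"]
    lines += [f"@SQ\tSN:{ref}\tLN:{ref_len}" for ref in ref_order]
    for i, ref in enumerate(ref_order, start=1):
        # Columns: QNAME FLAG RNAME POS MAPQ CIGAR RNEXT PNEXT TLEN SEQ QUAL
        lines.append(
            f"read{i}\t0\t{ref}\t100\t255\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII"
        )
    return "\n".join(lines) + "\n"


# Variants to compare (orders taken from the posts above):
# make_sam(["chr1", "chrM", "chrX"])   # chrM before chrX
# make_sam(["chr1", "chrX", "chrZ"])   # chrM renamed to chrZ, placed last
```

Writing each variant to its own file and running the same command on each makes it easy to see which orderings trigger the error.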

Sort problem (Tophat --> Cufflinks)
