**ronaldrcutler** · 06-16-2016, 09:38 AM

Also, a related question: are the SAM alignments output by hisat2 sorted?

**dpryan** · 06-16-2016, 09:57 AM

htseq-count wants you to "samtools sort -n", not "samtools sort". The difference is the cause of the differing results. You do not need to sort the output of hisat2 before giving it to htseq-count.

Note that since you coordinate-sorted the file and then told htseq-count that it was name-sorted, the results for that run are inaccurate. The file with the smaller number of processed alignments is the correct one.
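For anyone who wants to check which situation a given file is in, here is a minimal Python sketch (standard library only). It tests whether all records sharing a read name sit on adjacent lines, which is the property a name-ordered paired-end file needs. The function names and the toy records are made up for illustration; this is not part of htseq-count or samtools.

```python
from itertools import groupby


def read_names(sam_lines):
    """Yield the QNAME of each alignment record, skipping header lines."""
    for line in sam_lines:
        # Header lines start with '@'; a QNAME cannot contain '@' per the SAM spec.
        if line.startswith("@") or not line.strip():
            continue
        yield line.split("\t", 1)[0]


def is_name_grouped(sam_lines):
    """Return True if all records with the same read name are adjacent."""
    seen = set()
    for name, _ in groupby(read_names(sam_lines)):
        if name in seen:
            # The same read name shows up again after other reads intervened.
            return False
        seen.add(name)
    return True


# Toy records (hypothetical, truncated to four columns for brevity).
grouped = [
    "@HD\tVN:1.0\tSO:unsorted",
    "r1\t99\tchr1\t100",
    "r1\t147\tchr1\t300",
    "r2\t99\tchr1\t200",
    "r2\t147\tchr1\t400",
]
coordinate_sorted = [
    "r1\t99\tchr1\t100",
    "r2\t99\tchr1\t200",
    "r1\t147\tchr1\t300",
    "r2\t147\tchr1\t400",
]

print(is_name_grouped(grouped))
print(is_name_grouped(coordinate_sorted))
```

On a real file you could pass an open file handle instead of a list, e.g. `is_name_grouped(open("aln.sam"))`, where `aln.sam` is a placeholder path.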

**ronaldrcutler** · 06-16-2016, 10:01 AM

Thanks for the clarification, this will save a lot of time!

**ronaldrcutler** · 06-28-2016, 09:51 AM

Originally posted by dpryan

You do not need to sort the output of hisat2 before giving it to htseq-count.

When examining the header of some SAM files output by hisat2, I noticed that it contains this line:

Code:

@HD	VN:1.0	SO:unsorted

I know you said hisat2 outputs sorted SAM files, so what does this mean?

**GenoMax** · 06-28-2016, 05:28 PM

Originally posted by ronaldrcutler

When examining the header of some SAM files output by hisat2, I noticed that it contains this line:

Code:

@HD	VN:1.0	SO:unsorted

I know you said hisat2 outputs sorted SAM files, so what does this mean?

You could use featureCounts instead. It is much faster and will sort the BAM/SAM files if needed.

Looks like HISAT2's output is unsorted.
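To read the declared sort order without eyeballing the header, a small Python sketch like the following could be used. It only parses the `SO:` tag on the `@HD` line; per the SAM specification the valid values are `unknown`, `unsorted`, `queryname` and `coordinate`. The function name is hypothetical, and the tag only reports what the producing tool declared, not how the records are actually laid out.

```python
def declared_sort_order(sam_lines):
    """Return the SO value from the @HD header line, or None if absent."""
    for line in sam_lines:
        if not line.startswith("@"):
            # Header lines come first; stop at the first alignment record.
            break
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "@HD":
            for field in fields[1:]:
                if field.startswith("SO:"):
                    return field[3:]
            return None
    return None


# Header line taken from the post above, plus a made-up @SQ line.
header = [
    "@HD\tVN:1.0\tSO:unsorted\n",
    "@SQ\tSN:chr1\tLN:1000\n",
]
print(declared_sort_order(header))
```

`SO:unsorted` means no global ordering is claimed for the file. It does not by itself say whether mates are on adjacent lines.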

**ronaldrcutler** · 07-01-2016, 09:08 AM

To follow up: sorting the sam files removed this error that I had in all of them:

Code:

Warning: Malformed SAM line: MRNM != '*' although flag bit &0x0008 set
Warning: Malformed SAM line: RNAME != '*' although flag bit &0x0004 set
Warning: Malformed SAM line: MRNM == '=' although read is not aligned.

It did not remove this one, which appeared in a similar form in all of them (I just ignored it):

Code:

Warning: Read ACB052:253:C76YKACXX:2:1101:2245:1957 claims to have an aligned mate which could not be found in an adjacent line.

When comparing the sorted and unsorted files using the 'diff' command, there were no differences!
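A plain `diff` compares whole lines in order, headers included. If it helps to separate "same records" from "same order", here is a rough Python sketch that ignores header lines and reports both. The function names and the example records are invented for illustration, and it loads everything into memory, so it is only suitable for small files.

```python
from collections import Counter


def alignment_records(sam_lines):
    """Return alignment lines only, without headers or trailing newlines."""
    return [
        line.rstrip("\n")
        for line in sam_lines
        if line.strip() and not line.startswith("@")
    ]


def compare_sam(a_lines, b_lines):
    """Report whether two SAM inputs hold the same records and the same order."""
    a = alignment_records(a_lines)
    b = alignment_records(b_lines)
    return {
        "same_order": a == b,
        "same_records": Counter(a) == Counter(b),
    }


# Hypothetical toy inputs: identical records, different order and header.
first = ["@HD\tVN:1.0\tSO:unsorted", "r1\t99\tchr1\t100", "r2\t99\tchr1\t50"]
second = ["@HD\tVN:1.0\tSO:coordinate", "r2\t99\tchr1\t50", "r1\t99\tchr1\t100"]

print(compare_sam(first, first))
print(compare_sam(first, second))
```

If both values come back true for a file and its sorted copy, the sort did not reorder anything, which would be consistent with `diff` showing no differences.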


Sorting SAM files
