Hi,
I have a SAM file (from bwa) of paired-end RNA-Seq short reads that were aligned to a CDS FASTA file. I need to run htseq-count (HTSeq) on this SAM file, so I need the corresponding GTF file.
I thought it would not be too hard to generate a basic GTF file from the original RNA FASTA file, but HTSeq does not recognize the ID of any RNA sequence in the SAM file:
Warning: Skipping read 'XXX0654:58235#TGACCA', because chromosome 'gi|155030243|ref|NM_017599.3|', to which it has been aligned, did not appear in the GFF file.
However, this sequence ID is present in the GTF file:
gi|155030243|ref|NM_017599.3| ref CDS 1 4580 . + . gene_id "VEZT"; transcript_id "NM_017599.3";
and in the SAM header too:
@SQ SN:gi|155030243|ref|NM_017599.3| LN:4580
Where is the mismatch?
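One way to narrow this down is to compare the reference names in the SAM header with the sequence names in the GTF, both overall and per feature type. The sketch below is only a diagnostic under stated assumptions: the helper names `sam_reference_names` and `gtf_seqnames` are hypothetical, the inline sample lines stand in for the real files, and the GTF is assumed to be tab-delimited. It is relevant because htseq-count only loads features of the type given by `-t`/`--type`, which defaults to `exon`, so a GTF that contains only `CDS` rows could plausibly produce this warning even though the IDs match.

```python
def sam_reference_names(sam_lines):
    """Collect the SN: values from the @SQ header lines of a SAM file."""
    names = set()
    for line in sam_lines:
        if not line.startswith("@"):
            break  # the header ends at the first alignment record
        if line.startswith("@SQ"):
            for field in line.rstrip("\n").split("\t")[1:]:
                if field.startswith("SN:"):
                    names.add(field[3:])
    return names


def gtf_seqnames(gtf_lines, feature_type=None):
    """Collect column 1 of the GTF rows, optionally only for one feature type (column 3)."""
    names = set()
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            # Fewer than 9 tab-separated columns: the row is not valid GTF
            raise ValueError("GTF row is not tab-delimited: %r" % line)
        if feature_type is None or cols[2] == feature_type:
            names.add(cols[0])
    return names


# Sample lines copied from the post, with tabs assumed between columns.
sam_header = ["@SQ\tSN:gi|155030243|ref|NM_017599.3|\tLN:4580\n"]
gtf = [
    "gi|155030243|ref|NM_017599.3|\tref\tCDS\t1\t4580\t.\t+\t.\t"
    'gene_id "VEZT"; transcript_id "NM_017599.3";\n'
]

refs = sam_reference_names(sam_header)
missing_any = refs - gtf_seqnames(gtf)           # names absent from the GTF entirely
missing_exon = refs - gtf_seqnames(gtf, "exon")  # names with no 'exon' rows
missing_cds = refs - gtf_seqnames(gtf, "CDS")    # names with no 'CDS' rows

print("absent from GTF:", sorted(missing_any))
print("no exon rows:   ", sorted(missing_exon))
print("no CDS rows:    ", sorted(missing_cds))
```

If the first and last sets are empty while the `exon` set is not, then running htseq-count with `-t CDS`, or writing the GTF rows with `exon` as the feature type, would be the first thing to try. If the helper raises a `ValueError` on the real file, the GTF columns are probably separated by spaces rather than tabs.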
Many thanks for your help.
Emmanuel.