Hello there,
I have recently begun analysing some RNA-seq data from Nicotiana benthamiana under treatments with various pathogen types. The dataset was acquired as part of a collaboration that my supervisor and I threw together in a hurry, so that we would have something to work on during the lockdown. As a result, I am missing some basic information about the dataset, including its strandedness.
Having aligned the paired-end Illumina FASTQ files with HISAT2, sorted the resulting .sam files by name with samtools, and converted them to .bam files, I tried to determine whether the libraries are strand-specific using the infer_experiment.py script from the RSeQC package. However, it gave me an unusual output, which is reproduced under "Code:" further down in this post.
Initially, I thought this might be due to a problem with the .bed file I used, since the N. benthamiana annotation is not thorough and the .bed file was converted from a .gff file. However, the other RSeQC scripts that require a .bed file worked fine with the one I provided.
Looking at the data in IGV (see below), it seems to me that the data are not strand-specific, and I'm happy to continue on this basis (unless someone here knows better and can tell me why I'm mistaken). But if anyone with experience of the RSeQC package can tell me why I might be getting this output from infer_experiment.py, I'd be very grateful.
Code:
This is PairEnd Data
Fraction of reads failed to determine: 1.0000
Fraction of reads explained by "1++,1--,2+-,2-+": 0.0000
Fraction of reads explained by "1+-,1-+,2++,2--": 0.0000
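For anyone wanting to read this report programmatically, here is a minimal Python sketch that pulls out the three fractions and turns them into a rough call. The function names (parse_infer_experiment, classify_strandedness) and the 0.5 and 0.8 cut-offs are my own assumptions for illustration; they are not part of RSeQC. It only interprets the numbers and does not explain why every read failed to be determined.

```python
import re

# Patterns keyed on the exact labels in the report above.
# "read1_sense" = read 1 on the same strand as the gene ("1++,1--,2+-,2-+");
# "read1_antisense" = the opposite layout ("1+-,1-+,2++,2--").
PATTERNS = {
    "failed": r"failed to determine:\s*([0-9.]+)",
    "read1_sense": r'"1\+\+,1--,2\+-,2-\+":\s*([0-9.]+)',
    "read1_antisense": r'"1\+-,1-\+,2\+\+,2--":\s*([0-9.]+)',
}


def parse_infer_experiment(report):
    """Return the three fractions of a paired-end report as floats."""
    fractions = {}
    for key, pattern in PATTERNS.items():
        match = re.search(pattern, report)
        if match is None:
            raise ValueError(f"could not find the '{key}' fraction in the report")
        fractions[key] = float(match.group(1))
    return fractions


def classify_strandedness(fractions, min_informative=0.5, dominant=0.8):
    """Rough call from the fractions; both thresholds are assumed values."""
    informative = fractions["read1_sense"] + fractions["read1_antisense"]
    if informative < min_informative:
        # Too few reads could be assigned to a gene strand to say anything.
        return "undetermined"
    share = fractions["read1_sense"] / informative
    if share >= dominant:
        return "stranded (read 1 sense)"
    if share <= 1 - dominant:
        return "stranded (read 1 antisense)"
    return "unstranded"


report = (
    "This is PairEnd Data\n"
    "Fraction of reads failed to determine: 1.0000\n"
    'Fraction of reads explained by "1++,1--,2+-,2-+": 0.0000\n'
    'Fraction of reads explained by "1+-,1-+,2++,2--": 0.0000\n'
)
print(classify_strandedness(parse_infer_experiment(report)))  # prints "undetermined"
```

With my numbers the call is "undetermined" rather than "unstranded": for unstranded data I would have expected the two "explained by" fractions to be roughly equal, not both zero.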