Determining strandedness of .bam file (RSeQC)

cartzo

Junior Member

Join Date: Apr 2020

Posts: 5
#1

04-22-2020, 06:40 AM

Hello There,

I have recently begun analysing RNA-seq data from Nicotiana benthamiana under various pathogen treatments. The dataset was acquired as part of a collaboration that my supervisor and I put together in a hurry, so that we would have something to work on during the lockdown. As a result, I am missing some basic information about the dataset, including its strandedness.

Having aligned the paired-end Illumina FASTQ files with HISAT2, sorted the resulting .sam files by name with samtools, and converted them to .bam files, I tried to determine whether they were strand-specific using the infer_experiment.py script from the RSeQC package. However, it gave me an unusual output:

Code:

This is PairEnd Data
Fraction of reads failed to determine: 1.0000
Fraction of reads explained by "1++,1--,2+-,2-+": 0.0000
Fraction of reads explained by "1+-,1-+,2++,2--": 0.0000
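For readers who want to check such a report programmatically, here is a minimal Python sketch that parses the three fractions and gives a rough verdict. The function names and the two cutoffs are hypothetical choices made for illustration and are not part of RSeQC; the only input assumed is the report text in the layout shown above.

```python
import re

# Matches the "Fraction of reads ..." lines of an infer_experiment.py report.
LINE_RE = re.compile(
    r'\s*Fraction of reads (failed to determine|explained by "([^"]+)"):\s*([0-9.]+)'
)

def parse_infer_experiment(text):
    """Return a dict mapping each category to its reported fraction."""
    fractions = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            key = m.group(2) or "undetermined"
            fractions[key] = float(m.group(3))
    return fractions

def classify(fractions, undetermined_cutoff=0.5, strand_cutoff=0.8):
    """Rough verdict; both cutoffs are illustrative assumptions, not RSeQC defaults."""
    # "1++,..." (paired-end) or "++,..." (single-end): read 1 on the gene's strand.
    same = sum(v for k, v in fractions.items() if k.startswith(("1++", "++")))
    # "1+-,..." (paired-end) or "+-,..." (single-end): read 1 on the opposite strand.
    opposite = sum(v for k, v in fractions.items() if k.startswith(("1+-", "+-")))
    total = same + opposite
    if fractions.get("undetermined", 0.0) >= undetermined_cutoff or total == 0:
        return "inconclusive"
    if same / total >= strand_cutoff:
        return "stranded (read 1 on gene strand)"
    if opposite / total >= strand_cutoff:
        return "stranded (read 1 opposite gene strand)"
    return "unstranded"

report = '''This is PairEnd Data
Fraction of reads failed to determine: 1.0000
Fraction of reads explained by "1++,1--,2+-,2-+": 0.0000
Fraction of reads explained by "1+-,1-+,2++,2--": 0.0000'''

print(classify(parse_infer_experiment(report)))  # -> inconclusive
```

With every read in the "failed to determine" category, the report says nothing about strandedness either way, which suggests looking at the inputs (the BAM file or the annotation) before drawing a conclusion about the library.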

Initially, I thought this might be due to a problem with the .bed file I used: the N. benthamiana annotation is not thorough, and the .bed file was converted from a .gff file. However, other scripts in the RSeQC package that require a .bed file worked fine with the one I provided.

Looking at the data in IGV (see below), it seems to me that the data is not strand-specific, and I'm happy to continue on this basis (unless someone here knows better and can tell me why I'm mistaken). But if anyone with experience of the RSeQC package can tell me why I might be getting this output from infer_experiment.py, I'd be very grateful.
Tags: rnaseq analysis, rnaseq data, rseqc, strand specific
cartzo

#2

04-27-2020, 07:50 AM

For anyone whom it may help, I have found a solution to my problem.

I was mistaken in believing that RSeQC works entirely with a BED file converted from GFF3 using the gff2bed tool of the BEDOPS package. This tool converts a GFF file to a six- or nine-column BED file. However, some tools in the RSeQC package require a comprehensive 12-column BED file.

A Python script for converting a GFF3 file to a 12-column BED file, written by Vipin T Sreedharan of the Memorial Sloan Kettering Cancer Center, can be found here: https://github.com/vipints/converter...d_converter.py
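
To check which kind of BED file you have before running RSeQC, a quick column count is enough. The sketch below is only an illustration: the function names and the example file name are hypothetical, and it assumes a tab-separated BED file with optional "#", "track" or "browser" header lines.

```python
def bed_column_counts(path):
    """Count how many data lines have each number of tab-separated columns."""
    counts = {}
    with open(path) as fh:
        for line in fh:
            line = line.rstrip("\n")
            # Skip blank lines and header/comment lines.
            if not line or line.startswith(("#", "track", "browser")):
                continue
            n = len(line.split("\t"))
            counts[n] = counts.get(n, 0) + 1
    return counts

def looks_like_bed12(path):
    """True only if the file has data lines and every one has exactly 12 columns."""
    counts = bed_column_counts(path)
    return bool(counts) and set(counts) == {12}

# Example call (hypothetical file name):
# print(bed_column_counts("annotation.bed"), looks_like_bed12("annotation.bed"))
```

A file produced by a six-column conversion would report only the key 6 and fail the check, while a full 12-column file would report only the key 12.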