A Basic Guide to RNA-sequencing

Novogene

Registered Vendor

07-25-2024, 12:35 PM

Next-generation sequencing (NGS) is at the forefront of a rapidly advancing field. NGS technology supports research in a wide range of application areas, including clinical and biopharma work, human genetics and genomics, agriculture, forensics, water testing, and complex and infectious disease research. RNA sequencing (RNA-seq) is among the most mature NGS applications. Here is what you need to know to begin using this tool.

Choosing the right RNA-seq approach

Choosing the right RNA-seq approach depends on your research question and goals, so you first need to understand what type of RNA you are studying. Eukaryotic total RNA contains various types of RNA molecules, such as:
ribosomal RNA (rRNA): 80-90% of total eukaryotic RNA is rRNA

transfer RNA (tRNA)

messenger RNA (mRNA): only 3-7% of total eukaryotic RNA is protein-coding mRNA

non-coding RNA (ncRNA), such as:
long non-coding RNA (lncRNA)

circular RNA (circRNA)

micro RNA (miRNA)

So how do you know which type of RNA to study? Start from your research objectives. Some projects are interested only in comparing transcriptomic profiles across samples in different experimental conditions (e.g., Treatment vs. Control), or between different tissue types to identify tissue-specific gene expression. Other research groups might be more interested in tracking how cellular transcriptomic profiles shift over time to determine temporal gene expression patterns.

When it comes to transcript profiling, there are three major questions you may wish to answer. The first is annotation, a step where you assign functions to your RNA molecules. Keep in mind that in eukaryotic organisms a single gene can give rise to several mRNA isoforms through alternative splicing, so you might want to sequence the different splice variants as well. The second is quantification of target RNA molecules, which is crucial for differential gene expression analysis, where you aim to determine which genes are upregulated or downregulated across your samples. The third is target prediction and network analysis, which is particularly important when working with non-coding RNA molecules: here you investigate potential interactions between RNA molecules and identify the likely targets of these ncRNAs.

With this in mind, you might ask which sequencing technologies are available. As a sequencing service provider, Novogene operates advanced sequencing platforms to cater to diverse sequencing needs. One of these is the NovaSeq 6000/X Plus short-read platform, among the most commonly used NGS sequencers, which can sequence all types of RNA molecules (up to 500 bp). We also use the long-read Nanopore and PacBio platforms, capable of sequencing molecules up to 10 kb in length. However, these platforms can only be used to sequence eukaryotic mRNA molecules containing a poly(A) tail.

Service Workflow

So how can Novogene help you conduct your research? We have established an end-to-end service workflow that provides customers with high-quality data, publication-ready analysis, and personalized assistance from our highly skilled team members.
For the RNA-seq protocol in particular, our services include:
Total RNA extraction
Sample quality check (QC)

Library construction
Library QC

Sequencing
Sequence QC

Bioinformatic analysis

Let’s start from the beginning! How should you prepare your RNA samples? One method we recommend at Novogene for extracting total RNA is the TRIzol method, which maintains RNA integrity during tissue homogenization while disrupting cells and breaking down cell components. Once RNA extraction is complete, you are ready to send us your samples! We recommend shipping them on dry ice. When planning the shipment, consider how much dry ice you will need, as roughly 5 kg of dry ice is normally consumed per day.
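The dry ice guideline above lends itself to a quick calculation. Here is a minimal sketch in Python, assuming the 5 kg per day figure from this post; the function name and the extra buffer day are illustrative assumptions, not a Novogene requirement.

```python
import math


def dry_ice_needed_kg(transit_days, kg_per_day=5.0, safety_days=1.0):
    """Estimate how much dry ice (kg) to pack for a shipment.

    kg_per_day defaults to the ~5 kg/day figure quoted in this post.
    safety_days is a hypothetical buffer for delays (an assumption).
    """
    if transit_days < 0 or safety_days < 0:
        raise ValueError("transit_days and safety_days must be non-negative")
    # Round up to a whole kilogram so the estimate never falls short.
    return math.ceil((transit_days + safety_days) * kg_per_day)


# Example: 3 days in transit plus the default 1-day buffer.
print(dry_ice_needed_kg(3))
```

Actual consumption depends on the container and ambient temperature, so treat the result as a lower bound and check with your courier.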

Upon receipt of the samples, we perform the sample quality check (QC) using (1) 1% agarose gel electrophoresis, (2) a Nanodrop reading to check RNA amount and purity, and (3) the Agilent 2100 to determine the RNA Integrity Number (RIN). At Novogene, we use all three methods to check the quality of the samples you send us, so that only high-quality RNA moves on to library construction.

Once we confirm that the extracted total RNA is of good quality, we move on to library construction. The poly(A) enrichment library preparation procedure is the one most commonly used for mRNA-seq; it produces a cDNA library with 250-300 bp inserts that is ready for sequencing. If you wish to analyze additional RNA molecules and not just mRNAs, the whole transcriptome sequencing protocol might be the best option for you. In that case, additional library preparation steps are needed: (1) rRNA removal and a directional RNA library (to retrieve lncRNA, miRNA and circRNA) and (2) small RNA library preparation, which is crucial for obtaining small RNA molecules. Retrieving all of these non-coding RNAs together with mRNAs is useful if you want to analyze interactions and regulatory networks between RNA molecules, and the effect of these processes on the (molecular) phenotype.

Once the libraries are prepared, you can move forward with the actual sequencing step. Two sequencing strategies are available: (1) single-end (SE) sequencing and (2) paired-end (PE) sequencing. The choice of strategy again depends on your research question and the type of RNA molecules you work with. We recommend:
PE150 for mRNA, lncRNA and circRNA-seq datasets
SE50 for sRNA-seq
PE150 and SE50 for whole transcriptome sequencing
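If you track many projects, the recommendations above can be kept as a small lookup table. The sketch below is only an illustration in Python; the dictionary and function names are hypothetical, and the values simply restate the list in this post.

```python
# Recommended read strategies, restated from the list in this post.
# Keys are lower-case so lookups can be made case-insensitive.
RECOMMENDED_STRATEGY = {
    "mrna": ["PE150"],
    "lncrna": ["PE150"],
    "circrna": ["PE150"],
    "srna": ["SE50"],
    "whole transcriptome": ["PE150", "SE50"],
}


def recommended_strategy(library_type):
    """Return the recommended strategies for a library type (hypothetical helper)."""
    key = library_type.strip().lower()
    if key not in RECOMMENDED_STRATEGY:
        raise KeyError(f"No recommendation listed for: {library_type!r}")
    # Return a copy so callers cannot modify the table by accident.
    return list(RECOMMENDED_STRATEGY[key])


print(recommended_strategy("mRNA"))
print(recommended_strategy("Whole Transcriptome"))
```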

Setting up your analysis & Results overview

Once you receive your RNA-seq data from us, you can build your own bioinformatics pipeline, or we can analyze the data for you and provide publication-ready, high-resolution visualizations! If you want us to run the analysis for you, you only need to (1) name the samples and groups in your study, (2) choose an appropriate reference genome, and (3) design your comparisons.

Here is a general overview of the protocol used to analyze mRNA-seq data. The first step is (1) the data quality check, where low-confidence base calls are identified and removed. You can then (2) map your data to a reference genome. A suitable genome can be chosen either from the Novogene database, which covers a wide variety of organisms, or from publicly available genomic repositories such as NCBI, UCSC and Ensembl. For this step, we provide a table of read statistics with the percentage of reads that mapped to the reference genome (Table 1), a pie chart showing the percentage of reads that mapped to exonic, intronic or intergenic regions (Figure 2), and a graph showing the distribution of mapped reads across chromosomes (Figure 3).
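To make the mapping statistics concrete, here is a minimal sketch in Python of how the mapping rate and the per-region percentages could be computed from read counts. The function name and all of the numbers are made up for illustration; they are not Novogene results, and real pipelines derive these counts from the aligner output.

```python
def mapping_summary(total_reads, mapped_reads, region_counts):
    """Return (mapping rate in %, per-region share of mapped reads in %)."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= mapped_reads <= total_reads:
        raise ValueError("mapped_reads must be between 0 and total_reads")
    mapping_rate = 100.0 * mapped_reads / total_reads
    region_total = sum(region_counts.values())
    # Share of each region among the reads that were assigned to a region.
    region_pct = {
        region: (100.0 * count / region_total if region_total else 0.0)
        for region, count in region_counts.items()
    }
    return mapping_rate, region_pct


# Illustrative counts only (hypothetical sample).
rate, regions = mapping_summary(
    total_reads=1_000_000,
    mapped_reads=950_000,
    region_counts={"exon": 760_000, "intron": 142_500, "intergenic": 47_500},
)
print(f"Mapping rate: {rate:.2f}%")
for region, pct in regions.items():
    print(f"{region}: {pct:.1f}%")
```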

The next step is (3) differential gene expression analysis, where we visualize the distribution of differentially expressed genes across your samples using high-resolution, publication-ready volcano plots and heatmaps. To assign functions to your genes, you can then perform (4) functional analysis by comparing the genes against publicly available annotation databases, such as the Gene Ontology database (GO enrichment analysis) and the Kyoto Encyclopedia of Genes and Genomes database (KEGG enrichment analysis), or by running a Gene Set Enrichment Analysis (GSEA) and protein-protein interaction studies. Lastly, (5) structure analysis can be performed to identify single nucleotide polymorphisms (SNPs), investigate differences in the distribution of insertions and deletions across samples (InDel analysis), or analyze splice isoforms (alternative splicing analysis).
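As a rough illustration of what a volcano plot encodes, the Python sketch below labels a gene as up- or downregulated from its log2 fold change and adjusted p-value. The cutoffs (|log2FC| >= 1, adjusted p < 0.05), the pseudocount and the function name are assumptions chosen for the example, not the thresholds used in Novogene's pipeline, and the adjusted p-value is assumed to come from a dedicated statistical package.

```python
import math


def classify_gene(mean_treatment, mean_control, padj,
                  lfc_cutoff=1.0, padj_cutoff=0.05, pseudocount=1.0):
    """Return (log2 fold change, label) for one gene.

    mean_treatment / mean_control: normalized mean expression per group.
    padj: adjusted p-value supplied by an external statistical test.
    The cutoffs and pseudocount are illustrative assumptions.
    """
    # The pseudocount avoids division by zero for unexpressed genes.
    log2fc = math.log2((mean_treatment + pseudocount) / (mean_control + pseudocount))
    if padj < padj_cutoff and log2fc >= lfc_cutoff:
        label = "up"
    elif padj < padj_cutoff and log2fc <= -lfc_cutoff:
        label = "down"
    else:
        label = "not significant"
    return log2fc, label


# Hypothetical gene: about four-fold higher in Treatment than in Control.
print(classify_gene(mean_treatment=399, mean_control=99, padj=0.001))
```

On a volcano plot, the first value is the x-axis position and the adjusted p-value (as -log10) is the y-axis position; the label determines the point's color.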

Table 1. Summary of mapping rate

Figure 2. Read distribution on reference genome

Figure 3. Mapped reads on chromosomes

About Novogene

Novogene is a leading provider of genomic services and solutions with cutting-edge NGS and bioinformatics expertise, and has the largest sequencing capacity in the world. We provide a variety of services to our clients, such as: (1) DNA services (e.g., human genome resequencing, de novo sequencing, microbial sequencing), (2) transcriptomic services (messenger RNA (mRNA) sequencing, non-coding RNA (ncRNA) sequencing, isoform sequencing, single-cell RNA sequencing), (3) epigenetic services (whole genome bisulfite sequencing (WGBS), ChIP sequencing, ATAC sequencing, RIP sequencing) and (4) other services (premade library sequencing, customized analysis, etc.).

Eager to learn more? Check out our previous posts for more insights!

What You Can Explore with Non-coding RNA Data

WGS vs WES: Which Genetic Sequencing Method is Right for You?

Expanding Horizons in Genomic Research with Long-Read Sequencing

Last edited by Novogene; 07-25-2024, 12:48 PM.
Tags: data analysis, rna sequencing, rna-seq workflow
