
**dpryan** · 03-05-2013, 09:31 AM

Just to clarify, you mean small transcripts rather than small peptides and sequencing nucleotides rather than amino acids, yes? You switch between protein and DNA/RNA nomenclature throughout, so it's difficult to be absolutely certain whether you're doing RNAseq or some sort of protein sequencing.

**gege_55** · 03-05-2013, 03:12 PM

Yes, I meant small transcripts. In the end I want to translate the RNAseq assembly to find the peptide sequences. So the transcripts of interest will be 90 to 120 bp long (maybe a bit longer with UTR).

Sorry for the confusion.

**dpryan** · 03-07-2013, 02:06 AM

Since no one else has given any input, I'll give a little advice.

Firstly, keep in mind that you can only look for a fraction of the possible peptides. Peptides can be divided into (in this case) 3 groups: small proteins arising from their own genes (e.g. insulin), cleaved proteins arising from larger genes (e.g. amyloid beta from APP), and synthesized small peptides (e.g. a lot of the neurotransmitters). You will only have a chance of finding the first group with RNAseq (unless you can predict cleavage sites, in which case you could maybe predict the second group as well).

What I would do is first do the de novo assembly and then exclude any contigs >300 bp (since you indicated that those are unlikely candidates). I would also filter out contigs that are too small (otherwise you'll get a bunch of small ncRNAs). I would then filter further by minimum open reading frame length: obviously, if a contig's longest ORF is only 30 bp, it's unlikely to meet your criterion. You might be able to rank the remaining candidates by taking codon usage or similar characteristics into account (you would want to check first whether these characteristics are predictive in other species). In short, see what properties distinguish these sorts of peptides in species where the peptides are known.
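To make the length and ORF filters concrete, here is a rough Python sketch. The 300 bp upper limit comes from the discussion above; the lower contig limit (`min_len`), the minimum ORF length (`min_orf`), and all function names are placeholder assumptions that you would need to tune for your data. It also assumes a simple ATG-to-stop ORF definition scanned over all six frames, which will miss ORFs that are truncated at a contig end.

```python
# Sketch only: thresholds and function names are illustrative assumptions.
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def longest_orf(seq):
    """Length in nt of the longest ATG..stop ORF (stop codon included),
    searched over all six reading frames. Returns 0 if none is found."""
    best = 0
    seq = seq.upper()
    for strand in (seq, reverse_complement(seq)):
        for frame in range(3):
            start = None  # position of the first ATG in the current ORF
            for i in range(frame, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    best = max(best, i + 3 - start)
                    start = None
    return best


def filter_contigs(contigs, min_len=100, max_len=300, min_orf=90):
    """Keep contigs whose length is within [min_len, max_len] and whose
    longest ORF is at least min_orf nt. Returns {name: longest ORF length}."""
    kept = {}
    for name, seq in contigs.items():
        if not (min_len <= len(seq) <= max_len):
            continue
        orf = longest_orf(seq)
        if orf >= min_orf:
            kept[name] = orf
    return kept


if __name__ == "__main__":
    example = {
        "good": "CC" + "ATG" + "GCT" * 30 + "TAA" + "CC",
        "too_long": "ATG" + "GCT" * 130 + "TAA",
        "short_orf": "ATG" + "GCT" * 5 + "TAA" + "C" * 100,
    }
    print(filter_contigs(example))
```

In practice you would read the contigs from the assembler's FASTA output rather than a hard-coded dictionary, and an established ORF finder would be a more robust choice than this hand-rolled scan.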

That's probably the best you can do without having any predicted conserved motifs. Presumably the real transcript would be present in a lot of copies, so you would want to filter by abundance too. If these peptides tend to have a similar structure, you might try a structure prediction on the ORFs and filter accordingly. Of course, this assumes that the actual peptide isn't just cleaved from something else...
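As a rough illustration of the abundance filter, the sketch below ranks candidate contigs by approximate mean coverage (mapped reads × read length / contig length). It assumes you already have per-contig mapped read counts from whatever aligner or quantifier you use; the read length, the `min_coverage` cutoff, and the function names are all hypothetical and need adjusting to your experiment.

```python
# Sketch only: read length, cutoff and names are illustrative assumptions.

def mean_coverage(read_count, read_length, contig_length):
    """Approximate mean per-base coverage of a contig."""
    return read_count * read_length / contig_length


def rank_by_coverage(contigs, read_length=100, min_coverage=10.0):
    """contigs maps name -> (contig_length, mapped_read_count).
    Returns (name, coverage) pairs at or above min_coverage,
    sorted from highest to lowest coverage."""
    scored = [
        (name, mean_coverage(count, read_length, length))
        for name, (length, count) in contigs.items()
    ]
    kept = [(name, cov) for name, cov in scored if cov >= min_coverage]
    return sorted(kept, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    candidates = {"a": (200, 50), "b": (250, 5), "c": (150, 300)}
    print(rank_by_coverage(candidates))
```

A fixed cutoff like this is crude; comparing candidates against the coverage distribution of the whole assembly would probably be more meaningful.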

Personally, I would look into screening proteins at the same time, since that'll probably be more informative (heck, a couple of 2D gels could probably narrow things down both in size and in other characteristics).


Finding SMALL peptides in de novo assembly
