
  • Would like to extract CDS regions and concatenate FASTA sequences

    Hello, everyone,

    Could you help me with the following problem?

    I have several hundred candidate genes that I am interested in, and I have extracted the corresponding FASTA sequences from the transcriptome databases of other species. Now I would like to keep only the CDS region of each sequence and discard the UTRs. Afterwards, I would like to concatenate all of the FASTA sequences for each species. I suspect Perl or Python could do this, but I have not found a suitable answer online or in this forum. Does anyone have experience with this?

    Thanks in advance!

    All the best,

    Sadiexiaoyu

  • #2
    This sounds a little tricky. I often find it useful to concatenate all sequences in a FASTA file into a single sequence (normally padded with some number of Ns between the original sequences), so the BBMap package contains fuse.sh for this purpose. But that assumes you already have the exonic sequence. You'll need to look for another tool to get that...
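    To make the padded concatenation concrete, here is a minimal Python sketch of the idea. It is not fuse.sh and does not claim to reproduce its behaviour; the function names (read_fasta, fuse_records, format_fasta), the default pad length, and the output name are all assumptions made for illustration.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def fuse_records(records, pad=10, name="fused"):
    """Join all sequences into one, with `pad` Ns between neighbours.

    `pad` and `name` are illustrative defaults, not values from any tool.
    """
    spacer = "N" * pad
    fused = spacer.join(seq for _, seq in records)
    return name, fused


def format_fasta(name, seq, width=60):
    """Render one record as FASTA text, wrapped at `width` characters."""
    lines = [">" + name]
    lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Tiny made-up input standing in for one species' FASTA file.
    example = [">gene1", "ATGAAA", "TAA", ">gene2", "ATGCCCTGA"]
    name, seq = fuse_records(read_fasta(example), pad=5, name="speciesA")
    print(format_fasta(name, seq), end="")
```

    Running it once per species file would give one fused record per species. Whether padding with Ns is appropriate depends on what the fused sequence is used for downstream.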

    Note that this is a bit more tricky in the presence of differential splicing, which you obviously have, given that your species has introns. In fact, it might be helpful if you could provide as much information as possible - for example, what species are these? What are you trying to do? What kind of data do you have? Etc.
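    If the inputs turn out to be transcript sequences with no CDS annotation available, one crude stand-in is to take the longest open reading frame in each transcript. The sketch below is only that: it assumes the standard stop codons, requires an ATG start, ignores partial ORFs that run off the end of the sequence, and will pick the wrong region for some genes. The function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code assumed
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def longest_orf(seq, both_strands=False):
    """Return the longest ATG...stop ORF (stop codon included), or "" if none.

    Only complete ORFs are considered; an ATG with no in-frame stop
    before the end of the sequence is ignored.
    """
    seq = seq.upper()
    strands = [seq]
    if both_strands:
        strands.append(reverse_complement(seq))

    best = ""
    for strand in strands:
        for frame in range(3):
            start = None  # position of the first ATG of the current ORF
            for i in range(frame, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    orf = strand[start:i + 3]
                    if len(orf) > len(best):
                        best = orf
                    start = None  # look for the next ORF in this frame
    return best


if __name__ == "__main__":
    # Made-up transcript: 2 nt of 5' UTR, a short ORF, 2 nt of 3' UTR.
    print(longest_orf("GGATGAAACCCTAAGG"))
```

    Dedicated ORF or CDS prediction tools handle partial transcripts, alternative genetic codes, and scoring far better than this; the sketch is only meant to show what "trim the UTRs" amounts to when the coordinates are not known in advance.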
