  • weird kmer-content peak in RNA-seq data

    Hi all,

    I have attached a pic of the kmer-content of my RNA-seq experiment.
    Input was a FASTQ file: 51 bp reads, over 30 million of them, from RNA-seq on an Illumina HiSeq.
    At about the 21st position in the reads, I see the AAAAA 5-mer suddenly rising. Does anyone have a clue what might be causing that? I see it in all my samples. RNA samples were collected after arresting translation with cycloheximide.
    Could it be that something is wrong with the fragment size? If so, how do I check that?

    Thanks a million!

    Karel

  • #2
    Hi Karel
    It's possible that the AAAAA k-mer you're seeing at around position 21 comes from arrested transcripts caused by the cycloheximide treatment. You can check this by looking at the sequence length distribution plot in FastQC. However, even if all your reads are a uniform 51 bp, that doesn't rule out terminated transcripts as the source of the k-mer, because you may have sequenced the 3' UTR. I would probably address this computationally by separating out the reads with the AAAAA k-mer at around base 21, trimming them, and then running the shorter (i.e. 19 bp) and longer (i.e. 50 bp) reads through my analysis pipeline separately. I'm assuming the goal of your experiment is to compare genes that are transcribed rapidly, or in response to whatever stimulus you gave your cells before CHX treatment, with genes that are transcribed slowly or not at all in response to the stimulus. This should tell you something about these genes.
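
To make the "separate, then trim" suggestion above concrete, here is a minimal Python sketch. It is not a tool mentioned in this thread: the function names `parse_fastq` and `split_on_kmer`, the default k-mer `AAAAA`, and the search window (0-based start positions 18 to 22, i.e. "around base 21") are all assumptions chosen for illustration. It also assumes a well-formed, uncompressed four-line-per-record FASTQ.

```python
from typing import Iterable, Iterator, List, Tuple

Read = Tuple[str, str, str]  # (header, sequence, quality)


def parse_fastq(lines: Iterable[str]) -> Iterator[Read]:
    """Yield (header, sequence, quality) from 4-line FASTQ records.

    Assumes a well-formed file; a truncated record will raise an error.
    """
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # skip stray blank lines between records
        seq = next(it).rstrip("\n")
        next(it)  # the '+' separator line, ignored here
        qual = next(it).rstrip("\n")
        yield header, seq, qual


def split_on_kmer(
    reads: Iterable[Read],
    kmer: str = "AAAAA",  # assumed: the 5-mer reported in the first post
    first: int = 18,      # assumed window, 0-based, inclusive
    last: int = 22,
) -> Tuple[List[Read], List[Read]]:
    """Split reads into (trimmed, untouched).

    A read goes into `trimmed` if `kmer` starts at a position between
    `first` and `last`; it is cut just before the k-mer. All other
    reads are returned unchanged in `untouched`.
    """
    trimmed: List[Read] = []
    untouched: List[Read] = []
    for header, seq, qual in reads:
        # str.find needs the whole k-mer inside [first, last + len(kmer))
        hit = seq.find(kmer, first, last + len(kmer))
        if hit != -1:
            trimmed.append((header, seq[:hit], qual[:hit]))
        else:
            untouched.append((header, seq, qual))
    return trimmed, untouched
```

For a real file you would pass an open file handle to `parse_fastq` (for example `with open("sample.fastq") as fh:`, where the file name is a placeholder) and write the two lists to separate FASTQ files before running each through the pipeline. Whether a fixed window is the right criterion depends on the data, so treat the defaults as a starting point only.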



    • #3
      Hi
      I think the k-mer content of my data is pretty darn bad. Is there any way to filter or trim them from my data and analyze them separately?
      Any useful comment is highly appreciated.