  • Finding transcriptome matches for thousands of 21-mers

    I have received a project and could use some advice.

    My task is to find sequence matches in mRNA databases across 20+ taxa. I was given an EPA memorandum outlining a method that I have been asked to replicate. I am not sure it is the best method, and the memorandum doesn't go into much detail. I have experimented a bit and have some specific questions.

    Setup:
    I have an unspecified number (I haven't been told yet) of dsRNA segments, each approximately 300 bp long. I need to match these against the human transcriptome and the transcriptomes of more than 20 other taxa.

    Criteria:
    For each 300 bp dsRNA, I am to find the mRNAs in each taxon that have 14 or more matches within a 21 bp window. I then sort the results by taxon, matched transcript, and the annotation of the matched mRNA.
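One possible reading of this criterion, sketched in Python: slide a 21-mer along a transcript without gaps and count identical positions at each offset. The names `best_window_matches` and `MIN_MATCHES` are illustrative only. The sketch assumes "matches" means identical bases anywhere in the window (not a contiguous run), checks the forward strand only, and is a brute-force illustration rather than something that scales to whole transcriptomes.

```python
MIN_MATCHES = 14  # threshold from the criterion above


def best_window_matches(kmer, transcript):
    """Return (match_count, offset) for the best ungapped placement.

    Returns (0, -1) if the transcript is shorter than the k-mer.
    """
    k = len(kmer)
    best = (0, -1)
    for offset in range(len(transcript) - k + 1):
        window = transcript[offset:offset + k]
        # Count positions where the k-mer and the window agree.
        matches = sum(a == b for a, b in zip(kmer, window))
        if matches > best[0]:
            best = (matches, offset)
    return best


def passes_criterion(kmer, transcript, min_matches=MIN_MATCHES):
    """True if any ungapped placement reaches the match threshold."""
    return best_window_matches(kmer, transcript)[0] >= min_matches
```

Under this reading, a 21-mer that appears verbatim in a transcript scores 21, and one that agrees at only 13 positions in its best placement is rejected.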

    Approach (this is where I have questions):
    The EPA memorandum says the Burrows-Wheeler Aligner (BWA) was used to align a 21-mer sliding window along the target transcriptomes, looking for 14 or more matches within the window. The PI said to create all 21-mers using a sliding window along the dsRNA sequence. Easy enough.
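A minimal Python sketch of the sliding-window step, producing every 21-mer from one dsRNA sequence as FASTA text. The ID scheme `name_start_end` is modeled on the read names in the sample output further down; treating the start as 0-based and the end as exclusive is an assumption, as are the function names.

```python
def sliding_kmers(name, seq, k=21):
    """Yield (kmer_id, kmer) for every k-length window of seq."""
    seq = seq.upper()
    for start in range(len(seq) - k + 1):
        end = start + k  # exclusive end (assumed convention)
        yield f"{name}_{start}_{end}", seq[start:end]


def to_fasta(records):
    """Format (id, sequence) pairs as FASTA text."""
    return "".join(f">{kmer_id}\n{kmer}\n" for kmer_id, kmer in records)


if __name__ == "__main__":
    # Toy 25 bp sequence; a real dsRNA segment would be about 300 bp.
    print(to_fasta(sliding_kmers("seqA", "ACGTACGTACGTACGTACGTACGTA")), end="")
```

A sequence of length n yields n - 21 + 1 windows, so a 300 bp segment gives 280 query 21-mers.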

    Here are my questions:
    1. Is BWA the best approach to use? I've never used BWA-MEM for anything so small. Is there a better approach?
    2. How should I set the BWA parameters for this case? The defaults are inadequate, but I'm just taking stabs in the dark to see what falls out. So far, I have adjusted:
       a. minimum seed length (-k): down to 3
       b. band width (-w): down to 7
       c. minimum alignment score to output (-T): ranging from 1 to 21
       d. gap open penalty (-O): between 1 and 6
       e. mismatch penalty (-B): between 1 and 4
    3. Why do I see a bitwise flag of 0? As I adjust the parameters, the resulting SAM contains matches where the bitwise flag is 0. This seems like nonsense to me, suggesting that I may be on the wrong track.

    Sample Execution:
    ./bwa mem -k 5 -B 1 -O 2 -T 5 ../ncbi_dataset/GCF000001405.40.rna.fna ../seqA.fasta | gzip -3 > ../bwa_results/aln_seqA.sam.gz

    Bitwise Flag == 0?
    seqA_332_353 16 XM_011510229.4 7276 0 7S14M * 0 0 TGATCGGTGTAAATCCCATAT * NM:i:0 MD:Z:14 AS:i:14 XS:i:14
    seqA_333_354 0 XM_017008212.3 1680 0 7S14M * 0 0 TATGGGATTTACACCGATCAA * NM:i:0 MD:Z:14 AS:i:14 XS:i:13
    seqA_334_355 0 XM_017008212.3 1680 0 6S15M * 0 0 ATGGGATTTACACCGATCAAC * NM:i:1 MD:Z:14A0 AS:i:14 XS:i:13
    seqA_335_356 0 XM_017008212.3 1680 0 5S14M2S * 0 0 TGGGATTTACACCGATCAACT * NM:i:0 MD:Z:14 AS:i:14 XS:i:13
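For reference when reading records like these: the FLAG column is a bit field defined by the SAM specification, so a value of 0 means no bits are set (an unpaired read, mapped, on the forward strand, as a primary alignment), while 16 sets only the reverse-strand bit. The Python sketch below decodes a flag and counts matched bases by summing the numbers in the MD tag. The helper names are illustrative, and splitting on whitespace is an assumption made to suit the pasted records; real SAM files are tab-delimited.

```python
import re

# Bit meanings from the SAM specification.
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAG_BITS.items() if flag & bit]


def md_matches(md):
    """Count matching bases: the numbers in an MD string are match-run lengths."""
    return sum(int(n) for n in re.findall(r"\d+", md))


def summarize(sam_line):
    """Pull read name, decoded flag, reference, and match count from one record."""
    fields = sam_line.split()
    tags = {}
    for field in fields[11:]:  # optional TAG:TYPE:VALUE fields
        key, _type, value = field.split(":", 2)
        tags[key] = value
    return {
        "read": fields[0],
        "flag_bits": decode_flag(int(fields[1])),
        "ref": fields[2],
        "matches": md_matches(tags["MD"]),
    }
```

Applied to the third record above (CIGAR 6S15M, MD:Z:14A0), `summarize` reports an empty flag list and 14 matched bases.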

