  • BFAST and miRNA precursor reference

    Dear users,
    I'm using BFAST to align miRNA reads from SOLiD ABI to the precursor miRNA reference.

    I can't understand one thing.

    When I use the miRNA precursor reference in this form (for example):
    >hsa-mir-548d-1 MI0003668 Homo sapiens miR-548d-1 stem-loop
    AAACAAGUUAUAUUAGGUUGGUGCAAAAGUAAUUGUGGUUUUUGCCUGUAAAAGUAAUGG
    CAAAAACCACAGUUUCUUUUGCACCAGACUAAUAAAG
    >hsa-mir-661 MI0003669 Homo sapiens miR-661 stem-loop
    GGAGAGGCUGUGCUGUGGGGCAGGCGCAGGCCUGAGCCCUGGUUUCGGGCUGCCUGGGUC
    UCUGGCCUGCGCGUGACUUUGGGGUGGCU
    ...
    ...

    (extracted from miRBase v13)
    the program fails at the localalign step with an error saying that the read and the reference don't match.

    If I use the same reference but add 35 Ns at the beginning and at the end of each miRNA sequence, for example:

    >hsa-mir-1277
    NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNACCTCCCAAATATATATATATATGTACGTATGTGTATATAAATGTATACGTAGATATATATGTATTTTTGGTGGGTTTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
    >hsa-mir-1278
    NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNATTTGCTCATAGATGATATGCATAGTACTCCCAGAACTCATTAAGTTGGTAGTACTGTGCATATCATCTATGAGCGAATAGNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
    ...
    ...

    the program runs and the alignment completes.
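
For readers who want to reproduce the padding described above, here is a minimal Python sketch. It assumes a plain-text FASTA input; the function name `pad_fasta` and its `pad` parameter are illustrative and not part of BFAST. It only adds the Ns and joins wrapped sequence lines; it does not explain why padding changes the localalign behavior.

```python
def pad_fasta(lines, pad=35):
    """Join multi-line FASTA records and flank each sequence with `pad` Ns."""
    flank = "N" * pad
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                # emit the previous record before starting a new one
                yield header
                yield flank + "".join(chunks) + flank
            header, chunks = line, []
        else:
            chunks.append(line)
    if header is not None:
        # emit the final record
        yield header
        yield flank + "".join(chunks) + flank


# Hypothetical toy input, padded with 3 Ns for readability
for out_line in pad_fasta([">a", "ACGT", "TT", ">b", "GG"], pad=3):
    print(out_line)
```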

    I can't explain this behavior.

    I then compared the number of counts reported by BFAST (about 75000) with the counts reported by the RNA_pipeline of Corona Lite (ABI) (about 22000).
    How can I explain such a large difference?
    Thank you very much for the help!

    Maria Elena

  • #2
    What version are you using? What post processing options are you using?



    • #3
      Thanks, Dr. Homer,
      I'm using bfast-0.6.0d

      After the fasta2brg and index steps, I run the following commands:

      > bfast match -f hsa_human.fa -r file.fastq -A 1 > file.bmf

      > bfast localalign -f hsa_human.fa -m file.bmf -A 1 > file.baf

      > bfast postprocess -f hsa_human.fasta -a 3 -O 3 -i file.baf -A 1 > output.sam

      With the FASTA file without the N padding, the program crashes at the localalign step.



      • #4
        Only valid DNA bases and N are allowed in the reference (so only ACGTN). It looks like you have Us in your miRNA reference. Convert those to Ts.
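
A minimal Python sketch of that conversion, assuming a plain-text FASTA reference. The helper names `rna_fasta_to_dna` and `invalid_bases` are illustrative and not part of BFAST. Header lines are passed through untouched so that a U in a sequence name is not altered.

```python
VALID = set("ACGTN")  # the alphabet stated above


def rna_fasta_to_dna(lines):
    """Yield FASTA lines with U/u replaced by T/t in sequence lines only."""
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            yield line  # header lines pass through unchanged
        else:
            yield line.replace("U", "T").replace("u", "t")


def invalid_bases(lines):
    """Return the set of characters outside ACGTN found in sequence lines."""
    bad = set()
    for line in lines:
        if not line.startswith(">"):
            bad |= set(line.strip().upper()) - VALID
    return bad


example = [
    ">hsa-mir-661 MI0003669 Homo sapiens miR-661 stem-loop",
    "GGAGAGGCUGUGCUGUGGGGCAGGCGCAGGCCUGAGCCCUGGUUUCGGGCUGCCUGGGUC",
    "UCUGGCCUGCGCGUGACUUUGGGGUGGCU",
]
converted = list(rna_fasta_to_dna(example))
print(invalid_bases(example))    # characters that are not in ACGTN
print(invalid_bases(converted))  # should be empty after conversion
```

Running `invalid_bases` on the reference before indexing is a quick way to confirm whether any character outside ACGTN is present.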



        • #5
          Dr. Homer,
          there are no Us in my reference; it contains only DNA bases. For example:

          >hsa-let-7a-2
          AGGTTGAGGTAGTAGGTTGTATAGTTTAGAATTACATCAAGGGAGATAACTGTACAGCCTCCTAGCTTTCCT
          >hsa-let-7a-3
          GGGTGAGGTAGTAGGTTGTATAGTTTGGGGCTCTGCCCTGCTATGGGATAACTATACAATCTACTGTCTTTCCT

          So, I don't think that this is the problem!



          • #6
            Give me your reference and a set of reads and I will test it out myself. Thanks!

            Nils



            • #7
              Has anyone tried aligning to the mature sequences instead of the entire precursor using BFAST? I've recently jumped on the BFAST bandwagon for genomic data and am trying to find out whether miRNA is feasible. I have two approaches: 1. align to only the mature sequences to identify the known sequences; 2. align to the entire genomic reference and look 100 bp upstream and downstream of the aligned position for a second hit in the reverse complement, to identify a potential loop structure.
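
The second approach could be sketched roughly as follows in Python. This is only an illustration under simplifying assumptions: the genome is an in-memory string, the hit is given as 0-based `start`/`end` coordinates, and the partner arm is matched by Hamming distance without gaps. Real hairpin arms are imperfectly paired and may contain bulges, so a structure-aware method would be needed in practice. The function names, `flank`, and `max_mismatches` are all hypothetical.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (ACGTN only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_partner_arm(genome, start, end, flank=100, max_mismatches=3):
    """Scan `flank` bp on either side of genome[start:end] for a window that
    is close to the reverse complement of the hit.
    Returns a list of (position, mismatches) tuples."""
    target = reverse_complement(genome[start:end])
    size = end - start
    lo = max(0, start - flank)
    hi = min(len(genome), end + flank)
    hits = []
    for pos in range(lo, hi - size + 1):
        if pos < end and pos + size > start:
            continue  # skip windows that overlap the original hit
        mismatches = hamming(genome[pos:pos + size].upper(), target)
        if mismatches <= max_mismatches:
            hits.append((pos, mismatches))
    return hits


# Toy hairpin: arm, short loop, then the arm's reverse complement
arm = "ACGGTCATGCTAGCTTGACC"
genome = "T" * 20 + arm + "A" * 8 + reverse_complement(arm) + "T" * 20
print(find_partner_arm(genome, 20, 40, max_mismatches=0))
```

Raising `max_mismatches` makes the search more tolerant of imperfect pairing, at the cost of more spurious candidates in low-complexity regions.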
