  • mapping microRNA data -- less than 1% maps

    Hi guys,

    I am not new to NGS/RNA-Seq, but I am new to miRNA sequencing. I want to map human miRNA reads (HiSeq 2500, single-end, 50 bp) to the miRBase mature and hairpin databases (both downloaded from http://www.mirbase.org/ftp.shtml; I also checked that the sequence headers contain "human", so I am not mapping against the wrong database). This is the procedure I follow:

    (1) using cutadapt, remove Illumina adapters and discard reads that are too short (<17 bp) or too long (>35 bp) after adapter removal. This works fine, with no errors or warnings, and removes 5-10% of reads from the initial fastq files.
    (2) index the miRNA databases.
    (3) map trimmed reads (1) to the indexed databases (2)
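
As a sanity check on the reference used in step (2), here is a minimal Python sketch that keeps only human entries from a miRBase-style FASTA. It assumes human records have headers starting with `hsa-`; the function names and the two example records are illustrative, not part of the pipeline above.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def select_species(records, prefix="hsa-"):
    """Keep records whose header starts with the species prefix (assumed naming)."""
    return [(h, s) for h, s in records if h.startswith(prefix)]


# Illustrative records written in miRBase header style.
example = """>hsa-let-7a-5p MIMAT0000062 Homo sapiens let-7a-5p
UGAGGUAGUAGGUUGUAUAGUU
>cel-let-7-5p MIMAT0000001 Caenorhabditis elegans let-7-5p
UGAGGUAGUAGGUUGUAUAGUU
""".splitlines()

human = select_species(read_fasta(example))
print(len(human), "human record(s):", [h.split()[0] for h, _ in human])
```

On a real download, the same functions can be fed an open file handle instead of the example list.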

    However, whether I use stampy, bwa, bowtie, or bowtie2, I get less than 0.5% of reads mapped. I believe I'm doing something wrong at the indexing step or am missing something at the alignment step (duh...). Does anyone have an idea of what I could be doing wrong?
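
For reference, the mapped percentage quoted above can be recomputed directly from an alignment file. The sketch below is a hedged Python example that counts reads with the SAM unmapped flag (0x4) cleared; it assumes the file really is SAM (bowtie 1, for instance, only writes SAM when run with `-S`), and the function name and example records are made up for illustration.

```python
def mapped_fraction(sam_lines):
    """Return (mapped, total) counts of primary records from SAM text lines."""
    mapped = total = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & 0x900:
            continue  # skip secondary (0x100) and supplementary (0x800) records
        total += 1
        if not flag & 0x4:  # bit 0x4 set means the read is unmapped
            mapped += 1
    return mapped, total


# Two illustrative records: one mapped (flag 0), one unmapped (flag 4).
example = [
    "@SQ\tSN:hsa-let-7a-5p\tLN:22",
    "r1\t0\thsa-let-7a-5p\t1\t255\t22M\t*\t0\t0\tTGAGGTAGTAGGTTGTATAGTT\t" + "I" * 22,
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGTACGTACGTACGTAC\t" + "I" * 18,
]

mapped, total = mapped_fraction(example)
print(f"{mapped}/{total} reads mapped ({100.0 * mapped / total:.1f}%)")
```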

    You can find all my commands here:
    jknightlab/mirna_pipeline (bioinformatics pipeline to analyze micro RNA sequencing data)

    I also copy them here:

    **Stampy**

    > stampy.py -g human_mature_mirna -H human_mature_mirna
    > stampy.py -g human_mature_mirna -h human_mature_mirna -M reads.fastq -o alignment.stampy.sam


    **bowtie**

    > bowtie-build mature_dna_human.fa mature_mirna
    > bowtie -l 8 mature_mirna reads.fastq > alignment.bowtie.sam

    **bowtie2**

    > bowtie2-build mature_dna_human.fa mature_mirna.bowtie2
    > bowtie2 -L 8 -x mature_mirna.bowtie2 reads.fastq > alignment.bowtie2.sam

    **BWA**

    > bwa index -a is mature_dna_human.fa
    > samtools faidx mature_dna_human.fa
    > java -jar CreateSequenceDictionary.jar REFERENCE=mature_dna_human.fa OUTPUT=mature_dna_human.dict
    > bwa aln -l 8 Database_for_mirna/mature_dna_human.fa reads.fastq > alignment.bwa.sai

    I hope someone could help!

    Cheers,
    Irina

  • #2
    Hi Irina,

    I'm facing the same problem now. Did you find a solution? I tried aligning my miRNA-seq data to the mature sequences from miRBase, but most of the reads did not align.

    Do you have any tips?

    best,


    • #3
      For starters, try aligning to the whole genome to see (1) whether the unmapped reads come from your species or from contaminants, and (2) where they align (e.g., whether the library is actually other RNA rather than miRNA).


      • #4
        It's unlikely that the alignment or indexing is messed up across so many aligners. I would suspect that the adapter trimming is not working properly.

        Try running FastQC on your fastq files before and after trimming. Is there a big difference? Do the reads fall in the expected size range? Are there plenty of 21-nt reads left?
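
If FastQC is not at hand, the read-length check can be approximated with a few lines of Python. This is only a sketch: it assumes plain four-line FASTQ records, and the function name and example reads are illustrative.

```python
from collections import Counter


def length_histogram(fastq_lines):
    """Count read lengths, taking the sequence as line 2 of every 4-line record."""
    counts = Counter()
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:
            counts[len(line.strip())] += 1
    return counts


# Three illustrative reads: two of 22 nt and one untrimmed 50-nt read.
example = [
    "@r1", "TGAGGTAGTAGGTTGTATAGTT", "+", "I" * 22,
    "@r2", "TAGCTTATCAGACTGATGTTGA", "+", "I" * 22,
    "@r3", "ACGT" * 12 + "AC", "+", "I" * 50,
]

hist = length_histogram(example)
for length in sorted(hist):
    print(length, hist[length])
```

After successful trimming of a miRNA library, most of the counts would be expected around 21-23 nt; a histogram still dominated by full-length reads suggests the adapter was not found.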

        Also, additional adapters which your provider did not tell you about may be present.


        • #5
          When you downloaded the human microRNA sequences, did you replace the Us with Ts? The input to bowtie-build should be DNA (with Ts), not RNA; it skips Us.
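
A minimal Python sketch of that U-to-T conversion, assuming a standard FASTA layout where header lines start with `>`; the function name and example record are illustrative.

```python
def rna_to_dna(fasta_lines):
    """Replace U/u with T/t on sequence lines; header lines are left untouched."""
    table = str.maketrans("Uu", "Tt")
    for line in fasta_lines:
        line = line.rstrip("\n")
        yield line if line.startswith(">") else line.translate(table)


# Illustrative record in miRBase header style.
example = [
    ">hsa-let-7a-5p MIMAT0000062 Homo sapiens let-7a-5p",
    "UGAGGUAGUAGGUUGUAUAGUU",
]

converted = list(rna_to_dna(example))
print("\n".join(converted))
```

Headers are skipped on purpose so that a name containing the letter U is not altered.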


          • #6
            Originally posted by wingless:
            When you downloaded the human microRNA sequences, did you replace the Us with Ts? The input to bowtie-build should be DNA (with Ts), not RNA; it skips Us.
            This. Also, the small RNA adapter is different from the "standard" Illumina adapter, so make sure you trimmed the correct adapter sequence.
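
One rough way to see which adapter is actually in the raw reads is to count how many reads contain the start of each candidate. The Python sketch below does this by exact substring match. The two adapter sequences are assumptions (the commonly cited TruSeq Small RNA 3' adapter and the common TruSeq adapter prefix) and should be checked against the library kit documentation; the function name and example reads are illustrative.

```python
ADAPTERS = {
    "small_rna_3p": "TGGAATTCTCGGGTGCCAAGG",  # assumed: TruSeq Small RNA 3' adapter
    "truseq_universal": "AGATCGGAAGAGC",      # assumed: common TruSeq adapter prefix
}


def adapter_hits(fastq_lines, adapters=ADAPTERS, seed_len=12):
    """Count reads containing the first seed_len bases of each candidate adapter."""
    seeds = {name: seq[:seed_len] for name, seq in adapters.items()}
    hits = {name: 0 for name in adapters}
    total = 0
    for i, line in enumerate(fastq_lines):
        if i % 4 != 1:
            continue  # only the sequence line of each 4-line record
        total += 1
        read = line.strip().upper()
        for name, seed in seeds.items():
            if seed in read:
                hits[name] += 1
    return hits, total


# Illustrative raw reads: a 22-nt insert followed by adapter, and an unrelated read.
example = [
    "@r1", "TGAGGTAGTAGGTTGTATAGTT" + "TGGAATTCTCGGGTGCCAAGG" + "AACTCCA", "+", "I" * 50,
    "@r2", "ACGT" * 12 + "AC", "+", "I" * 50,
]

hits, total = adapter_hits(example)
for name, n in hits.items():
    print(f"{name}: {n}/{total} reads")
```

Exact matching ignores sequencing errors, so the counts are a lower bound; the point is only to see which candidate dominates before choosing the sequence to pass to the trimmer.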
