  • Can't get paired end data to align with Bowtie

    Hi to all of you,
    I'm more or less brand new at bioinformatics, eager to learn, but tend to end up mostly feeling stupid.

    I was given a set of paired-end miRNA files (run on an Illumina NextSeq 500) to identify novel miRNAs as potential biomarkers in blood. Before thinking too much about HOW to actually do the identification and differential analysis of those sequences, I read up on alignment tools. I ended up aligning my files with Bowtie2 after trimming with cutadapt, and got an overall alignment rate of around 94% for all files.

    Then I came across miRDeep2 and saw the potential of that pipeline. Since I also want to learn, I tried simply aligning my files with Bowtie first, thinking I could convert them before feeding them into the pipeline. However, for some reason I get only a 1% alignment rate for my reads. These are the same files that gave a 94% alignment rate with Bowtie2. When I aligned just one of the paired-end files, I got an alignment rate of over 80%. So why can't I get the paired-end files to align as nicely? Any suggestions? Now that I DO have paired-end data, it feels like a waste to use only one of the files as "single end".

    Bowtie for paired end:
    Code:
    ./bowtie -n 0 -l 15 -e 80 RefCat -q -1 /pathtomystorage/trimmed/trimmed_1.fq -2 /pathtomystorage/trimmed/trimmed_2.fq -S /pathtomystorage/alignment/bowtie_alignment.sam
    Result:
    # reads processed: 10229619
    # reads with at least one reported alignment: 91017 (0.89%)
    # reads that failed to align: 10138602 (99.11%)
    Reported 91017 paired-end alignments to 1 output stream(s)


    Bowtie for "single end":
    Code:
    ./bowtie -a --best -n 0 -l 15 -e 80 RefCat -q /pathtomystorage/trimmed/trimmed_1.fq -S /pathtomystorage/alignment/bowtie_alignment_1_.sam
    Result:
    # reads processed: 10229619
    # reads with at least one reported alignment: 8242967 (80.58%)
    # reads that failed to align: 1986652 (19.42%)
    Reported 73843253 alignments to 1 output stream(s)

    I have tried to adjust the parameters for -X, -I and --ff/--rf/--fr without much improvement...
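
A quick check of the figures in the two Bowtie reports above, sketched in shell. The `pct` and `ratio` helpers are hypothetical names introduced here, not part of Bowtie; the counts are copied from the output in the post. The sketch reproduces the reported percentages and shows that the single-end run, which used `-a` (report all valid alignments), reported roughly nine alignments per aligned read, so its "Reported ... alignments" line is not a count of reads. It does not by itself explain the low paired-end rate.

```shell
# pct A B: print A as a percentage of B, to two decimals (hypothetical helper).
pct() { awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f", 100 * a / b }'; }
# ratio A B: print A / B to two decimals (hypothetical helper).
ratio() { awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f", a / b }'; }

TOTAL=10229619            # "reads processed" (same in both reports)
PAIRED_ALIGNED=91017      # paired-end run: at least one reported alignment
SINGLE_ALIGNED=8242967    # single-end run: at least one reported alignment
SINGLE_REPORTED=73843253  # single-end run: total alignments reported (-a)

echo "paired-end rate: $(pct "$PAIRED_ALIGNED" "$TOTAL")%"
echo "single-end rate: $(pct "$SINGLE_ALIGNED" "$TOTAL")%"
echo "alignments per aligned read (single-end): $(ratio "$SINGLE_REPORTED" "$SINGLE_ALIGNED")"
```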

  • #2
    Have you scanned/trimmed these data files? Since you are working with miRNA, the reads are likely to contain Illumina adapters. Since Bowtie does not do gapped alignment, it is likely unable to align such reads.

    Your reads will probably overlap (since the insert size must be small). You could merge the R1/R2 reads (using a tool like BBMerge) before you do the scanning/trimming, followed by alignment.

    • #3
      Originally posted by GenoMax
      Have you scanned/trimmed these data files? Since you are working with miRNA, the reads are likely to contain Illumina adapters. Since Bowtie does not do gapped alignment, it is likely unable to align such reads.

      Your reads will probably overlap (since the insert size must be small). You could merge the R1/R2 reads (using a tool like BBMerge) before you do the scanning/trimming, followed by alignment.
      Thank you for your suggestion. I had trimmed the reads for adapters prior to alignment and find it strange I can align 94% with Bowtie2 but only 1% with Bowtie. Would you suggest I merge the trimmed files, or should I merge the untrimmed files?

      Thnx!

      • #4
        Well, merging reads prior to trimming with default options for BBMerge only merged 3.2% of my reads. 96.7% ended up as ambiguous, at least for one of the samples.

        • #5
          For BBMerge with micro-RNA, you need to add the flag mininsert=17. The default is 35, which is too long for micro-RNA libraries.
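
A minimal sketch of what that call might look like. `mininsert=17` is the flag from the post above; the file names are hypothetical placeholders, and `in1`/`in2`/`out`/`outu1`/`outu2` are BBMerge's usual input and output parameters. The script prints the command and only runs it if `bbmerge.sh` is on the PATH and both input files exist.

```shell
# Hypothetical file names; substitute your own R1/R2 FASTQ files.
R1="sample_R1.fq"
R2="sample_R2.fq"

# mininsert=17 lowers the minimum insert size from the default of 35,
# so that short miRNA inserts are allowed to merge.
CMD="bbmerge.sh in1=$R1 in2=$R2 out=merged.fq outu1=unmerged_1.fq outu2=unmerged_2.fq mininsert=17"

# Show the command, then run it only when the tool and inputs are available.
echo "$CMD"
if command -v bbmerge.sh >/dev/null 2>&1 && [ -f "$R1" ] && [ -f "$R2" ]; then
    $CMD
fi
```

Writing unmerged pairs to `outu1`/`outu2` keeps them available for separate inspection instead of discarding them.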

          • #6
            Originally posted by Brian Bushnell
            For BBMerge with micro-RNA, you need to add the flag mininsert=17. The default is 35, which is too long for micro-RNA libraries.
            Thank you kindly for that suggestion. I have now joined 94.2% of the reads instead of 3%.
