  • BWA on raw sequences?

    Hi guys,

    I previously ran Bowtie on short read sequences in raw format (i.e. one sequence per line with no other lines in the file). I want to compare the Bowtie alignment to BWA. I looked through the BWA manual (http://bio-bwa.sourceforge.net/bwa.shtml) and it seems to me that only fastq input is acceptable. Is that so? And if so, can I make a fake fastq with mock ID and quality lines and expect BWA to work appropriately?

    Also, I am not sure if I should use 'bwa aln' or 'bwa samse'. I want to produce SAM in the end, so it seems that I need to use samse. However I don't see how I can control any of the parameters with samse. I would like to discard alignments that map to more than one place in the genome, not to allow any gaps, at most allow one mismatch and run the program on 4 cores and produce SAM. It's really easy with Bowtie, but is it something that I can do with BWA?

    thanks
    "Let’s start with the three fundamental Rules of Robotics...."
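On the mock-FASTQ idea in the question above: a minimal sketch of the conversion, assuming the raw file holds exactly one read per line. The read IDs (`read1`, `read2`, ...), the constant quality character `I` (Phred 40 in the Phred+33 encoding) and the function name `raw_to_fastq` are placeholders chosen for illustration, not anything BWA prescribes. Uniform fake qualities carry no real information, so anything downstream that weighs base quality should be read with that in mind.

```shell
# Convert a raw reads file (one sequence per line) into FASTQ on stdout.
# Assumption: IDs and qualities are made up; every base gets quality 'I'.
raw_to_fastq() {
  awk 'NF {
    # header line with a mock ID, then the sequence, then the "+" separator
    printf "@read%d\n%s\n+\n", ++n, $0
    # quality line: one "I" per base, so it matches the sequence length
    q = $0
    gsub(/./, "I", q)
    print q
  }' "$1"
}
```

Usage would look like `raw_to_fastq reads.raw > reads.fastq`, where both file names are hypothetical.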

  • #2
    You run aln, then samse.

    aln with -o 0 -n 1 -t 4 should turn off gaps (-o is the maximum number of gap opens; -e only limits gap extensions), allow at most one mismatch, and run on 4 threads. After you run samse on the output, you can filter the .sam file with grep, or whatever. Unique alignments carry the tag XT:A:U; lines tagged XT:A:R are repetitive.
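A rough sketch of the single-end workflow described in this reply, assuming `bwa` is on the PATH and the reference has already been indexed. The function names and the output prefix are hypothetical. Gaps are disabled here with `-o 0`, since in the bwa manual `-o` is the maximum number of gap opens while `-e` limits gap extensions. The filter keeps SAM header lines plus records whose optional fields contain the `XT:A:U` tag.

```shell
# Align single-end reads and write <prefix>.sai and <prefix>.sam.
# Arguments (all names hypothetical): reference, reads file, output prefix.
align_single_end() {
  ref=$1; reads=$2; prefix=$3
  # no gap opens, edit distance of at most 1, 4 threads
  bwa aln -o 0 -n 1 -t 4 "$ref" "$reads" > "${prefix}.sai" &&
  # turn the intermediate .sai file into SAM
  bwa samse "$ref" "${prefix}.sai" "$reads" > "${prefix}.sam"
}

# Print header lines and uniquely aligned records from a SAM file.
keep_unique() {
  awk -F'\t' '
    /^@/ { print; next }              # SAM header lines
    {
      # optional tags start at field 12 in SAM
      for (i = 12; i <= NF; i++)
        if ($i == "XT:A:U") { print; next }
    }' "$1"
}
```

For example, `align_single_end ref.fa reads.fastq run1 && keep_unique run1.sam > run1.unique.sam`, with all file names being placeholders.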



    • #3
      bwa alignment how

      I'd like to know how to run bwa to align my paired-end reads to the reference genome.

      First I created the index for the bac genome reference: bwa index -a is /Reference/ref.fa

      Then I tried the following command to align:
      bwa sampe /DATA/Read1.fastq /DATA/Read2.fastq

      But I couldn't find an option to specify my reference sequence for the alignment.

      Also, should I use bwa aln as well? If so, why and how?

      Thanks



      • #4
        Sampe clearly takes five files as input, not just two, and yes, the name of the reference is the first of those inputs.

        The method for using bwa is that you index the sequence (you only have to do this once, no matter how many different datasets you align to that genome). Then you use bwa aln to make intermediate files, and sampe or samse to turn those intermediate files into .sams. Aln can be the most time-consuming step; that's the one where you really ought to use the -t option to utilize multiple processors, if you have them.

        Also, .sams are huge, you really ought to pipe the output of sampe into samtools view to convert it to a .bam right away. You can always convert it (or more likely, a small part of it) back to .sam with samtools view later if you need to eyeball something.
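The paired-end steps above can be sketched roughly as follows, assuming `bwa` and `samtools` are installed. The function names, the output prefix and the choice of 4 threads are illustrative assumptions; the paths in the usage line are the ones from post #3. The argument order for `sampe` is reference first, then the two `.sai` files, then the two read files.

```shell
# Build the index once per reference ("is" is the algorithm used in post #3).
index_reference() {
  bwa index -a is "$1"
}

# Align paired-end reads and write <prefix>.bam.
# Arguments: reference, read 1 file, read 2 file, output prefix (hypothetical).
align_paired_end() {
  ref=$1; r1=$2; r2=$3; prefix=$4
  # one aln run per read file; -t sets the number of threads
  bwa aln -t 4 "$ref" "$r1" > "${prefix}.1.sai" &&
  bwa aln -t 4 "$ref" "$r2" > "${prefix}.2.sai" &&
  # sampe pairs the two intermediate files; pipe straight into BAM
  bwa sampe "$ref" "${prefix}.1.sai" "${prefix}.2.sai" "$r1" "$r2" |
    samtools view -bS - > "${prefix}.bam"
}
```

With the files from post #3 this might be run as `index_reference /Reference/ref.fa` followed by `align_paired_end /Reference/ref.fa /DATA/Read1.fastq /DATA/Read2.fastq aligned`. A region can later be inspected as text with `samtools view aligned.bam`.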

