  • How to use bwa, and how to set parameters for 75bp reads?

    Hello everybody.
    I'm a new user of bwa, aligning Solexa reads to the reference genome hg19. I have finished bwa index for the hg19 genome, but I'm not clear on how to use the bwa aln command, especially the parameters of bwa aln.
    I would like to understand the following:

    1. What is the seed length? Is it equal to the read length? My reads are 75bp.

    2. How do I set the number of threads? How many is best?

    3. How do I set -q INT, "quality threshold for read trimming down to 35bp [0]"?
    Why 35bp?
    How should I set this parameter for my 75bp reads?

    4. Which of the parameters listed below are essential?

    Thank you all very much. Yours sincerely, Alex. Your help is highly appreciated.

    bwa aln ********************
    Options: -n NUM max #diff (int) or missing prob under 0.02 err rate (float) [0.04]
    -o INT maximum number or fraction of gap opens [1]
    -e INT maximum number of gap extensions, -1 for disabling long gaps [-1]
    -i INT do not put an indel within INT bp towards the ends [5]
    -d INT maximum occurrences for extending a long deletion [10]
    -l INT seed length [32]
    -k INT maximum differences in the seed [2]
    -m INT maximum entries in the queue [2000000]
    -t INT number of threads [1]
    -M INT mismatch penalty [3]
    -O INT gap open penalty [11]
    -E INT gap extension penalty [4]
    -R INT stop searching when there are >INT equally best hits [30]
    -q INT quality threshold for read trimming down to 35bp [0]
    -c input sequences are in the color space
    -L log-scaled gap penalty for long deletions
    -N non-iterative mode: search for all n-difference hits (slooow)
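
On the -q question: the help text only says reads are trimmed "down to 35bp". A minimal sketch of a BWA-style trimming rule follows, assuming the rule described in the bwa manual (trim the 3' end to the position that maximises the sum of INT minus base quality, never going below a 35bp floor). The function name `trimmed_length` and the constant `MIN_READ_LEN` are illustrative, not part of bwa, and the behaviour has not been checked against the bwa source.

```python
MIN_READ_LEN = 35  # assumption: the "down to 35bp" floor from the help text


def trimmed_length(quals, threshold, min_len=MIN_READ_LEN):
    """Return the read length kept after 3'-end quality trimming.

    quals     -- per-base Phred qualities (ints), 5' to 3'
    threshold -- the value that would be passed as -q INT
    """
    if threshold < 1 or not quals:
        return len(quals)  # trimming disabled (the default is 0)

    running = 0            # running sum of (threshold - quality) from the 3' end
    best = 0               # largest running sum seen so far
    best_len = len(quals)  # read length that achieved it

    # Walk from the last base towards the 5' end, stopping at the floor.
    for pos in range(len(quals) - 1, min_len - 1, -1):
        running += threshold - quals[pos]
        if running < 0:
            break          # good bases outweigh bad ones; stop looking
        if running > best:
            best = running
            best_len = pos
    return best_len


if __name__ == "__main__":
    read = [30] * 70 + [2] * 5       # 75bp read with a poor 5-base tail
    print(trimmed_length(read, 20))  # length kept under this sketch
```

Under this sketch, a 75bp read with good qualities throughout is left untouched, and a read that is poor everywhere is cut to 35bp but no further, which is where the "35bp" in the help text comes from.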

  • #2
    Try taking a random slice of your data & play with parameters. All of this is tuning -- the defaults will give you reasonable results.

    The number of threads is almost easy: no more than the number of processors (cores) you actually have. BFAST suggests using a power of 2 but bwa doesn't mention this; I don't know if it matters in that case. More threads are faster, though the speed-up is not quite linear.
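
The two suggestions above (a random slice of the data, and a thread count capped at the number of cores) can be sketched as follows. This is only an illustration: it assumes plain 4-line FASTQ records, and the names `sample_fastq` and `thread_count` are made up for this example.

```python
import os
import random


def sample_fastq(lines, k, seed=0):
    """Reservoir-sample up to k FASTQ records from an iterable of lines.

    Assumes every record is exactly 4 lines (header, sequence, '+', quality).
    Works on an open file handle, so the whole file is never held in memory.
    """
    rng = random.Random(seed)
    reservoir = []
    it = iter(lines)
    for index, header in enumerate(it):
        # Pull the remaining three lines of this record from the same iterator.
        record = [header] + [next(it, "") for _ in range(3)]
        if index < k:
            reservoir.append(record)
        else:
            # Keep the new record with probability k / (index + 1).
            j = rng.randint(0, index)
            if j < k:
                reservoir[j] = record
    return reservoir


def thread_count():
    """Upper bound for -t: the number of cores reported by the OS."""
    return os.cpu_count() or 1


if __name__ == "__main__":
    demo = []
    for i in range(10):
        demo += [f"@read{i}\n", "ACGT\n", "+\n", "IIII\n"]
    subset = sample_fastq(demo, 3)
    print(len(subset), "records sampled; use at most", thread_count(), "threads")
```

Writing the sampled records to a small FASTQ file gives a quick test set for trying different bwa aln settings before running the full data.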


    • #3
      Thank you very much.
      I ran into another question along the way.
      Can you tell me what bwa index is for? Which of the files generated by bwa index are used in the later steps? As far as I can see, none of the bwa index output is used by the subsequent bwa aln, bwa samse, etc.
      Why is that?


      • #4
        The seed is the first INT nucleotides of your read, which usually have high quality, so you can play with the length of the seed (-l) and the maximum number of differences allowed in it (-k).

        The index is a way to make alignment very fast by creating 8 files from your reference sequence. During alignment the program takes just the information it needs from the index, without going back to the reference sequence, which may be very long.
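
To make the link between the three steps concrete, here is a hedged sketch that only builds the command lines for a single-end run; it does not execute bwa. The file names `hg19.fa` and `reads.fq` and the helper `bwa_single_end_commands` are placeholders for illustration. The point is that bwa aln and bwa samse receive the same reference name that was given to bwa index, and they look up the index files through that prefix, which is why the index files themselves never appear on the command line.

```python
def bwa_single_end_commands(reference, reads, threads=1, seed_length=32, seed_diffs=2):
    """Return (argv, stdout_path) pairs for index -> aln -> samse.

    bwa aln and bwa samse write to standard output, so stdout_path names
    the file the caller should redirect that output into (None for index).
    """
    sai = reads + ".sai"
    sam = reads + ".sam"
    return [
        # Step 1: build the index files next to the reference FASTA.
        (["bwa", "index", reference], None),
        # Step 2: align; -l is the seed length, -k the max differences in the seed.
        (["bwa", "aln", "-t", str(threads), "-l", str(seed_length),
          "-k", str(seed_diffs), reference, reads], sai),
        # Step 3: convert the .sai alignments to SAM, again via the same prefix.
        (["bwa", "samse", reference, sai, reads], sam),
    ]


if __name__ == "__main__":
    for argv, out in bwa_single_end_commands("hg19.fa", "reads.fq", threads=4):
        line = " ".join(argv)
        print(line if out is None else line + " > " + out)
```

The defaults used here (seed length 32, 2 differences in the seed) are the ones shown in the bwa aln help text in the first post; for 75bp reads they can be left as they are and adjusted later on a test subset.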
