  • BWA with multi read files

    Hi, I have never used BWA before and was wondering if anyone knows how I can use it to map my reads to my reference genome. My data is as follows:

    L001_R1.fasta
    L001_R2.fasta
    L002_R1.fasta
    L002_R2.fasta
    etc
    There are 12 files in total and, as you can see, they are paired reads.
    So does anyone have any idea?
    Any help would be much appreciated.

    Thanks,
    Tom

  • #2
    Also, I have indexed my reference genomes (93 genomes together in a single FASTA file) and been given 5 files: Combined_genomes_index_bwa.amb, Combined_genomes_index_bwa.ann, Combined_genomes_index_bwa.bwt, Combined_genomes_index_bwa.pac and Combined_genomes_index_bwa.sa

    How do I use these for my mapping? I want to use bwa mem because the read lengths range from 40 to 140, but the manual doesn't state how to pass in these index files:

    bwa mem [-aCHMpP] [-t nThreads] [-k minSeedLen] [-w bandWidth] [-d zDropoff] [-r seedSplitRatio] [-c maxOcc] [-A matchScore] [-B mmPenalty] [-O gapOpenPen] [-E gapExtPen] [-L clipPen] [-U unpairPen] [-R RGline] [-v verboseLevel] db.prefix reads.fq [mates.fq]

    Again, any help would be great.


    • #3
      From the manual:

      Code:
      bwa mem ref.fa read1.fq read2.fq > aln-pe.sam
      Just make sure your FASTA file is in the same directory as your index files and there shouldn't be any problems.

      As for your first question, I presume the command above already shows how to use paired-end (PE) data.


      • #4
        Use the base name of the index files as 'db.prefix', so you would use
        'Combined_genomes_index_bwa' as the name of the index (and provide the complete path to the index files if necessary).
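
A minimal sketch of what that looks like on the command line, using the first lane's files from post #1. The directory `/path/to/index` and the output name `L001.sam` are hypothetical placeholders. The function only prints the command so it can be checked before running it.

```shell
# Print (not run) a bwa mem command for one pair of read files.
# idx_prefix is the shared base name of the .amb/.ann/.bwt/.pac/.sa files,
# not the name of any single index file.
print_bwa_mem_cmd() {
    idx_prefix=$1   # e.g. /path/to/index/Combined_genomes_index_bwa (hypothetical path)
    r1=$2           # first-in-pair reads
    r2=$3           # second-in-pair reads (mates)
    out=$4          # SAM file to write (name is an assumption)
    printf 'bwa mem %s %s %s > %s\n' "$idx_prefix" "$r1" "$r2" "$out"
}

print_bwa_mem_cmd /path/to/index/Combined_genomes_index_bwa \
    L001_R1.fasta L001_R2.fasta L001.sam
```

If the printed line looks right, it can be pasted into the terminal as-is.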

        The online manual doesn't seem to specify what to do if your reads are in more than 1 file. Try listing all the R1 files separated by commas but no spaces, followed by the R2 files, again separated by commas but no spaces.

        Code:
         L1_R1.fastq,L2_R1.fastq,L3_R1.fastq... L1_R2.fastq,L2_R2.fastq,L3_R2.fastq...
        If that doesn't work you could combine all your R1 files together and all your R2 files.
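
Combining the files could be sketched as below, assuming the lane files follow the `L001_R1.fasta` naming from post #1 and live in one directory. The output names `all_R1.fasta` and `all_R2.fasta` are made up for illustration. The important point is that R1 and R2 are joined in the same lane order, so read N in the combined R1 file still pairs with read N in the combined R2 file.

```shell
# Concatenate all lanes of one read direction into a single file.
# The shell expands the glob in sorted order, so calling this once for R1 and
# once for R2 joins both sets in the same lane order.
combine_lanes() {
    dir=$1      # directory holding the per-lane files
    read=$2     # R1 or R2
    out=$3      # combined output file (name is an assumption)
    cat "$dir"/L*_"$read".fasta > "$out"
}

# Example usage, assuming the lane files are in the current directory:
#   combine_lanes . R1 all_R1.fasta
#   combine_lanes . R2 all_R2.fasta
#   bwa mem Combined_genomes_index_bwa all_R1.fasta all_R2.fasta > all.sam
```

The output name should not itself match `L*_R1.fasta` or `L*_R2.fasta`, otherwise a second run would pick up the combined file as an input.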


        • #5
          bruce01, I have tried your method and it just didn't work at all: no error message, just a list of the commands. Will try the method you suggest now, mastal.
          Last edited by thh32; 02-26-2014, 05:45 AM.


          • #6
            Right, using mastal's method this error occurred: [E::main_mem] fail to open file ' LIST OF ALL FILES'

            It appears that if I leave spaces between the files it ignores the whole thing and prints a list of the commands; however, if I remove the spaces and use commas, it sees the list as one long file name which it cannot locate.
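
This matches the usage line quoted in post #2, which lists only `reads.fq [mates.fq]` after the index prefix, i.e. at most two read files per run. One way around it is to run bwa mem once per lane. The sketch below assumes the `L001_R1.fasta` naming from post #1 and derives each R2 name from its R1 name. The function name and the `.sam` output names are hypothetical, and the commands are printed rather than executed.

```shell
# Print one bwa mem command per lane, pairing each R1 file with its R2 mate.
print_per_lane_cmds() {
    idx_prefix=$1   # base name of the index files
    dir=$2          # directory holding the per-lane read files
    for r1 in "$dir"/L*_R1.fasta; do
        [ -e "$r1" ] || continue            # skip if the glob matched nothing
        r2=${r1%_R1.fasta}_R2.fasta         # L001_R1.fasta -> L001_R2.fasta
        lane=$(basename "$r1" _R1.fasta)    # L001
        printf 'bwa mem %s %s %s > %s.sam\n' "$idx_prefix" "$r1" "$r2" "$lane"
    done
}

print_per_lane_cmds Combined_genomes_index_bwa .
```

Once the printed commands look right, they can be run by piping the output into `sh`, provided none of the paths contain spaces.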


            • #7
              OK, can you tell me the exact command you used, and whether you want to align all your reads at the same time (i.e. L001+L002+...+L00n into one SAM file) or individually (i.e. each lane, such as L001, makes its own SAM file)?
