  • Merge .sai files from bwa?

    Hi everyone,

    I used bwa to align reads from multiple lanes against a genome, and I have a question.

    For example, suppose I have s_1_1_sequence.sai, s_1_2_sequence.sai, ... s_1_6_sequence.sai.

    Could I merge these .sai files into s_1_total_sequence.sai and then run bwa samse to convert it to a SAM file?

    If I can't do this, I have to convert each .sai file to SAM independently.

    In that case, how can I use samtools merge to combine the six SAM files into one total.sam?

    I'd like to know the command usage.

    Thanks!

  • #2
    Hi,

    I use these commands to convert the .sai files and then merge the results. There may be better or faster ways. You need Picard tools:

    bwa sampe -f name.sam /home/jesus/Documentos/trabajo/genoma/hg19 name_1.sai name_2.sai name_1.fq.gz name_2.fq.gz
    java -Xmx4g -Djava.io.tmpdir=../tmp -jar ~/picard-tools/SortSam.jar SO=coordinate I=name.sam O=name.bam VALIDATION_STRINGENCY=LENIENT CREATE_INDEX=true
    samtools sort name.bam name.sorted
    samtools index name.sorted.bam


    Run these commands for all the *.sai files, and once all the *.bam files are indexed you can merge them with:

    samtools merge final_name.bam name1.bam name2.bam ...


    I hope you can use this for your purpose,

    jesus



    • #3
      But will that work if you have paired-end reads?

      Or would it be better to concatenate the gzipped files first?

      Thanks.



      • #4
        yes, it works for paired-end reads,

        you can get more information on: http://bio-bwa.sourceforge.net/bwa.shtml

        You're welcome.



        • #5
          I know that bwa works on paired-ends. I just asked my question very badly. I have since found my answer, but I want to post it here in case someone else needs to find it (as well as ask more questions below).

          So our output for a single barcoded paired-end sample looks like
          CRC1_Nov2011_ACAGTG_L003_R1_001.fastq.gz
          CRC1_Nov2011_ACAGTG_L003_R1_002.fastq.gz
          CRC1_Nov2011_ACAGTG_L003_R1_003.fastq.gz
          CRC1_Nov2011_ACAGTG_L003_R1_004.fastq.gz
          CRC1_Nov2011_ACAGTG_L003_R1_005.fastq.gz
          CRC1_Nov2011_ACAGTG_L003_R1_006.fastq.gz
          CRC1_Nov2011_ACAGTG_L003_R2_001.fastq.gz
          CRC1_Nov2011_ACAGTG_L003_R2_002.fastq.gz
          CRC1_Nov2011_ACAGTG_L003_R2_003.fastq.gz
          CRC1_Nov2011_ACAGTG_L003_R2_004.fastq.gz
          CRC1_Nov2011_ACAGTG_L003_R2_005.fastq.gz
          CRC1_Nov2011_ACAGTG_L003_R2_006.fastq.gz

          I had been trying to figure out whether R1_001 and R2_001 contain matching read pairs or whether the pairs are dispersed unevenly across the multiple files. If they were dispersed, this per-file strategy wouldn't work for read pairs. However, at least from our core, R1_001 and R2_001 are the read pairs.
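One way to check this on your own data, rather than relying on what one core facility does, is to compare the read IDs of a matching R1/R2 chunk. The sketch below is only an illustration: the function names `read_ids` and `check_pairs` are made up here, and it assumes standard 4-line FASTQ records whose ID is the first whitespace-delimited token of the header (with an optional trailing /1 or /2, as in older Illumina output).

```shell
#!/bin/sh
# Print the read ID of every record in a gzipped FASTQ file.
# Assumption: 4 lines per record; ID = first token of the header line,
# with a trailing /1 or /2 stripped if present.
read_ids() {
    gzip -dc "$1" | awk 'NR % 4 == 1 { id = $1; sub(/\/[12]$/, "", id); print id }'
}

# Compare the ID streams of two files via a checksum, so nothing large
# is held in memory. Prints "paired" (exit 0) when both files list the
# same IDs in the same order, "MISMATCH" (exit 1) otherwise.
check_pairs() {
    [ -r "$1" ] && [ -r "$2" ] || { echo "cannot read input" >&2; return 2; }
    sum1=$(read_ids "$1" | cksum)
    sum2=$(read_ids "$2" | cksum)
    if [ "$sum1" = "$sum2" ]; then
        echo "paired"
    else
        echo "MISMATCH"
        return 1
    fi
}

# Example call (file names taken from the listing above):
# check_pairs CRC1_Nov2011_ACAGTG_L003_R1_001.fastq.gz CRC1_Nov2011_ACAGTG_L003_R2_001.fastq.gz
```

A checksum match is a quick sanity check, not a proof; for a stricter test the two ID streams could be written to files and compared with `cmp`.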

          I also have several more questions for you, Jesus (or whoever wants to answer):

          1. Why do you use Picard SortSam to make a sorted bam with an index, but then use Samtools to sort and index again? Am I misunderstanding the steps?

          2. Is the output of samtools merge sorted? I know the input has to be, but it's not clear to me if the output is.

          3. Is samtools merge better than Picard's MergeSamFiles? It looks like the latter does not require sorted input, would sort the output and can write BAM output along with an index, thereby removing a number of steps. I can't tell if it needs an indexed file going in, though.

          Thank you all for being so patient with a newbie.



          • #6
            you could either combine the fastq files before alignment, or the sam/bam files after alignment. i don't think you can combine sai files.
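For the first option, gzipped files can be concatenated directly with `cat`; the result is still a valid gzip stream, so there is no need to decompress first. A minimal sketch, assuming the chunk files follow the `_001` to `_006` naming shown earlier in the thread (`concat_chunks` and the output names are made up for illustration):

```shell
#!/bin/sh
# concat_chunks PREFIX OUT
# Concatenate PREFIX_001.fastq.gz, PREFIX_002.fastq.gz, ... into OUT.
# The shell expands the glob in sorted order, so R1 and R2 chunks end
# up in the same order as long as both sets are complete.
concat_chunks() {
    prefix=$1
    out=$2
    cat "${prefix}"_[0-9][0-9][0-9].fastq.gz > "$out"
}

# Example calls (output names are placeholders):
# concat_chunks CRC1_Nov2011_ACAGTG_L003_R1 all_R1.fastq.gz
# concat_chunks CRC1_Nov2011_ACAGTG_L003_R2 all_R2.fastq.gz
```

Both mates must be concatenated in the same chunk order, otherwise the read pairs no longer line up between the two combined files.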

            R1 is the first mate and R2 is the second mate of the paired-end data. the two files should have the same number of lines (try wc -l on the uncompressed files)
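Since the files in this thread are gzipped, `wc -l` has to run on the decompressed stream rather than on the `.gz` file itself. A small sketch of that check, assuming 4-line FASTQ records (`count_reads` and `same_read_count` are illustrative names, not part of any tool):

```shell
#!/bin/sh
# Print the number of reads in a gzipped FASTQ file
# (assumes exactly 4 lines per record).
count_reads() {
    lines=$(gzip -dc "$1" | wc -l)
    echo $(( lines / 4 ))
}

# Succeed (exit 0) only if both files hold the same number of reads.
same_read_count() {
    [ "$(count_reads "$1")" -eq "$(count_reads "$2")" ]
}

# Example call:
# same_read_count R1_001.fastq.gz R2_001.fastq.gz && echo "counts match"
```

Equal counts are necessary but not sufficient: they show that no mate is missing, not that the mates appear in the same order.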

            samtools merge takes sorted bam as input and also generates a sorted bam. otherwise use samtools cat.

            i think picard MergeSamFiles adds read group tags storing which input file each read came from. if you don't need to keep this information use samtools merge.

            i suggest proceeding in the following way:
            bwa aln ...
            bwa sampe ... | samtools view -Sbu - | samtools sort - alignment_00X
            samtools index alignment_00X.bam (not really necessary)
            samtools merge ...
            samtools index ...
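The outline above could be expanded for the six chunk pairs from post #5 roughly as follows. This is a sketch that only prints the commands so they can be reviewed first. `REF`, `SAMPLE`, `N_CHUNKS`, and the output file names are placeholders, and the `samtools sort - prefix` form is the legacy 0.1.x syntax used in this thread; samtools 1.x expects `samtools sort -o out.bam -` instead.

```shell
#!/bin/sh
# Print (do not run) the per-chunk alignment commands and the final merge.
# All names below are placeholders for illustration.
REF=hg19                          # bwa index prefix (hypothetical)
SAMPLE=CRC1_Nov2011_ACAGTG_L003   # common file name prefix from post #5
N_CHUNKS=6

emit_commands() {
    bams=""
    i=1
    while [ "$i" -le "$N_CHUNKS" ]; do
        n=$(printf '%03d' "$i")
        r1="${SAMPLE}_R1_${n}.fastq.gz"
        r2="${SAMPLE}_R2_${n}.fastq.gz"
        # One bwa aln per mate file, then sampe piped into a sorted BAM.
        echo "bwa aln $REF $r1 > R1_${n}.sai"
        echo "bwa aln $REF $r2 > R2_${n}.sai"
        echo "bwa sampe $REF R1_${n}.sai R2_${n}.sai $r1 $r2 | samtools view -Sbu - | samtools sort - alignment_${n}"
        bams="$bams alignment_${n}.bam"
        i=$((i + 1))
    done
    # Merge the sorted per-chunk BAMs and index the result.
    echo "samtools merge merged.bam$bams"
    echo "samtools index merged.bam"
}

emit_commands
```

Once the printed commands look right, the output can be piped into `sh` to actually run them.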
