  • different number of reads fastq file - bam file

    Hello,

    I am working with PE 150bp Illumina HiSeq X-ten sequencing data.
    I counted the reads in my trimmed, paired forward + reverse fastq files in three ways: with FastQC, by counting lines with zcat, and from the Trimmomatic results. The three counts agree.

    I then mapped the forward + reverse paired reads to the reference genome with bwa-mem and ran bamtools on the output to get the total number of reads and the number of mapped reads. Both numbers are higher than the number of reads in the fastq files (reverse + forward).

    Could this be because a mapped read can have more than one alignment, so that the reported total is not the number of input reads but the number of alignments of mapped reads (possibly several per read) plus the number of unmapped reads?

    Thanks,

    Marie
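For reference, the line-count check described above can be sketched in Python. This is a minimal sketch that assumes standard four-line FASTQ records in gzip-compressed files; the function name `count_fastq_reads` and the file names in the usage comment are hypothetical.

```python
import gzip


def count_fastq_reads(path):
    """Count records in a gzipped FASTQ file, assuming 4 lines per record."""
    with gzip.open(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    if n_lines % 4 != 0:
        # A remainder suggests a truncated file or multi-line records.
        raise ValueError(f"{path}: {n_lines} lines is not a multiple of 4")
    return n_lines // 4


# Hypothetical usage: the input read count is the sum over both mates.
# total = count_fastq_reads("sample_R1.fastq.gz") + count_fastq_reads("sample_R2.fastq.gz")
```

The sum over the forward and reverse files is the number to compare against the count of primary alignments in the BAM file.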

  • #2
    It depends on the options you used to run bwa-mem. By default, a multi-mapping read is reported once, at an arbitrarily chosen hit, whereas a chimeric (split) read is reported as multiple alignments. The additional alignments of a chimeric read carry the bit flag 0x800 (2048, supplementary alignment). You can list them with the SAMtools command 'samtools view -f 2048 aligned.bam'.
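To see how the alignment records split up by flag, here is a minimal Python sketch that tallies SAM text lines, such as the output of 'samtools view aligned.bam'. The bit values come from the SAM specification (0x4 unmapped, 0x100 secondary, 0x800 supplementary); the function name `summarize_sam` is hypothetical, and the sketch assumes tab-separated SAM lines with the flag in the second column.

```python
from collections import Counter

UNMAPPED = 0x4         # read itself is unmapped
SECONDARY = 0x100      # 256: secondary alignment
SUPPLEMENTARY = 0x800  # 2048: supplementary (split/chimeric) alignment


def summarize_sam(lines):
    """Tally SAM records as primary, secondary, or supplementary."""
    counts = Counter()
    for line in lines:
        if line.startswith("@"):
            continue  # skip header lines
        flag = int(line.rstrip("\n").split("\t")[1])
        counts["records"] += 1
        if flag & SECONDARY:
            counts["secondary"] += 1
        elif flag & SUPPLEMENTARY:
            counts["supplementary"] += 1
        else:
            # One primary record per input read, mapped or not.
            counts["primary"] += 1
            if not flag & UNMAPPED:
                counts["primary_mapped"] += 1
    return counts
```

If the tool reports every record as a "read", its total corresponds to `records` here, while `primary` should match the number of reads in the two fastq files combined.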

    • #3
      Thanks for your answer.
      I used the default options in bwa mem. I ran the samtools command and could not find any alignments with the 0x800 flag.
      However, I read the following in the BWA manual: "The BWA-MEM algorithm performs local alignment. It may produce multiple primary alignments for different part of a query sequence. This is a crucial feature for long sequences. However, some tools such as Picard’s markDuplicates does not work with split alignments. One may consider to use option -M to flag shorter split hits as secondary."

      Could this explain the higher number of reads after mapping?

      Thanks,

      Marie

      • #4
        I am analysing WGBS data (Illumina HiSeq) from bovine fat tissue for differential methylation. I used TrimGalore for adapter removal and a quality check on the pilot sample. All adapters were removed and the quality was good (98% of paired-end reads survived with both mates). I then used Bismark for unique alignment to the bisulfite-converted reference genome and got a mapping efficiency of 57.8%.
        Is a 57.8% mapping efficiency good enough to proceed? Is there a gold standard for mapping efficiency in WGBS? Please advise.
        Thanks,
        Naveed.
