Unconfigured Ad

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts
  • kostask
    Junior Member
    • Sep 2015
    • 8

    extract_only_concordant_paired_reads_from_bam_file

    Hi everyone,

    I am trying to extract only concordant paired reads from my bam file using a command that I found in posts with similar subject:







    However the command that I use:

    samtools view -b -f 0x2 accepted_hits.bam -o accepted_hits_conc.bam

    results to the production of the accepted_hits_conc.bam file, which is much smaller than expected.

    Specifically the original accepted_hits.bam file has size 1.4 Gb while the accepted_hits_conc.bam has size only 141 Mb.

    Also by running samtools I get:

    samtools view -c accepted_hits.bam
    34745648

    samtools view -c accepted_hits_conc.bam
    3188162

    Now what troubles me is that the align_summary.txt returned by tophat2 reports:

    Left reads:
    Input : 18363154
    Mapped : 17461735 (95.1% of input)
    of these: 1424801 ( 8.2%) have multiple alignments (1497233 have >1)
    Right reads:
    Input : 18363154
    Mapped : 17283913 (94.1% of input)
    of these: 1424801 ( 8.2%) have multiple alignments (1485742 have >1)
    94.6% overall read mapping rate.

    Aligned pairs: 16585377
    of these: 1424801 ( 8.6%) have multiple alignments
    3788640 (22.8%) are discordant alignments
    69.7% concordant pair alignment rate.

    If there are 16585377 aligned pairs from an input of 18363154 paired end reads and a 69.7% concordant pair alignment rate, why do I get so small output from samtools?

    Also I saw in other posts that using this approach is preferable to using the --no-discordant option of tophat2. But why is that?

    Shouldn't I get exactly the concordant paired reads as output, if I specify the --no-discordant and --no-mixed options in tophat2?
  • westerman
    Rick Westerman
    • Jun 2008
    • 1104

    #2
    Answering your first question about samtools. One thing you could try is to see exactly what flags you have. Try
    Code:
    samtools view accepted_hits.bam | cut -f 2 | sort | uniq -c
    That will, at least, tell you what you have in the file.

    Comment

    • kostask
      Junior Member
      • Sep 2015
      • 8

      #3
      Thank you westerman

      I saw exactly the flags and the corresponding reads for each one.

      Wouid it be correct if I wanted to extract reads with certain flags (specifically 67, 131, 115 and 179) to do so with:

      samtools view -b -f number_of_flag accepted_hits.bam -o flag_accepted_hits.bam

      and then merge them together with:

      samtools merge output_accepted_hits.bam flag1_accepted_hits.bam ...

      It may be a silly question but with the samtools merge, the reads merge or they concatenate into a single bam file?

      Comment

      Latest Articles

      Collapse

      ad_right_rmr

      Collapse

      News

      Collapse

      Topics Statistics Last Post
      Started by SEQadmin2, 06-09-2026, 11:58 AM
      0 responses
      22 views
      0 reactions
      Last Post SEQadmin2  
      Started by SEQadmin2, 06-05-2026, 10:09 AM
      0 responses
      27 views
      0 reactions
      Last Post SEQadmin2  
      Started by SEQadmin2, 06-04-2026, 08:59 AM
      0 responses
      38 views
      0 reactions
      Last Post SEQadmin2  
      Started by SEQadmin2, 06-02-2026, 12:03 PM
      0 responses
      61 views
      0 reactions
      Last Post SEQadmin2  
      Working...