Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • queston for count of properly-paired end

    I am using samtools and bamToBed to get the properly-paired read like this:

    samtools view -bf 0x2 read.bam | bamToBed -i stdin > A.bed

    then I use coverageBed to get counts for eahc gene in hg18 like this:

    coverageBed -a A.bed -b hg18.bed.sorted -s > gene_count.txt

    chr1 67051159 67113208 uc009waw.1 0 0 62049 0.0000000
    chr1 67051159 67129093 uc009wax.1 0 0 77934 0.0000000
    chr1 67051159 67163158 uc001dcx.1 3 108 111999 0.0009643
    chr1 67075873 67163158 uc001dcy.1 3 108 87285 0.0012373
    chr1 16762998 16791395 uc009vor.1 14 445 28397 0.0156707
    chr1 16762998 16812569 uc009vos.1 19 625 49571 0.0126082
    chr1 16765489 16786343 uc001ayz.1 12 373 20854 0.0178863
    chr1 16765489 16786343 uc009vot.1 12 373 20854 0.0178863
    chr1 33544953 33559286 uc001bxd.1 0 0 14333 0.0000000
    chr1 41748270 42157083 uc001cgz.2 281 7868 408813 0.0192460

    5th column is count. I have two questions:


    what is definition of properly-paired reads?

    When we calcualte count for each gene using properly-paired reads,
    whether we just count one time for each properly-paired reads?

    thanks,

    jlm

  • #2
    "Properly paired" means that, based on the size distribution and molecular strategy (i.e. paired-end v. mate-pair) of your library, the read-pair has aligned in the expected orientation (i.e. +/- for paired-end, -/+ for mate-pair) and the ends of the pair are an acceptable distance apart in the genome.

    I think if you are doing a comparative experiment, one could argue that it doesn't matter. However, if you want to just create one entry for each properly-paired read, you could use the -bedpe option (requires the BAM be sorted by name not position) to create one entry per pair. You could then cut out the leftmost position as the BED start and the rightmost position as the BED end.

    For example:

    Code:
    samtools view -bf 0x2 reads.<sortedByName>.bam | bamToBed -i stdin -bedpe | cut -f 1,2,6 > A.bed
    coverageBed -a A.bed -b hg18.bed.sorted -s > gene_count.txt
    Best,
    Aaron

    Comment


    • #3
      thanks,

      As you said, but I got this error:
      Error: malformed BED entry at line 41767. Coordinate detected that is < 0. Exiting.




      Originally posted by quinlana View Post
      "Properly paired" means that, based on the size distribution and molecular strategy (i.e. paired-end v. mate-pair) of your library, the read-pair has aligned in the expected orientation (i.e. +/- for paired-end, -/+ for mate-pair) and the ends of the pair are an acceptable distance apart in the genome.

      I think if you are doing a comparative experiment, one could argue that it doesn't matter. However, if you want to just create one entry for each properly-paired read, you could use the -bedpe option (requires the BAM be sorted by name not position) to create one entry per pair. You could then cut out the leftmost position as the BED start and the rightmost position as the BED end.

      For example:

      Code:
      samtools view -bf 0x2 reads.<sortedByName>.bam | bamToBed -i stdin -bedpe | cut -f 1,2,6 > A.bed
      coverageBed -a A.bed -b hg18.bed.sorted -s > gene_count.txt
      Best,
      Aaron

      Comment


      • #4
        Originally posted by jlfmssm View Post
        thanks,

        As you said, but I got this error:
        Error: malformed BED entry at line 41767. Coordinate detected that is < 0. Exiting.
        Have you looked at line 41767 in A.bed? Could you post it?
        Aaron

        Comment


        • #5
          Yes, here is A.bed:

          chr6_cox_hap1 2140627 2140761 IL26_1382:1:6:405:885 0 +
          chr8 74366566 74366678 IL26_1382:1:6:406:42 0 +
          chr11 10831430 10831500 IL26_1382:1:6:406:430 60 +
          chrM 161 263 IL26_1382:1:6:406:522 60 +
          chr3 99722880 99722972 IL26_1382:1:6:406:528 29 +
          chr6 74286405 74286482 IL26_1382:1:6:406:544 0 +
          chr1 27746538 27746648 IL26_1382:1:6:406:549 60 +
          . -1 81 IL26_1382:1:6:406:598 0 .chr8 144968164 144968266 IL26_1382:1:6:406:661 60 +
          chr14 105393057 105393150 IL26_1382:1:6:406:672 60 +



          Originally posted by quinlana View Post
          Have you looked at line 41767 in A.bed? Could you post it?
          Aaron

          Comment


          • #6
            [QUOTE=jlfmssm;21354]Yes, here is A.bed:

            chr6_cox_hap1 2140627 2140761 IL26_1382:1:6:405:885 0 +
            chr8 74366566 74366678 IL26_1382:1:6:406:42 0 +
            chr11 10831430 10831500 IL26_1382:1:6:406:430 60 +
            chrM 161 263 IL26_1382:1:6:406:522 60 +
            chr3 99722880 99722972 IL26_1382:1:6:406:528 29 +
            chr6 74286405 74286482 IL26_1382:1:6:406:544 0 +
            chr1 27746538 27746648 IL26_1382:1:6:406:549 60 +

            . -1 81 IL26_1382:1:6:406:598 0 .

            chr8 144968164 144968266 IL26_1382:1:6:406:661 60 +
            chr14 105393057 105393150 IL26_1382:1:6:406:672 60 +

            Comment


            • #7
              Originally posted by jlfmssm View Post
              Yes, here is A.bed:

              chr6_cox_hap1 2140627 2140761 IL26_1382:1:6:405:885 0 +
              chr8 74366566 74366678 IL26_1382:1:6:406:42 0 +
              chr11 10831430 10831500 IL26_1382:1:6:406:430 60 +
              chrM 161 263 IL26_1382:1:6:406:522 60 +
              chr3 99722880 99722972 IL26_1382:1:6:406:528 29 +
              chr6 74286405 74286482 IL26_1382:1:6:406:544 0 +
              chr1 27746538 27746648 IL26_1382:1:6:406:549 60 +
              . -1 81 IL26_1382:1:6:406:598 0 .chr8 144968164 144968266 IL26_1382:1:6:406:661 60 +
              chr14 105393057 105393150 IL26_1382:1:6:406:672 60 +
              This entry:

              Code:
              .       -1      81      IL26_1382:1:6:406:598   0       .
              suggests that you are feeding improperly paired alignments into bamToBed. When using the -bedpe option, when unmapped reads from a pair are encountered, it will report a "." for the chrom and strand. Are you sure you are only capturing properly-paired reads?

              Perhaps you could do the following and post the output?

              Code:
              samtools view -f 0x2 read.bam | grep "IL26_1382:1:6:406:598"

              Comment

              Latest Articles

              Collapse

              • seqadmin
                Current Approaches to Protein Sequencing
                by seqadmin


                Proteins are often described as the workhorses of the cell, and identifying their sequences is key to understanding their role in biological processes and disease. Currently, the most common technique used to determine protein sequences is mass spectrometry. While still a valuable tool, mass spectrometry faces several limitations and requires a highly experienced scientist familiar with the equipment to operate it. Additionally, other proteomic methods, like affinity assays, are constrained...
                04-04-2024, 04:25 PM
              • seqadmin
                Strategies for Sequencing Challenging Samples
                by seqadmin


                Despite advancements in sequencing platforms and related sample preparation technologies, certain sample types continue to present significant challenges that can compromise sequencing results. Pedro Echave, Senior Manager of the Global Business Segment at Revvity, explained that the success of a sequencing experiment ultimately depends on the amount and integrity of the nucleic acid template (RNA or DNA) obtained from a sample. “The better the quality of the nucleic acid isolated...
                03-22-2024, 06:39 AM

              ad_right_rmr

              Collapse

              News

              Collapse

              Topics Statistics Last Post
              Started by seqadmin, 04-11-2024, 12:08 PM
              0 responses
              31 views
              0 likes
              Last Post seqadmin  
              Started by seqadmin, 04-10-2024, 10:19 PM
              0 responses
              32 views
              0 likes
              Last Post seqadmin  
              Started by seqadmin, 04-10-2024, 09:21 AM
              0 responses
              28 views
              0 likes
              Last Post seqadmin  
              Started by seqadmin, 04-04-2024, 09:00 AM
              0 responses
              53 views
              0 likes
              Last Post seqadmin  
              Working...
              X