  • Breakpointer, Predicting Structural Variation from SE Illumina Reads + Reference

    Hello all,

    I am having some difficulty predicting where the breakpoints of a known large-scale inversion (~4-10 Mb, inferred from population genetics and QTL mapping) lie, using part of a reference genome and single-end 100 bp Illumina reads from a population sample. One complication is that the reference scaffold I am using is incomplete and may not contain both inversion breakpoints. My questions are:

    1) I am currently using the Breakpointer script (https://github.com/ruping/Breakpointer). Is there a better program for predicting structural variants from single-end reads?

    2) Does anyone have experience with the Breakpointer output file (https://github.com/ruping/Breakpointer)? I have 17,468 candidate regions, so which parts of the output are most informative for identifying inversion breakpoints (e.g. fewer reads mapping to a particular location, or split reads)? The first few lines of the output look like this:

    chrscaffold935|size76515 Breakpointer Depth-Skewed 63195 63245 56.233 + . ID=chrscaffold935|size76515:63195;SIZE=51;DEPTH=29;EndsRatio=0.662;StartsRatio=1;BinomialScore=3.004;MIS=7;realMIS=2;MISRATE=0.364621;seedseq=AAAAGTGTTACCTTTAAACCCCCCT;MismatchScore=0.0161;SU=13;rank_SB=62.810;rank_SM=49.657
    chrscaffold348|size845032 Breakpointer Depth-Skewed 126749 126805 26.985 + . ID=chrscaffold348|size845032:126749;SIZE=57;DEPTH=24;EndsRatio=0.577;StartsRatio=0.448;BinomialScore=1.903;MIS=5;realMIS=1;MISRATE=0.361063;seedseq=TCAGTTGATGGACGAAACCAATTTA;MismatchScore=0.00158;SU=9;rank_SB=35.475;rank_SM=18.494
    chrscaffold600|size417131 Breakpointer Depth-Skewed 146682 146753 77.008 + . ID=chrscaffold600|size417131:146682;SIZE=72;DEPTH=117;EndsRatio=0.68;StartsRatio=0.235;BinomialScore=4.775;MIS=20;realMIS=0;MISRATE=0.251383;seedseq=CCACTTGATTTTAGCGATTCTGCGG;MismatchScore=0.097;SU=24;rank_SB=82.567;rank_SM=71.449
    chrscaffold641|size112415 Breakpointer Depth-Skewed 10518 10545 49.758 + . ID=chrscaffold641|size112415:10518;SIZE=28;DEPTH=6;EndsRatio=1;StartsRatio=0.238;BinomialScore=9.119;MIS=2;realMIS=0;MISRATE=0.333333;seedseq=TGTTGTGACGTGTTGTTTCCGCGGC;MismatchScore=2e-09;SU=1;rank_SB=99.474;rank_SM=0.041
    chrscaffold150|size493303 Breakpointer Depth-Skewed 179029 179104 60.675 + . ID=chrscaffold150|size493303:179029;SIZE=76;DEPTH=595;EndsRatio=0.499;StartsRatio=0.471;BinomialScore=1.611;MIS=90;realMIS=3;MISRATE=0.303127;seedseq=TTATTGCTAATTTAAATAAGGTTTT;MismatchScore=2.67;SU=1;rank_SB=22.620;rank_SM=98.729
    chrscaffold218|size453784 Breakpointer Depth-Skewed 285154 285184 26.029 + . ID=chrscaffold218|size453784:285154;SIZE=31;DEPTH=40;EndsRatio=0.475;StartsRatio=0.476;BinomialScore=1.146;MIS=3;realMIS=0;MISRATE=0.157895;seedseq=ACCTTTAAAACTGTTTTTCTCTTAA;MismatchScore=0.0158;SU=13;rank_SB=2.611;rank_SM=49.447
    chrscaffold35|size722381 Breakpointer Depth-Skewed 659136 659214 63.288 + . ID=chrscaffold35|size722381:659136;SIZE=79;DEPTH=26;EndsRatio=0.736;StartsRatio=0.195;BinomialScore=2.9;MIS=9;realMIS=0;MISRATE=0.470318;seedseq=TACCTATACATTTCCTAGGATATGT;MismatchScore=0.0567;SU=4;rank_SB=61.264;rank_SM=65.311
    chrscaffold621|size428421 Breakpointer Depth-Skewed 119257 119308 24.223 + . ID=chrscaffold621|size428421:119257;SIZE=52;DEPTH=34;EndsRatio=0.487;StartsRatio=0.546;BinomialScore=1.381;MIS=5;realMIS=1;MISRATE=0.301969;seedseq=ATCGAAAAAGCTAAGGCTAAAAACC;MismatchScore=0.00676;SU=5;rank_SB=11.607;rank_SM=36.838
    chrscaffold48|size965936 Breakpointer Depth-Skewed 18878 18964 22.985 + . ID=chrscaffold48|size965936:18878;SIZE=87;DEPTH=29;EndsRatio=0.662;StartsRatio=0.429;BinomialScore=2.195;MIS=4;realMIS=1;MISRATE=0.208355;seedseq=CAATTATTTTGTAAATGTTTACACG;MismatchScore=2e-09;SU=4;rank_SB=45.925;rank_SM=0.046
    chrscaffold621|size428421 Breakpointer Depth-Skewed 143393 143427 70.720 + . ID=chrscaffold621|size428421:143393;SIZE=35;DEPTH=20;EndsRatio=0.852;StartsRatio=0.908;BinomialScore=5.829;MIS=3;realMIS=1;MISRATE=0.176056;seedseq=AATTTAATACAGGTACGACTGTACC;MismatchScore=0.0177;SU=24;rank_SB=90.552;rank_SM=50.887
    chrscaffold348|size845032 Breakpointer Depth-Skewed 24245 24294 78.876 + . ID=chrscaffold348|size845032:24245;SIZE=50;DEPTH=29;EndsRatio=0.818;StartsRatio=0.466;BinomialScore=5.002;MIS=14;realMIS=3;MISRATE=0.590169;seedseq=TCTATATTTTGGTGCAGTCCTGTTG;MismatchScore=0.113;SU=1;rank_SB=84.565;rank_SM=73.187
    chrscaffold50|size611115 Breakpointer Depth-Skewed 570718 570765 46.314 + . ID=chrscaffold50|size611115:570718;SIZE=48;DEPTH=9;EndsRatio=0.739;StartsRatio=0.805;BinomialScore=3.418;MIS=3;realMIS=0;MISRATE=0.45106;seedseq=CCTAATCCTATGTCCTTCTCCTGGC;MismatchScore=0.00275;SU=5;rank_SB=68.357;rank_SM=24.271
    chrscaffold502|size248588 Breakpointer Depth-Skewed 100735 100761 54.273 + . ID=chrscaffold502|size248588:100735;SIZE=27;DEPTH=8;EndsRatio=0.785;StartsRatio=1;BinomialScore=3.581;MIS=3;realMIS=1;MISRATE=0.477707;seedseq=TGGTTCTAGGCCCTAAATCGTTAAT;MismatchScore=0.00748;SU=4;rank_SB=70.241;rank_SM=38.306
    chrscaffold546|size598492 Breakpointer Depth-Skewed 475284 475375 30.852 + . ID=chrscaffold546|size598492:475284;SIZE=92;DEPTH=44;EndsRatio=0.549;StartsRatio=0.414;BinomialScore=1.776;MIS=6;realMIS=1;MISRATE=0.248385;seedseq=AAACATGTTTACATTATTATGGTAC;MismatchScore=0.00474;SU=5;rank_SB=29.983;rank_SM=31.720
    chrscaffold15|size673834 Breakpointer Depth-Skewed 313308 313396 73.223 + . ID=chrscaffold15|size673834:313308;SIZE=89;DEPTH=112;EndsRatio=0.688;StartsRatio=0.275;BinomialScore=4.532;MIS=11;realMIS=2;MISRATE=0.142753;seedseq=GCCTTAATCCACGCGAATTCGATGG;MismatchScore=0.0603;SU=19;rank_SB=80.435;rank_SM=66.011
    chrscaffold621|size428421 Breakpointer Depth-Skewed 380187 380235 36.980 + . ID=chrscaffold621|size428421:380187;SIZE=49;DEPTH=64;EndsRatio=0.5;StartsRatio=0.574;BinomialScore=1.609;MIS=3;realMIS=0;MISRATE=0.09375;seedseq=AATGGTTTAATGCCCGTTTTCACCA;MismatchScore=0.0185;SU=3;rank_SB=22.510;rank_SM=51.450
    chrscaffold189|size410278 Breakpointer Depth-Skewed 264356 264391 83.797 + . ID=chrscaffold189|size410278:264356;SIZE=36;DEPTH=52;EndsRatio=0.666;StartsRatio=0.958;BinomialScore=3.636;MIS=22;realMIS=5;MISRATE=0.635251;seedseq=TGTAAGACTAGCGGCCGCCCGCGAC;MismatchScore=1.5;SU=14;rank_SB=70.978;rank_SM=96.616
    chrscaffold155|size350779 Breakpointer Depth-Skewed 191704 191757 11.213 + . ID=chrscaffold155|size350779:191704;SIZE=54;DEPTH=18;EndsRatio=0.533;StartsRatio=0.433;BinomialScore=1.285;MIS=2;realMIS=0;MISRATE=0.208464;seedseq=GCGTAAGTCCGTTGATTGGGATCAT;MismatchScore=0.00115;SU=3;rank_SB=7.363;rank_SM=15.064
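
With this many candidates, it may help to load the records into a structure that can be sorted or filtered on any attribute. Below is a minimal Python sketch, assuming each record has nine whitespace-separated columns laid out as in the lines above and that the last column holds `key=value` pairs separated by `;`. The function names (`parse_record`, `top_candidates`), the choice of `SU` as the sort key, and the `min_depth` threshold are hypothetical and only illustrate the mechanics; they are not a recommendation about which fields actually indicate a breakpoint.

```python
from typing import Dict, Iterable, List, Union

Value = Union[int, float, str]

# Two records copied from the output above, used only as example input.
SAMPLE = [
    "chrscaffold935|size76515 Breakpointer Depth-Skewed 63195 63245 56.233 + . "
    "ID=chrscaffold935|size76515:63195;SIZE=51;DEPTH=29;EndsRatio=0.662;"
    "StartsRatio=1;BinomialScore=3.004;MIS=7;realMIS=2;MISRATE=0.364621;"
    "seedseq=AAAAGTGTTACCTTTAAACCCCCCT;MismatchScore=0.0161;SU=13;"
    "rank_SB=62.810;rank_SM=49.657",
    "chrscaffold641|size112415 Breakpointer Depth-Skewed 10518 10545 49.758 + . "
    "ID=chrscaffold641|size112415:10518;SIZE=28;DEPTH=6;EndsRatio=1;"
    "StartsRatio=0.238;BinomialScore=9.119;MIS=2;realMIS=0;MISRATE=0.333333;"
    "seedseq=TGTTGTGACGTGTTGTTTCCGCGGC;MismatchScore=2e-09;SU=1;"
    "rank_SB=99.474;rank_SM=0.041",
]

# Attributes that should stay as text even if they happen to look numeric.
TEXT_KEYS = {"ID", "seedseq"}


def _convert(text: str) -> Value:
    """Return text as int if possible, else float, else the unchanged string."""
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text


def parse_record(line: str) -> Dict[str, Value]:
    """Split one record into its fixed columns plus the key=value attributes."""
    fields = line.split()
    if len(fields) != 9:
        raise ValueError(f"expected 9 columns, got {len(fields)}")
    seqid, source, kind, start, end, score, strand, _phase, attrs = fields
    record: Dict[str, Value] = {
        "seqid": seqid,
        "source": source,
        "type": kind,
        "start": int(start),
        "end": int(end),
        "score": float(score),
        "strand": strand,
    }
    for pair in attrs.split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            record[key] = value if key in TEXT_KEYS else _convert(value)
    return record


def top_candidates(lines: Iterable[str], key: str = "SU",
                   min_depth: int = 10, n: int = 5) -> List[Dict[str, Value]]:
    """Keep records with DEPTH >= min_depth and return the n largest by key.

    Both the key and the threshold are illustrative assumptions.
    """
    records = [parse_record(l) for l in lines
               if l.strip() and not l.startswith("#")]
    kept = [r for r in records if r.get("DEPTH", 0) >= min_depth]
    return sorted(kept, key=lambda r: r.get(key, 0), reverse=True)[:n]


if __name__ == "__main__":
    for rec in top_candidates(SAMPLE):
        print(rec["seqid"], rec["start"], rec["end"], rec["SU"], rec["DEPTH"])
```

To run it on the full file, pass the open file handle instead of `SAMPLE`, for example `top_candidates(open("breakpointer_output.gff"))`, where the file name is a placeholder. Changing `key` to another attribute such as `BinomialScore` or `rank_SM` re-sorts the candidates without any other changes.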

    Any help would be greatly appreciated!
