#1 |
Junior Member
Location: St. Louis Join Date: Jun 2011
Posts: 6
Hello All,
My lab has two runs of mouse genome sequence from two genomes. Both runs are paired-end, one with 36 bp reads and one with 100 bp reads. Unfortunately the people from the sequencing core who actually ran the sequences are unresponsive, and the people from my lab who coordinated with them are long gone. Now I am trying to pick up the sequencing project. So far there are SNPs and indels from samtools, and SVs from GASV and BreakDancer. BreakDancer and samtools were run by others, so I am not sure which parameters were used. I have several issues that I am looking for help on:
(1) I checked the deletions called by BreakDancer and GASV using the samtools sequence viewer, and found that reads often map inside the called deletions, but with lower mapping quality. Does anyone have any sanity check suggestions for working with SVs from GASV and BreakDancer?
(2) I suspect that some of the deletions are due to transposable element insertions in the reference and vice versa. I would like to find the transposable element insertions, but don't know of any tools out there for doing this. Do you guys know of any? If not, does anyone have a suggestion for how to pull out of the BAM files only the paired reads with one end mapped? This last part is what I am struggling with, because I don't even know how the BAM files were made and what was included. Also, I heard that unmapped reads sometimes get the same coordinate as the mate. Would this hurt my situation, or is there a flag that I could use?
These are just two of my most pressing troubles, but please let me know if you have any suggestions. Thank you in advance for any help.
Last edited by giror; 06-17-2011 at 09:44 AM.
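On the flag question, a minimal sketch that may help, assuming the BAM files carry standard SAM FLAG values. In the SAM format, bit 0x1 marks a paired read, 0x4 marks the read itself as unmapped, and 0x8 marks its mate as unmapped. An unmapped read with a mapped mate is conventionally given the mate's chromosome and position so the pair sorts together, so the coordinate alone cannot be trusted and the flag is what distinguishes the two. The function names below are hypothetical, and the code works on SAM text lines (for example the output of `samtools view`) rather than on the BAM directly.

```python
# SAM FLAG bits as defined in the SAM specification.
PAIRED = 0x1
READ_UNMAPPED = 0x4
MATE_UNMAPPED = 0x8


def is_one_end_anchored(flag):
    """Return True when the read is paired and exactly one end of the pair is mapped."""
    if not flag & PAIRED:
        return False
    read_unmapped = bool(flag & READ_UNMAPPED)
    mate_unmapped = bool(flag & MATE_UNMAPPED)
    # Exactly one of the two ends is unmapped.
    return read_unmapped != mate_unmapped


def one_end_anchored(sam_lines):
    """Yield SAM records (header lines skipped) that belong to one-end-anchored pairs."""
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete SAM record
        if is_one_end_anchored(int(fields[1])):
            yield line
```

Because `one_end_anchored` is a generator, it can be fed a stream such as `sys.stdin` one line at a time, so memory use should stay small even for large files. If the aligner never wrote unmapped reads into the BAM, nothing will come out, which would itself say something about how the files were made.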
#2 |
Senior Member
Location: Bethesda MD Join Date: Oct 2009
Posts: 509
Transposon mapping with paired-end reads is straightforward.
1) Create a reference file that contains the sequence of each transposon.
2) Align read one and read two separately to the transposon reference.
3) Align read one and read two separately to the genome reference, using repeat masking (so you won't align to transposons).
4) Filter the read one genome alignments with the read two transposon alignments, using the unique read identifier.
5) Repeat with read two genome and read one transposon alignments.
There are more sophisticated strategies, but this works relatively well given adequate read depth.
-Harold
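Steps 4 and 5 above could be sketched roughly as follows. This is only an illustration, not the poster's actual pipeline: it assumes each alignment run produced SAM text, that mates share a read name apart from an optional `/1` or `/2` suffix, and all function names are hypothetical.

```python
def strip_mate_suffix(name):
    """Drop a trailing /1 or /2 so both mates share one identifier."""
    if name.endswith("/1") or name.endswith("/2"):
        return name[:-2]
    return name


def mapped_read_names(sam_lines):
    """Collect identifiers of reads that aligned (FLAG bit 0x4 not set)."""
    names = set()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.split("\t")
        if len(fields) < 2:
            continue
        if not int(fields[1]) & 0x4:
            names.add(strip_mate_suffix(fields[0]))
    return names


def genome_anchors(genome_sam_lines, transposon_hits):
    """Yield (read id, chromosome, position) for genome-aligned reads
    whose mate is in the set of transposon-aligned read ids."""
    for line in genome_sam_lines:
        if line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue
        if int(fields[1]) & 0x4:
            continue  # read did not align to the genome
        name = strip_mate_suffix(fields[0])
        if name in transposon_hits:
            yield name, fields[2], int(fields[3])
```

For step 4, `transposon_hits` would be built from the read two transposon alignments and `genome_sam_lines` would be the read one genome alignments; step 5 swaps the roles. Only the set of transposon-hit identifiers is held in memory, and the genome alignments are streamed.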
#3 |
Junior Member
Location: St. Louis Join Date: Jun 2011
Posts: 6
This is generally the strategy I imagined. Unfortunately I am on a Mac with 8 GB of RAM and a 1 TB hard drive, and I am not sure I could efficiently read through the entire BAM files, which are 51 and 71 GB. The reads have already been mapped back to the genome, but I'm not sure of the parameters that were used. Do you know of a way I could get this information from the BAM?
If not, could you recommend an alignment program given the hardware constraints that I am under? Last edited by giror; 06-17-2011 at 11:36 AM. |
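One place the alignment parameters may be recorded is the BAM header. The SAM format allows `@PG` header lines that carry the program name (`PN`), version (`VN`), and command line (`CL`), and `samtools view -H` prints just the header without reading the whole file. Whether these lines are present depends on the aligner and on any later processing, so this is a sketch under that assumption; the function name is hypothetical.

```python
def parse_pg_lines(header_lines):
    """Return one dict of TAG -> value for every @PG line in a SAM header."""
    programs = []
    for line in header_lines:
        if not line.startswith("@PG"):
            continue
        tags = {}
        for field in line.rstrip("\n").split("\t")[1:]:
            # Split on the first colon only; CL values may contain colons.
            key, sep, value = field.partition(":")
            if sep:
                tags[key] = value
        programs.append(tags)
    return programs
```

An empty result means the header has no `@PG` records, in which case the parameters would have to be inferred some other way.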
#4 |
Senior Member
Location: Bethesda MD Join Date: Oct 2009
Posts: 509
The approach I suggested would almost certainly require repeating the alignments. I don't know which aligner was used to generate your existing dataset, but the repeats were either masked (yielding no matches) or not (yielding multiple matches). Most aligners return only the unique matches, so either way the transposon reads would be missing.
Our aligners run on a server cluster, so I can't offer any software recommendations for your system. A cloud solution might be your best option.
Tags |
bed, samtools, structure variation, transposable element |