We have recently run into a problem with our RRBS libraries, and I wonder whether anyone has encountered something similar or can shed some light on it. The library preparation looks fine, except that only 30% of all sequences start with CGG/TGG. The DNA quality is good: it shows a high-molecular-weight band on both agarose and PAGE gels. We cut 500 ng of DNA using 50 U of MspI overnight at 37 °C. After ligation with methylated adapters and two rounds of BS conversion, the ligated DNA is amplified using Pfu Turbo Hotstart. I assume that one would expect a much greater proportion of reads to begin with CGG/TGG. Any suggestions on what we are missing?
-
@kshankar
It is not uncommon to find reads that don't start with CGG or TGG. However, having only 30% of reads start with CGG or TGG is a concern. In our experience, around 10-15% of reads start with a different triplet. This could be due to non-specific activity of the enzyme or to degraded or fragmented DNA at the starting step. I noticed that you use 50 U of MspI for just 500 ng, which may be too much for this reaction. We generally follow the Gu et al. 2011 protocol, which uses 10 U for 300 ng and works fine. We have also tested Fermentas FastDigest MspI (1 µl for 500 ng) for 2 h, and this also works fine for us. You may also want to refer to the recent paper by Altuna Akalin et al. 2012, which explains a similar kind of observation.
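For anyone who wants to check this figure on their own data, here is a minimal sketch that counts how many reads begin with CGG or TGG. It assumes a standard 4-line-per-record FASTQ file whose 5' ends have not been trimmed; the function name and the demo reads are made up for illustration.

```python
import io
from itertools import islice


def rrbs_start_fraction(fastq_handle, prefixes=("CGG", "TGG")):
    """Count reads whose sequence starts with one of `prefixes`.

    Assumes a 4-line FASTQ layout (header, sequence, '+', qualities).
    Returns (matching, total).
    """
    prefixes = tuple(prefixes)  # str.startswith needs a tuple, not a list
    total = 0
    matching = 0
    # The sequence is the 2nd line of every 4-line record.
    for seq in islice(fastq_handle, 1, None, 4):
        total += 1
        if seq.strip().upper().startswith(prefixes):
            matching += 1
    return matching, total


# Tiny in-memory example with invented reads.
demo = io.StringIO(
    "@r1\nCGGTTA\n+\nIIIIII\n"
    "@r2\nTGGATA\n+\nIIIIII\n"
    "@r3\nAAGTTA\n+\nIIIIII\n"
    "@r4\nTGATTA\n+\nIIIIII\n"
)
matching, total = rrbs_start_fraction(demo)
print(f"{matching}/{total} reads ({100 * matching / total:.1f}%) start with CGG/TGG")
```

For a real file, the same function can be given an open handle, for example `open(path)` or `gzip.open(path, "rt")` for compressed reads.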
-
We have just released Bismark version 0.7.5. This version incorporates changes that were made to the latest release of Bowtie 2 (2.0.0-beta7):
- Trailing read ID segment numbers (e.g. /1, /2 or /3) are now removed internally for Bowtie 2 alignments in paired-end mode, as this might have caused no reads to align at all if the segment number was not 1 or 2. As of Bowtie 2 version 2.0.0-beta7 this behavior has been disabled for unpaired reads.
- The Bowtie 2 option -M is now deprecated (as of Bowtie 2 version 2.0.0-beta7). What used to be called -M mode is still the default mode, but adjusting the -M setting is deprecated. The options -D and -R should be used to adjust the effort expended to find valid alignments
- Changed the default seed mismatch parameter (controlled by -n) to 1 (down from 2). This increases alignment speed noticeably and typically produces very similar results for good quality read data
- Fixed a bug where the chromosomal sequence could not be extracted for very short genomic sequences for alignments with Bowtie 2
- The methylation extractor and the Bismark alignment output deduplication script now read both raw and gzipped (.gz) Bismark mapping files
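To illustrate the read ID clean-up mentioned in the first item, here is a rough sketch of stripping a trailing segment number such as /1, /2 or /3. This is not Bismark's actual code; the pattern and the function name are assumptions made purely for illustration.

```python
import re

# Assumed pattern: a slash followed by one digit at the very end of the read ID.
_SEGMENT_SUFFIX = re.compile(r"/\d$")


def strip_segment_number(read_id):
    """Return the read ID without a trailing /1, /2, /3, ... suffix."""
    return _SEGMENT_SUFFIX.sub("", read_id)


for read_id in ("read1/1", "read1/3", "read1"):
    print(read_id, "->", strip_segment_number(read_id))
```

IDs without such a suffix are returned unchanged, so the function is safe to apply to every read.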
Bismark is available for download from http://www.bioinformatics.babraham.a...jects/bismark/.
Last edited by fkrueger; 07-16-2012, 04:27 AM.
-
Running Bismark (Bowtie 2), I got the following SAM output:
read1 0 chrPt 61863 255 16M1I33M * 0 0 GGTTTTATAAATGGTATTTTTTTGATATTGTATTTGAAGTAGTTGTTAAA FFFFHHGHHJJJJJCHHIJJJJJJIJJJJJIJJJJJJJJHIIIJJJJJIJ NM:i:11 XX:Z:4CC1C3C4NC2CC5C11C10 XM:Z:....hh.h...z.....h..hx.....x...........x.......... XR:Z:CT XG:Z:CT
Running the Bismark methylation_extractor on this SAM file, I got the following results:
chrPt 61883 h and chrPt 61884 x
But the chromosome positions differ from what I expected:
chrPt 61882 h and chrPt 61883 x
61863 |||||||0||||5|||-|0||||5||||0||||5||||0||||5||||0|
61863 GGTTCCACAAACGGTA-CTTCCTGATACTGTATTTGAAGCAGTTGTTAAA reference
61863 GGTTTTATAAATGGTATTTTTTTGATATTGTATTTGAAGTAGTTGTTAAA read
61863 ....hh.h...z.....h..hx.....x...........x..........
Can the methylation_extractor not handle alignments that include indels?
-
Oddly, it appears that the answer is no, it can't. Looking at the copy of the methylation_extractor that I have, the CIGAR string isn't passed into, for example, the print_individual_C_methylation_states_paired_end_files function. So the address of the methylation calls is set to start+index, which will be off for reads containing indels. This is easy enough to fix by first parsing the CIGAR string and creating a nucleotide position array from it.
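As a rough illustration of that fix (a Python sketch, not the actual extractor code), the CIGAR string can be expanded into one reference position per read base, with inserted bases getting no position. The function names are made up for this example, and it only covers a forward-strand read like the one posted above.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def reference_positions(start, cigar):
    """Return one entry per read base: its 1-based reference position,
    or None for bases that have no reference position (I, S)."""
    positions = []
    ref_pos = start
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op in "M=X":      # consumes both read and reference
            positions.extend(range(ref_pos, ref_pos + length))
            ref_pos += length
        elif op in "IS":     # consumes the read only
            positions.extend([None] * length)
        elif op in "DN":     # consumes the reference only
            ref_pos += length
        # H and P consume neither read nor reference
    return positions


def methylation_calls(start, cigar, xm):
    """Pair every non-'.' call in the XM string with its reference position."""
    positions = reference_positions(start, cigar)
    if len(positions) != len(xm):
        raise ValueError("CIGAR and methylation call string lengths differ")
    return [(pos, call) for pos, call in zip(positions, xm)
            if call != "." and pos is not None]


# The read from the post above: start 61863, CIGAR 16M1I33M.
xm = "....hh.h...z.....h..hx.....x...........x.........."
calls = methylation_calls(61863, "16M1I33M", xm)
print([c for c in calls if c[0] in (61882, 61883)])
```

With the insertion accounted for, the two calls in question land on 61882 (h) and 61883 (x), which are the positions yuggoth expected, instead of the start+index values 61883 and 61884.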
@Felix: As part of the methylation extractor I wrote to deal with my non-standard data I included the ability to deal with this issue (at least in the circumstances present in my data). I'd be happy to send you some code if it's helpful (though I wrote it in C, so it's probably easier to just write it from scratch in perl).
-
This is indeed a shortcoming of the methylation extractor which I have simply not thought about... I'll try to fix this as soon as possible. In the meantime, could someone please send me a few lines of Bismark (Bowtie2) alignment output with and without indels via email (just 10 lines will do)? I could make something up myself but I am currently not at work so it would be a lot easier.
@Devon: I wouldn't mind taking a look at your code, but maybe you are right and it's quicker to just write it myself (it's too hot to go outside anyway...).
Thanks,
Felix
-
Dear Felix,
My colleague and I have been trying to run the bismark program and it has been driving us up the wall! Everything appears to proceed just fine up to reading in the fastq file (for example, it finds the two "preliminary" alignments) but then it seems to skip the actual "bismarking" and proceeds immediately to produce an empty results report:
Reading in the sequence file 010CA_KUEY_108_trim230.fastq
Processed 1000000 sequences so far
Processed 2000000 sequences so far
Processed 2115193 sequences in total
Successfully deleted the temporary file 010CA_KUEY_108_trim230.fastq_C_to_T.fastq
Final Alignment report
======================
Sequences analysed in total: 2115193
Number of alignments with a unique best hit from the different alignments: 0
Mapping efficiency: 0.0%
Sequences with no alignments under any condition: 2115193
Sequences did not map uniquely: 0
Sequences which were discarded because genomic sequence could not be extracted: 0
Number of sequences with unique best (first) alignment came from the bowtie output:
CT/CT: 0 ((converted) top strand)
CT/GA: 0 ((converted) bottom strand)
GA/CT: 0 (complementary to (converted) top strand)
GA/GA: 0 (complementary to (converted) bottom strand)
Number of alignments to (merely theoretical) complementary strands being rejected in total: 0
Final Cytosine Methylation Report
=================================
Total number of C's analysed: 0
Total methylated C's in CpG context: 0
Total methylated C's in CHG context: 0
Total methylated C's in CHH context: 0
Total C to T conversions in CpG context: 0
Total C to T conversions in CHG context: 0
Total C to T conversions in CHH context: 0
Can't determine percentage of methylated Cs in CpG context if value was 0
Can't determine percentage of methylated Cs in CHG context if value was 0
Can't determine percentage of methylated Cs in CHH context if value was 0
Not sure what we are doing wrong, but the following is our command line:
bismark_v0.7.5/bismark -n 1 -l 20 --bowtie2 --path_to_bowtie /usr/local/bin AGTC_CLIENTS/CHEN/Rattus 010CA_KUEY_108_trim230.fastq
I'd appreciate any insights you might have.
Thanks.
-
Originally posted by yuggoth:
Running Bismark (Bowtie 2), I got the following SAM output:
read1 0 chrPt 61863 255 16M1I33M * 0 0 GGTTTTATAAATGGTATTTTTTTGATATTGTATTTGAAGTAGTTGTTAAA FFFFHHGHHJJJJJCHHIJJJJJJIJJJJJIJJJJJJJJHIIIJJJJJIJ NM:i:11 XX:Z:4CC1C3C4NC2CC5C11C10 XM:Z:....hh.h...z.....h..hx.....x...........x.......... XR:Z:CT XG:Z:CT
Running the Bismark methylation_extractor on this SAM file, I got the following results:
chrPt 61883 h and chrPt 61884 x
But the chromosome positions differ from what I expected:
chrPt 61882 h and chrPt 61883 x
61863 |||||||0||||5|||-|0||||5||||0||||5||||0||||5||||0|
61863 GGTTCCACAAACGGTA-CTTCCTGATACTGTATTTGAAGCAGTTGTTAAA reference
61863 GGTTTTATAAATGGTATTTTTTTGATATTGTATTTGAAGTAGTTGTTAAA read
61863 ....hh.h...z.....h..hx.....x...........x..........
Can the methylation_extractor not handle alignments that include indels?
I have now spent quite some time adapting the methylation extractor to handle InDels correctly. I have done some testing here already where it seemed to work as expected but it is possible that I have missed something. Could you run the attached version on your file and see if it fixes your problems? If it does I'll release a new version as soon as possible.
Best,
Felix
-
We have just released a new version of Bismark (v0.7.6). This version mainly fixes the way in which SAM files (both single and paired-end) are handled in the methylation extractor, because reads containing insertions or deletions would result in slightly offset methylation calls. Reads containing InDels, which may be generated by Bismark using Bowtie 2, are now handled as intended. Bismark users employing Bowtie 2 for alignments are strongly encouraged to upgrade to this version.
We have also changed the way in which the methylation extractor identifies the read and genome conversion flags in SAM output. This might become relevant if the Bismark SAM mapping output was compressed/decompressed with CRAM or Goby at some point, since these tools may change the order of optional tags in a SAM entry. Thanks to Z. Zeno for pointing this out and contributing a patch.
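To show why tag order matters, here is a small sketch of looking up the read and genome conversion flags (XR and XG) by tag name rather than by column position. It is not the extractor's actual code; the function name is hypothetical and the example line is the read posted earlier in this thread.

```python
def optional_tags(sam_line):
    """Return the optional SAM fields (column 12 onwards) as {TAG: value}."""
    fields = sam_line.rstrip("\n").split("\t")
    tags = {}
    for field in fields[11:]:
        # Optional fields have the form TAG:TYPE:VALUE; the value may contain ':'.
        tag, _type, value = field.split(":", 2)
        tags[tag] = value
    return tags


mandatory = [
    "read1", "0", "chrPt", "61863", "255", "16M1I33M", "*", "0", "0",
    "GGTTTTATAAATGGTATTTTTTTGATATTGTATTTGAAGTAGTTGTTAAA",
    "FFFFHHGHHJJJJJCHHIJJJJJJIJJJJJIJJJJJJJJHIIIJJJJJIJ",
]
optional = [
    "NM:i:11",
    "XX:Z:4CC1C3C4NC2CC5C11C10",
    "XM:Z:....hh.h...z.....h..hx.....x...........x..........",
    "XR:Z:CT",
    "XG:Z:CT",
]

original = optional_tags("\t".join(mandatory + optional))
# Same record with the optional tags in reverse order, as a re-encoding tool might emit.
reordered = optional_tags("\t".join(mandatory + optional[::-1]))

print(original["XR"], original["XG"])
print(reordered["XR"], reordered["XG"])
```

Both lookups give the same flags regardless of where the tags sit in the line, which is the property needed once a file has been through a tool that reorders optional fields.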
Bismark is available from here: http://www.bioinformatics.babraham.a...jects/bismark/ (you might have to force a cache update with Shift + refresh).
-
error with Prinseq output file to bismark?
I was wondering if anyone has used prinseq to trim fastq files then tried to put those fastq files into bismark for alignment and had any issues. I generated some trimmed fastq files and when I try and put those into bismark I keep getting the error of "no such file or directory" for the fastq file. The name tab completes in my command line so I know it is there and the name is correct. The fastq files that I generated can be uploaded into fastqc just fine. Just wondering if it is a formatting issue that I need to adjust. Any suggestions as how I can move forward would be great.
-
temp directory for bismarktobedgraph script
I am using the bismarktobedgraph script from the website. Because it doesn't have a temp directory option I am running it from the directory that has the space. It seems that it is running out of memory still and my standard error file says "sort: write failed: /tmp/sortMpgIOA: No space left on device". I am not sure why it is still writing to this directory. Is there something in the script that needs to be adjusted?
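One thing that may be worth trying: the error comes from `sort`, and GNU sort takes the location of its temporary files from the TMPDIR environment variable (or its -T option), falling back to /tmp, no matter which directory the script is started from. Assuming the script calls the system sort without -T, launching it with TMPDIR pointing at a disk with enough space might help. Below is a minimal sketch of a wrapper that does this; the helper name is made up, and the demo command only echoes the variable back rather than running the real script.

```python
import os
import subprocess
import sys
import tempfile


def run_with_tmpdir(cmd, tmp_dir):
    """Run `cmd` with TMPDIR set to `tmp_dir`, leaving the rest of the environment as is."""
    env = dict(os.environ, TMPDIR=tmp_dir)
    return subprocess.run(cmd, env=env, check=True,
                          capture_output=True, text=True)


# Demo: a child process reports the TMPDIR it was given.
with tempfile.TemporaryDirectory() as scratch:
    result = run_with_tmpdir(
        [sys.executable, "-c", "import os; print(os.environ['TMPDIR'])"],
        scratch,
    )
    print(result.stdout.strip() == scratch)
```

In practice the command list would be replaced with the actual call to the script and its arguments, and `tmp_dir` with a directory on the large volume.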