Dear all,
I have a paired-end RRBS dataset from mouse, and I am a bit puzzled because the M-bias plots show weird peaks, especially on Read-2. I would like your opinion on whether I should consider a different approach to my RRBS analysis.
I have also attached a PNG of the bismark2report output, which might help you understand my problem in detail. I should also note that I have 12 libraries and they all show the same characteristics.
Questions
The percentage of non-CpG (CHG and CHH) methylated cytosines I observe is ~5-6%. As far as I understand, this can be interpreted as a bisulfite conversion efficiency of at least 94-95%.
[Question 1]: Is this a bad efficiency? Would you rather not proceed with the analysis of a library with this much non-CpG methylation?
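For reference, a minimal sketch of the arithmetic behind that estimate. It assumes every methylated call in CHG/CHH context is a conversion failure, which is why the result is only a lower bound; the function name and the counts are made up for illustration and are not taken from my reports.

```python
def conversion_efficiency(non_cpg_methylated: int, non_cpg_unmethylated: int) -> float:
    """Estimate bisulfite conversion efficiency (in percent) from non-CpG calls.

    Assumption: all methylated CHG/CHH calls are unconverted cytosines.
    Any genuine non-CpG methylation makes the true efficiency higher.
    """
    total = non_cpg_methylated + non_cpg_unmethylated
    if total == 0:
        raise ValueError("no non-CpG cytosine calls to estimate from")
    non_cpg_methylation_pct = 100.0 * non_cpg_methylated / total
    return 100.0 - non_cpg_methylation_pct


# Hypothetical counts corresponding to 5.5% non-CpG methylation
print(conversion_efficiency(55_000, 945_000))
```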
Regarding Read-1, the M-bias plot shows a fairly stable distribution of CpG methylation across all positions except the first 3 bases.
[Question 2]: However, there are some weird spikes in CHG (14 bp) and CHH (24, 34 bp) methylation. Why do you think these anomalies exist?
More interestingly, Read-2 has a big spike at the 10th bp for CpG methylation and a huge methylation increase at the 3' end, while still showing spikes at various positions for CHG and CHH methylation.
[Question 3]: Why is there a methylation increase at the 3' end of Read-2? Is it due to the end-repair reaction?
[Question 4]: Do you have an explanation for the methylation spike at the 10th bp of Read-2? Should I trim the reads until I get rid of the spike at the 10th position?
[Question 5]: More importantly, would you confidently use this RRBS dataset? Are there any steps, diagnostics, or considerations that you would recommend?
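To make "spike" less subjective, here is a rough sketch of how the positions could be flagged numerically instead of by eye. It assumes the per-position methylation percentages have already been read into a list (for example from the M-bias output); the function name, the 5-point threshold, and the example profile are all hypothetical, not values from my data.

```python
from statistics import median


def flag_mbias_spikes(pct_by_position, threshold=5.0):
    """Return 1-based read positions whose methylation percentage differs
    from the median of the profile by more than `threshold` points."""
    center = median(pct_by_position)
    return [
        position
        for position, pct in enumerate(pct_by_position, start=1)
        if abs(pct - center) > threshold
    ]


# Made-up Read-2 CpG profile: flat at 40%, a spike at position 10,
# and a rise over the last three positions at the 3' end.
profile = [40.0] * 9 + [70.0] + [40.0] * 20 + [50.0, 60.0, 75.0]
print(flag_mbias_spikes(profile))
```

Positions flagged this way at the read ends are the ones I would consider excluding at the extraction step (bismark_methylation_extractor has --ignore_r2 and --ignore_3prime_r2 for this) rather than by trimming the reads again.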
You can find detailed information below about the library and the pipeline I followed:
Library
Sequencing type: Paired-end RRBS (Reduced Representation Bisulfite Sequencing)
Sequencer: Illumina NextSeq 500
Organism: Mouse
Pipeline
1. Reads were trimmed using trim_galore with the "--rrbs" and "--paired" options.
2. Trimmed reads were mapped to the mouse genome with the bismark bisulfite mapper using default settings.
3. Methylation information for individual cytosines was extracted with bismark_methylation_extractor using default settings.
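The three steps above can be sketched as the following commands. This is only an illustration: all file and directory names are placeholders rather than my real paths, the paired-end mode of the extractor is given explicitly, and the optional ignore_r2 argument is a hypothetical addition that is not part of the pipeline I actually ran.

```python
import shlex


def build_pipeline(r1, r2, genome_dir, trimmed_r1, trimmed_r2, bam, ignore_r2=None):
    """Build the three command lines as argument lists (nothing is executed)."""
    # Step 1: adapter/quality trimming in RRBS mode on a read pair
    trim = ["trim_galore", "--rrbs", "--paired", r1, r2]
    # Step 2: bisulfite alignment of the trimmed pair with default settings
    align = ["bismark", "--genome", genome_dir, "-1", trimmed_r1, "-2", trimmed_r2]
    # Step 3: per-cytosine methylation extraction
    extract = ["bismark_methylation_extractor", "--paired-end"]
    if ignore_r2 is not None:
        # Optional: skip the first N bases of Read-2 when calling methylation
        extract += ["--ignore_r2", str(ignore_r2)]
    extract.append(bam)
    return [trim, align, extract]


# Placeholder file names, for illustration only
commands = build_pipeline(
    "sample_R1.fastq.gz", "sample_R2.fastq.gz", "mouse_genome/",
    "sample_R1_trimmed.fq.gz", "sample_R2_trimmed.fq.gz", "sample_pe.bam",
)
for cmd in commands:
    print(shlex.join(cmd))
```

Each list can be passed to subprocess.run(cmd, check=True) once the placeholder names are replaced with real ones.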
Thank you so much in advance for your help and time.