**bbm** · 03-07-2016, 02:20 PM

I'm trying to do the genome conversion with this line:

bismark_genome_preparation --bowtie2 --verbose /gpfs_share/hlibyar/hlibyar/genome_files/ --path_to_bowtie /usr/local/apps/bowtie2/

"...
Step II - Genome bisulfite conversions - completed

Bismark Genome Preparation - Step III: Launching the Bowtie 2 indexer
Please be aware that this process can - depending on genome size - take several hours!

Preparing indexing of CT converted genome in /gpfs_share/hlibyar/hlibyar/genome_files/Bisulfite_Genome/CT_conversion/
Parent process: Starting to index C->T converted genome with the following command:

/usr/local/apps/bowtie2/bowtie2-build -f genome_mfa.CT_conversion.fa BS_CT

Can't exec "/usr/local/apps/bowtie2/bowtie2-build": No such file or directory at /usr/local/apps/bismark/v0.14.5/bismark_genome_preparation line 163, <IN> line 3352281.
Preparing indexing of GA converted genome in /gpfs_share/hlibyar/hlibyar/genome_files/Bisulfite_Genome/GA_conversion/
Child process: Starting to index G->A converted genome with the following command:

/usr/local/apps/bowtie2/bowtie2-build -f genome_mfa.GA_conversion.fa BS_GA

(starting in 10 seconds)
Can't exec "/usr/local/apps/bowtie2/bowtie2-build": No such file or directory at /usr/local/apps/bismark/v0.14.5/bismark_genome_preparation line 178, <IN> line 3352281."

Is there something wrong with my Bowtie 2 path?
Thanks.

**fkrueger** · 03-07-2016, 02:27 PM

Hmm, when you type:

Code:

/usr/local/apps/bowtie2/bowtie2-build

on the command line, you should see the bowtie2 indexing options. Does that happen? If not, you need to supply the exact path to where the executable is...
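
The check suggested above can also be scripted. The following is a minimal sketch, not part of Bismark: the helper name `find_executable` is hypothetical, and the directory is simply the one passed to `--path_to_bowtie` in this thread. It only tests whether a file called `bowtie2-build` exists in that directory and is executable.

```python
import os


def find_executable(directory, name="bowtie2-build"):
    """Return the full path if `name` is an executable file in `directory`, else None."""
    candidate = os.path.join(directory, name)
    if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
        return candidate
    return None


if __name__ == "__main__":
    # Directory taken from the command in this thread; adjust it for your system.
    path = find_executable("/usr/local/apps/bowtie2/")
    print(path or "bowtie2-build not found here; pass the directory that actually contains it")
```

If this reports that nothing was found, the directory given to `--path_to_bowtie` is probably not the one that holds the executable, which would match the "No such file or directory" message in the log.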

**bbm** · 03-07-2016, 02:35 PM

I see, let me try

**chxu02** · 03-08-2016, 07:10 AM

Hi Felix,
I have a special requirement. I have added an index for each alignment to my BAM file as an extra column. Is it feasible to also print this information into the .txt file when I run bismark_methylation_extractor? Basically, I want to know which alignment each cytosine comes from, and to group the cytosines from the same alignment together.

Cheers,
Youyou

**fkrueger** · 03-08-2016, 07:16 AM

Hi Youyou. The methylation extractor output still has the read ID printed on each line, so at this stage it is still possible to tell which C came from which read. If you proceed to the bedGraph stage or beyond, this information will unfortunately be lost.
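
Grouping cytosines by read ID, as described above, could look roughly like this. It is a sketch under an assumption: that each record in the methylation extractor output is tab-separated with five columns (read ID, methylation state, chromosome, position, methylation call) and that the file starts with a one-line version header. The function name and the example records are made up for illustration.

```python
from collections import defaultdict


def group_calls_by_read(lines):
    """Map read ID -> list of (chromosome, position, call) tuples."""
    calls = defaultdict(list)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Skip the version header and anything that is not a 5-column record.
        if len(fields) != 5:
            continue
        read_id, _state, chrom, pos, call = fields
        calls[read_id].append((chrom, int(pos), call))
    return dict(calls)


# Made-up records in the assumed layout, for illustration only.
example = [
    "Bismark methylation extractor version v0.14.5\n",
    "read_1\t+\tchr1\t1001\tZ\n",
    "read_1\t-\tchr1\t1010\tz\n",
    "read_2\t+\tchr2\t500\tZ\n",
]
grouped = group_calls_by_read(example)
for read_id, read_calls in grouped.items():
    print(read_id, read_calls)
```

For a real file, the same function can be given an open file handle instead of the example list.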

**bbm** · 03-10-2016, 07:58 AM

Hi Felix,

I have four lanes of data for each biological sample. Should I combine them before trimming and running Bismark? Or is it better to do the trimming and the Bismark run on each lane individually? Thanks.

Regards,
BBM

**fkrueger** · 03-10-2016, 08:06 AM

Personally, I prefer merging them before mapping, because this lets you run the deduplication on a single sample and is simply more convenient overall. If time is of the essence, you could also align the lanes separately and merge them before the deduplication, but that is really a matter of taste. Note that if you have paired-end sequences and use samtools merge to merge the BAM files, you need to run samtools sort -n before trying to deduplicate, because the merged file is not guaranteed to keep mates together.
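
One way to merge lanes before mapping is to concatenate the per-lane FASTQ files. The sketch below assumes the lane files are gzip-compressed; concatenated gzip files are themselves valid gzip. The function name and file names are hypothetical. For paired-end data, the Read 1 and Read 2 files would need to be concatenated in the same lane order so that mates stay in sync.

```python
import shutil


def merge_lanes(lane_files, merged_path):
    """Concatenate gzipped FASTQ files byte-for-byte, in the order given."""
    with open(merged_path, "wb") as out:
        for lane in lane_files:
            with open(lane, "rb") as src:
                shutil.copyfileobj(src, out)


# Hypothetical usage for one sample with four lanes of Read 1 data:
# merge_lanes(
#     ["sample_L001_R1.fq.gz", "sample_L002_R1.fq.gz",
#      "sample_L003_R1.fq.gz", "sample_L004_R1.fq.gz"],
#     "sample_R1.fq.gz",
# )
```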

**akramdi** · 03-15-2016, 08:08 AM

Hi Felix,

I have recently started using Bismark. While going through the detailed log, I could not reproduce the percentages of methylated cytosines (before and after extraction) shown in the report. Here is my cytosine methylation report before extraction, to illustrate:

Final Cytosine Methylation Report
=================================
Total number of C's analysed: 928829076

Total methylated C's in CpG context: 43871230
Total methylated C's in CHG context: 16835845
Total methylated C's in CHH context: 25330412
Total methylated C's in Unknown context: 58

Total unmethylated C's in CpG context: 106252080
Total unmethylated C's in CHG context: 136818631
Total unmethylated C's in CHH context: 599720878
Total unmethylated C's in Unknown context: 408

C methylated in CpG context: 29.2%
C methylated in CHG context: 11.0%
C methylated in CHH context: 4.1%
C methylated in Unknown context (CN or CHN): 12.4%

For me, Total methylated C's in CpG context = (43871230/928829076)*100 = 4.72% (in the report, I see 29.2%)

What am I missing??

Cheers,

Amira

**fkrueger** · 03-15-2016, 08:13 AM

Hi Amira,

For C in CpG context you need to use only the Cs methylated and unmethylated in CpG context, and not all Cs found in total. So here it would be:

Total methylated C's in CpG context = (43871230/106252080)*100 = 29.2%.

Cheers, Felix

**akramdi** · 03-15-2016, 08:55 AM

Thank you for the quick reply!

Okay, so it's relative to the context. In your illustration, I think you meant:

Total methylated C's in CpG context = (43871230/ (106252080+ 43871230) )*100 = 29.2%.

Thanks again,
Amira

**fkrueger** · 03-15-2016, 08:57 AM

Thanks for spotting that, and yes, you are absolutely right: it needs to be methylated / (methylated + unmethylated) * 100.
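
As a worked check of the formula methylated / (methylated + unmethylated) * 100, the short sketch below applies it to the counts from the report posted earlier in this thread. The function name is made up for illustration; the numbers are copied from that report.

```python
def percent_methylated(methylated, unmethylated):
    """Percentage of methylated calls within a single sequence context."""
    total = methylated + unmethylated
    if total == 0:
        raise ValueError("no cytosines observed in this context")
    return 100.0 * methylated / total


# (methylated, unmethylated) counts per context, from the report above.
counts = {
    "CpG": (43871230, 106252080),
    "CHG": (16835845, 136818631),
    "CHH": (25330412, 599720878),
    "Unknown": (58, 408),
}

for context, (meth, unmeth) in counts.items():
    print(f"{context}: {percent_methylated(meth, unmeth):.1f}%")
# CpG: 29.2%
# CHG: 11.0%
# CHH: 4.1%
# Unknown: 12.4%
```

Each value matches the percentage printed in the report, which confirms that the denominator is the per-context total and not the total number of C's analysed.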

**hsiehph** · 03-15-2016, 11:19 PM

Originally posted by fkrueger:

Hi Dipro,

This was indeed a typo which will be fixed in the next release which is actually due out today or tomorrow (and will finally support parallel alignments – so stay tuned!).

A couple of things about the command you used:

bismark_methylation_extractor -s -o --samtools_path --bedGraph --counts --remove_spaces --buffer_size --cytosine_report --genome_folder

'Failed to read from file /path/to/file_fq.gz_bismark_bt2.bismark.cov: No such file or directory'
Sorry if it is a stupid question, but did you replace ‘/path/to/file’ with a valid path to the file on your system?

-s: not necessary (will be determined automatically)
-o /requires/path/to/output/folder
--samtools_path /requires/path/to/samtools/executable
--counts: not necessary (used by default)
--remove_spaces: only use this if really necessary, will otherwise cost time and temporary space
--buffer_size: requires input, e.g. 10G
--genome_folder /requires/path/to/genome/folder

input file is required

If you still struggle can you just send me the onscreen-text via email? This would make spotting mistakes in the command much easier. Cheers, Felix

I also have a similar error message:
gzip: output_folder/input.bismark.cov.gz: No such file or directory
No last chromosome was defined, something must have gone wrong while reading the data in (e.g. specified wrong file path for a gzipped coverage file?). Please check your command!

However, I can see that the input_folder.bismark.cov.gz file exists in my output_folder/.

The command I used is as follows:
bismark_methylation_extractor -p --no_overlap --bedGraph --counts --buffer_size 10G --cytosine_report --CX --split_by_chromosome -o output_folder/ --genome_folder genome_bowtie1/ --multicore 6 input_folder/input_folder.sam

Interestingly, when I remove -o output_folder/ (i.e. write to the current directory), the script finishes properly. Has anyone had a similar experience? I am using the latest version of Bismark. Thanks for your help.

**fkrueger** · 03-16-2016, 02:03 AM

Hi hsiehph,

I believe this problem has been fixed by now in this issue. You can get the latest development version of Bismark by cloning it from GitHub.

**hsiehph** · 03-16-2016, 05:46 PM

Thanks Felix. I will test it with the latest development version.

**johnstonL** · 04-06-2016, 07:46 AM

Hi Felix,

I am analyzing some paired-end, non-directional RRBS libraries with Bismark and have run into a few points of confusion. I begin by trimming the paired-end files with Trim Galore (using the --rrbs and --non_directional options, with everything else at its default) and then align with Bismark. What strikes me most in the alignment report is the strange ratio of mapped reads between OT, OB, CTOT and CTOB:

Final Alignment report
======================
Sequence pairs analysed in total: 12664432
Number of paired-end alignments with a unique best hit: 4529206
Mapping efficiency: 35.8%
Sequence pairs with no alignments under any condition: 5506764
Sequence pairs did not map uniquely: 2628462
Sequence pairs which were discarded because genomic sequence could not be extracted: 0

Number of sequence pairs with unique best (first) alignment came from the bowtie output:
CT/GA/CT: 608018 ((converted) top strand)
GA/CT/CT: 1626505 (complementary to (converted) top strand)
GA/CT/GA: 1673560 (complementary to (converted) bottom strand)
CT/GA/GA: 621123 ((converted) bottom strand)

Final Cytosine Methylation Report
=================================
Total number of C's analysed: 117899629

Total methylated C's in CpG context: 7232502
Total methylated C's in CHG context: 362290
Total methylated C's in CHH context: 1218612
Total methylated C's in Unknown context: 2

Total unmethylated C's in CpG context: 13074777
Total unmethylated C's in CHG context: 29946985
Total unmethylated C's in CHH context: 66064463
Total unmethylated C's in Unknown context: 3

C methylated in CpG context: 35.6%
C methylated in CHG context: 1.2%
C methylated in CHH context: 1.8%
C methylated in unknown context (CN or CHN): 40.0%

Also, the library was given to me with each sample split into 2 pairs (sample1_pair1_forward, sample1_pair1_reverse, sample1_pair2_forward, sample1_pair2_reverse). If I wanted to run the methylation extractor on the samples, would it be a problem if I simply gave it the concatenated output of the alignment pairs?

i.e.
sample1_pair1_aligned + sample1_pair2_aligned > sample1_aligned
methylation extractor sample1_aligned ...other samples

Thanks
