
**zgene** · 02-15-2020, 06:05 AM

Hi rajitz,

I am trying to run GermlineCNVCaller and ran into a problem at the DetermineGermlineContigPloidy step.
Since you seem to have made it at least to the final step, I was wondering whether you could give me some advice on this one.

I used the following command in this step.

/data/NGS/Reanalysis-Package/gatk-4.1.4.0/gatk DetermineGermlineContigPloidy -L Filtered_annotated_preprocessed_intervals_Twist.interval_list --interval-merging-rule OVERLAPPING_ONLY -I /data/NGS/Reanalysis-Package/CNV/HDF5-200/S1071Nr10.counts.hdf5 -I /data/NGS/Reanalysis-Package/CNV/HDF5-200/S1071Nr11.counts.hdf5 -I /data/NGS/Reanalysis-Package/CNV/HDF5-200/S1071Nr12.counts.hdf5 -I /data/NGS/Reanalysis-Package/CNV/HDF5-200/S1071Nr13.counts.hdf5 (200 samples were given as input; the remaining -I arguments are omitted here to save space) --contig-ploidy-priors /data/NGS/Reanalysis-Package/CNV/Bed-Files/contig_ploidy_priors.tsv --output . --output-prefix ploidy --verbosity DEBUG --mapping-error-rate 0.01 --global-psi-scale 0.001 --sample-psi-scale 1.0E-4 --mean-bias-standard-deviation 0.01
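
With 200 samples, typing every -I argument by hand is error-prone. As a rough sketch only, the argument list could be assembled in Python. The helper name build_ploidy_command is made up for illustration, and it assumes that every *.counts.hdf5 file in the directory belongs to the cohort, which may not hold for your data. Only flags that appear in the command above are used.

```python
from pathlib import Path


def build_ploidy_command(gatk, counts_dir, intervals, priors,
                         output_dir=".", prefix="ploidy"):
    """Build the DetermineGermlineContigPloidy argument list (hypothetical helper).

    Assumption: every *.counts.hdf5 file in counts_dir is a cohort sample.
    """
    counts = sorted(Path(counts_dir).glob("*.counts.hdf5"))
    cmd = [
        str(gatk), "DetermineGermlineContigPloidy",
        "-L", str(intervals),
        "--interval-merging-rule", "OVERLAPPING_ONLY",
    ]
    # One -I argument per read-count file, in sorted order.
    for count_file in counts:
        cmd += ["-I", str(count_file)]
    cmd += [
        "--contig-ploidy-priors", str(priors),
        "--output", str(output_dir),
        "--output-prefix", prefix,
    ]
    return cmd
```

The returned list can be printed for inspection or passed to subprocess.run(cmd, check=True) to launch the tool.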

I installed the conda environment following https://gatk.broadinstitute.org/hc/e...de44460155fb6#

The command ran properly until I got the following error, which I do not understand and do not know how to fix:

16:54:47.473 DEBUG ScriptExecutor - --contig_ploidy_prior_table=/data/NGS/Reanalysis-Package/CNV/Bed-Files/contig_ploidy_priors.tsv
16:54:47.473 DEBUG ScriptExecutor - --output_model_path=/data/NGS/Reanalysis-Package/CNV/ploidy-model
/homefolder/zfatahi/miniconda3/envs/gatk/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
from ._conv import register_converters as _register_converters
Traceback (most recent call last):
File "/tmp/cohort_determine_ploidy_and_depth.1941148667013278511.py", line 79, in <module>
args.contig_ploidy_prior_table)
File "/homefolder/zfatahi/miniconda3/envs/gatk/lib/python3.6/site-packages/gcnvkernel/io/io_ploidy.py", line 182, in get_contig_ploidy_prior_map_from_tsv_file
delimiter=delimiter)
File "/homefolder/zfatahi/miniconda3/envs/gatk/lib/python3.6/site-packages/gcnvkernel/io/io_commons.py", line 50, in read_csv
input_pd = pd.read_csv(fh, delimiter=delimiter, dtype=dtypes_dict) # dtypes_dict keys may not be present
File "/homefolder/zfatahi/miniconda3/envs/gatk/lib/python3.6/site-packages/pandas/io/parsers.py", line 705, in parser_f
return _read(filepath_or_buffer, kwds)
File "/homefolder/zfatahi/miniconda3/envs/gatk/lib/python3.6/site-packages/pandas/io/parsers.py", line 451, in _read
data = parser.read(nrows)
File "/homefolder/zfatahi/miniconda3/envs/gatk/lib/python3.6/site-packages/pandas/io/parsers.py", line 1065, in read
ret = self._engine.read(nrows)
File "/homefolder/zfatahi/miniconda3/envs/gatk/lib/python3.6/site-packages/pandas/io/parsers.py", line 1828, in read
data = self._reader.read(nrows)
File "pandas/_libs/parsers.pyx", line 894, in pandas._libs.parsers.TextReader.read
File "pandas/_libs/parsers.pyx", line 916, in pandas._libs.parsers.TextReader._read_low_memory
File "pandas/_libs/parsers.pyx", line 970, in pandas._libs.parsers.TextReader._read_rows
File "pandas/_libs/parsers.pyx", line 957, in pandas._libs.parsers.TextReader._tokenize_rows
File "pandas/_libs/parsers.pyx", line 2200, in pandas._libs.parsers.raise_parser_error
pandas.errors.ParserError: Error tokenizing data. C error: Expected 5 fields in line 58, saw 7

16:54:55.812 DEBUG ScriptExecutor - Result: 1
16:54:55.813 INFO DetermineGermlineContigPloidy - Shutting down engine
[February 3, 2020 4:54:55 PM IRST] org.broadinstitute.hellbender.tools.copynumber.DetermineGermlineContigPloidy done. Elapsed time: 0.78 minutes.
Runtime.totalMemory()=3370123264
org.broadinstitute.hellbender.utils.python.PythonScriptExecutorException:
python exited with 1
Command Line: python /tmp/cohort_determine_ploidy_and_depth.1941148667013278511.py --sample_coverage_metadata=/tmp/samples-by-coverage-per-contig3314282489028474630.tsv --output_calls_path=/data/NGS/Reanalysis-Package/CNV/ploidy-calls --mapping_error_rate=1.000000e-02 --psi_s_scale=1.000000e-04 --mean_bias_sd=1.000000e-02 --psi_j_scale=1.000000e-03 --learning_rate=5.000000e-02 --adamax_beta1=9.000000e-01 --adamax_beta2=9.990000e-01 --log_emission_samples_per_round=2000 --log_emission_sampling_rounds=100 --log_emission_sampling_median_rel_error=5.000000e-04 --max_advi_iter_first_epoch=1000 --max_advi_iter_subsequent_epochs=1000 --min_training_epochs=20 --max_training_epochs=100 --initial_temperature=2.000000e+00 --num_thermal_advi_iters=5000 --convergence_snr_averaging_window=5000 --convergence_snr_trigger_threshold=1.000000e-01 --convergence_snr_countdown_window=10 --max_calling_iters=1 --caller_update_convergence_threshold=1.000000e-03 --caller_internal_admixing_rate=7.500000e-01 --caller_external_admixing_rate=7.500000e-01 --disable_caller=false --disable_sampler=false --disable_annealing=false --interval_list=/tmp/intervals2626211694091496982.tsv --contig_ploidy_prior_table=/data/NGS/Reanalysis-Package/CNV/Bed-Files/contig_ploidy_priors.tsv --output_model_path=/data/NGS/Reanalysis-Package/CNV/ploidy-model
at org.broadinstitute.hellbender.utils.python.PythonExecutorBase.getScriptException(PythonExecutorBase.java:75)
at org.broadinstitute.hellbender.utils.runtime.ScriptExecutor.executeCuratedArgs(ScriptExecutor.java:126)
at org.broadinstitute.hellbender.utils.python.PythonScriptExecutor.executeArgs(PythonScriptExecutor.java:170)
at org.broadinstitute.hellbender.utils.python.PythonScriptExecutor.executeScript(PythonScriptExecutor.java:151)
at org.broadinstitute.hellbender.utils.python.PythonScriptExecutor.executeScript(PythonScriptExecutor.java:121)
at org.broadinstitute.hellbender.tools.copynumber.DetermineGermlineContigPloidy.executeDeterminePloidyAndDepthPythonScript(DetermineGermlineContigPloidy.java:411)
at org.broadinstitute.hellbender.tools.copynumber.DetermineGermlineContigPloidy.doWork(DetermineGermlineContigPloidy.java:288)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:139)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:163)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:206)
at org.broadinstitute.hellbender.Main.main(Main.java:292)

So it seems that the actual error is:

pandas.errors.ParserError: Error tokenizing data. C error: Expected 5 fields in line 58, saw 7
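
The traceback shows this message being raised while pandas reads the file passed as --contig_ploidy_prior_table. pandas takes the number of fields from the first line it reads and stops when a later line has more, which can happen, for example, when a row contains extra or trailing tab characters. As a minimal sketch for locating such rows (not part of GATK; the function name and the demo table below are made up for illustration), one could count the tab-separated fields on each line:

```python
import os
import tempfile


def find_irregular_rows(path, delimiter="\t"):
    """Return (expected_fields, problems) for a delimited text file.

    expected_fields is the field count of the first non-blank line (the
    header); problems lists (line_number, field_count) for every later
    non-blank line whose field count differs from the header's.
    """
    expected = None
    problems = []
    with open(path) as handle:
        for line_number, line in enumerate(handle, start=1):
            stripped = line.rstrip("\r\n")
            if not stripped.strip():
                continue  # blank or whitespace-only lines are skipped
            count = len(stripped.split(delimiter))
            if expected is None:
                expected = count
            elif count != expected:
                problems.append((line_number, count))
    return expected, problems


# Demo on a small made-up table (illustrative values, not real priors).
demo_text = (
    "CONTIG_NAME\tPLOIDY_PRIOR_0\tPLOIDY_PRIOR_1\tPLOIDY_PRIOR_2\tPLOIDY_PRIOR_3\n"
    "1\t0.01\t0.01\t0.97\t0.01\n"
    "X\t0.01\t0.49\t0.49\t0.01\t\t\n"  # two trailing tabs give 7 fields
)
with tempfile.NamedTemporaryFile("w", suffix=".tsv", delete=False) as tmp:
    tmp.write(demo_text)
demo_path = tmp.name
expected, problems = find_irregular_rows(demo_path)
os.remove(demo_path)
print(expected, problems)  # prints: 5 [(3, 7)]
```

Pointing find_irregular_rows at contig_ploidy_priors.tsv should list the offending lines. The line numbers it reports count from the top of the file, so they may differ from the one pandas reports if any leading lines were consumed before pandas started reading.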

I googled a lot but could not figure out what the problem is. I have no experience with Python; I am just following the steps here: https://gatkforums.broadinstitute.or...scussion/11684

I am looking forward to hearing from you or anyone else with experience in this.

Cheers,
Zohreh


GATK GermlineCNVCaller & PostprocessGermlineCNVCalls
