pyemu package

Subpackages

Submodules

pyemu.en module

class pyemu.en.Ensemble(pst, df, istransformed=False)

Bases: object

based class for handling ensembles of numeric values

Parameters:
  • pst (pyemu.Pst) – a control file instance

  • df (pandas.DataFrame) – a pandas dataframe. Columns should be parameter/observation names. Index is treated as realization names

  • istransformed (bool) – flag to indicate parameter values are in log space. Not used for ObservationEnsemble

Example:

pst = pyemu.Pst("my.pst")
pe = pyemu.ParameterEnsemble.from_gaussian_draw(pst)
as_pyemu_matrix(typ=None)

get a pyemu.Matrix instance of Ensemble

Parameters:

typ (pyemu.Matrix or pyemu.Cov) – the type of matrix to return. Default is pyemu.Matrix

Returns:

a matrix instance

Return type:

pyemu.Matrix

Example:

oe = pyemu.ObservationEnsemble.from_gaussian_draw(pst=pst,num_reals=100)
dev_mat = oe.get_deviations().as_pyemu_matrix(typ=pyemu.Cov)
obscov = dev_mat.T * dev_mat
back_transform()

back transform parameters with respect to partrans value.

Note

operates in place (None is returned).

Parameter transform is only related to log_{10} and does not include the effects of scale and/or offset

Ensemble.back_transform() is only provided for inheritance purposes. It only changes the `Ensemble._transformed flag

copy()

get a copy of Ensemble

Returns:

copy of this Ensemble

Return type:

Ensemble

Note

copies both Ensemble.pst and Ensemble._df: can be expensive

covariance_matrix(localizer=None, center_on=None)

get a empirical covariance matrix implied by the correlations between realizations

Parameters:
  • localizer (pyemu.Matrix, optional) – a matrix to localize covariates in the resulting covariance matrix. Default is None

  • center_on (str, optional) – a realization name to use as the centering point in ensemble space. If None, the mean vector is treated as the centering point. Default is None

Returns:

the empirical (and optionally localized) covariance matrix

Return type:

pyemu.Cov

dropna(*args, **kwargs)

override of pandas.DataFrame.dropna()

Parameters:
  • ([object] (*args) – positional arguments to pass to pandas.DataFrame.dropna().

  • ({str (**kwargs) – object}): keyword arguments to pass to pandas.DataFrame.dropna().

classmethod from_binary(pst, filename)

create an Ensemble from a PEST-style binary file

Parameters:
  • pst (pyemu.Pst) – a control file instance

  • filename (str) – filename containing binary ensemble

Returns:

the ensemble loaded from the binary file

Return type:

Ensemble

Example:

pst = pyemu.Pst("my.pst")
oe = pyemu.ObservationEnsemble.from_binary("obs.jcb")
classmethod from_csv(pst, filename, *args, **kwargs)

create an Ensemble from a CSV file

Parameters:
  • pst (pyemu.Pst) – a control file instance

  • filename (str) – filename containing CSV ensemble

  • ([object] (*args) – positional arguments to pass to pandas.read_csv().

  • ({str (**kwargs) – object}): keyword arguments to pass to pandas.read_csv().

Returns:

Ensemble

Note

uses pandas.read_csv() to load numeric values from CSV file

Example:

pst = pyemu.Pst("my.pst")
oe = pyemu.ObservationEnsemble.from_csv("obs.csv")
classmethod from_dataframe(pst, df, istransformed=False)
get_deviations(center_on=None)

get the deviations of the realizations around a certain point in ensemble space

Parameters:

center_on (str, optional) – a realization name to use as the centering point in ensemble space. If None, the mean vector is treated as the centering point. Default is None

Returns:

an ensemble of deviations around the centering point

Return type:

Ensemble

Note

deviations are the Euclidean distances from the center_on value to realized values for each column

center_on=None yields the classic ensemble smoother/ensemble Kalman filter deviations from the mean vector

Deviations respect log-transformation status.

Example:

pst = pyemu.Pst("my.pst")
oe = pyemu.ObservationEnsemble.from_gaussian_draw(pst)
oe.add_base()
oe_dev = oe.get_deviations(center_on="base")
oe.to_csv("obs_base_devs.csv")
property istransformed

the parameter transformation status

Returns:

flag to indicate whether or not the ParameterEnsemble is transformed with respect to log_{10}. Not used for (and has no effect on) ObservationEnsemble.

Return type:

bool

Note

parameter transformation status is only related to log_{10} and does not include the effects of scale and/or offset

plot(bins=10, facecolor='0.5', plot_cols=None, filename='ensemble.pdf', func_dict=None, **kwargs)

plot ensemble histograms to multipage pdf

Parameters:
  • bins (int) – number of bins for the histograms

  • facecolor (str) – matplotlib color (e.g. r,`g`, etc)

  • plot_cols ([str]) – list of subset of ensemble columns to plot. If None, all are plotted. Default is None

  • filename (str) – multipage pdf filename. Default is “ensemble.pdf”

  • func_dict (dict) – a dict of functions to apply to specific columns. For example: {“par1”: np.log10}

  • **kwargs (dict) – addkeyword args to pass to pyemu.plot_utils.ensemble_helper()

Example:

pst = pyemu.Pst("my.pst")
pe = pyemu.ParameterEnsemble.from_gaussian_draw(pst)
pe.transform() # plot log space (if needed)
pe.plot(bins=30)
pst

control file instance

Type:

pyemu.Pst

static reseed()

reset the numpy.random.seed

Note

reseeds using the pyemu.en.SEED global variable

The pyemu.en.SEED value is set as the numpy.random.seed on import, so make sure you know what you are doing if you call this method…

to_binary(filename)

write Ensemble to a PEST-style binary file

Parameters:

filename (str) – file to write

Example:

pst = pyemu.Pst("my.pst")
oe = pyemu.ObservationEnsemble.from_gaussian_draw(pst)
oe.to_binary("obs.jcb")

Note

back transforms ParameterEnsemble before writing so that values are in arithmetic space

to_csv(filename, *args, **kwargs)

write Ensemble to a CSV file

Parameters:
  • filename (str) – file to write

  • ([object] (*args) – positional arguments to pass to pandas.DataFrame.to_csv().

  • ({str (**kwargs) – object}): keyword arguments to pass to pandas.DataFrame.to_csv().

Example:

pst = pyemu.Pst("my.pst")
oe = pyemu.ObservationEnsemble.from_gaussian_draw(pst)
oe.to_csv("obs.csv")

Note

back transforms ParameterEnsemble before writing so that values are in arithmetic space

to_dense(filename)

write Ensemble to a dense-format binary file

Parameters:

filename (str) – file to write

Example:

pst = pyemu.Pst("my.pst")
oe = pyemu.ObservationEnsemble.from_gaussian_draw(pst)
oe.to_dense("obs.bin")

Note

back transforms ParameterEnsemble before writing so that values are in arithmatic space

transform()

transform parameters with respect to partrans value.

Note

operates in place (None is returned).

Parameter transform is only related to log_{10} and does not include the effects of scale and/or offset

Ensemble.transform() is only provided for inheritance purposes. It only changes the `Ensemble._transformed flag

class pyemu.en.Iloc(ensemble)

Bases: object

thin wrapper around pandas.DataFrame.iloc to make sure returned type is Ensemble (instead of pandas.DataFrame)

Parameters:

ensemble (pyemu.Ensemble) – an ensemble instance

Note

Users do not need to mess with this class - it is added to each Ensemble instance

class pyemu.en.Loc(ensemble)

Bases: object

thin wrapper around pandas.DataFrame.loc to make sure returned type is Ensemble (instead of pandas.DataFrame)

Parameters:

ensemble (pyemu.Ensemble) – an ensemble instance

Note

Users do not need to mess with this class - it is added to each Ensemble instance

class pyemu.en.ObservationEnsemble(pst, df, istransformed=False)

Bases: Ensemble

Observation noise ensemble in the PEST(++) realm

Parameters:
  • pst (pyemu.Pst) – a control file instance

  • df (pandas.DataFrame) – a pandas dataframe. Columns should be observation names. Index is treated as realization names

  • istransformed (bool) – flag to indicate parameter values are in log space. Not used for ObservationEnsemble

Example:

pst = pyemu.Pst("my.pst")
oe = pyemu.ObservationEnsemble.from_gaussian_draw(pst)
add_base()

add the control file obsval values as a realization

Note

replaces the last realization with the current ObservationEnsemble.pst.observation_data.obsval values as a new realization named “base”

the PEST++ enemble tools will add this realization also if you dont wanna fool with it here…

classmethod from_gaussian_draw(pst, cov=None, num_reals=100, by_groups=True, fill=False, factor='eigen')

generate an ObservationEnsemble from a (multivariate) gaussian distribution

Parameters:
  • pst (pyemu.Pst) – a control file instance.

  • cov (pyemu.Cov) – a covariance matrix describing the second moment of the gaussian distribution. If None, cov is generated from the non-zero-weighted observation weights in pst. Only observations listed in cov are sampled. Other observations are assigned the obsval value from pst.

  • num_reals (int) – number of stochastic realizations to generate. Default is 100

  • by_groups (bool) – flag to generate realzations be observation group. This assumes no correlation (covariates) between observation groups.

  • fill (bool) – flag to fill in zero-weighted observations with control file values. Default is False.

  • factor (str) – how to factorize cov to form the projectin matrix. Can be “eigen” or “svd”. The “eigen” option is default and is faster. But for (nearly) singular cov matrices (such as those generated empirically from ensembles), “svd” is the only way. Ignored for diagonal cov.

Returns:

the realized ObservationEnsemble instance

Return type:

ObservationEnsemble

Note

Only observations named in cov are sampled. Additional, cov is processed prior to sampling to only include non-zero-weighted observations depending on the value of fill. So users must take care to make sure observations have been assigned non-zero weights even if cov is being passed

The default cov is generated from pyemu.Cov.from_observation_data, which assumes observation noise standard deviations are the inverse of the weights listed in pst

Example:

pst = pyemu.Pst("my.pst")
# the easiest way - just relying on weights in pst
oe1 = pyemu.ObservationEnsemble.from_gaussian_draw(pst)

# generate the cov explicitly
cov = pyemu.Cov.from_observation_data(pst)
oe2 = pyemu.ObservationEnsemble.from_gaussian_draw(pst,cov=cov)

# give all but one observation zero weight.  This will
# result in an oe with only one randomly sampled observation noise
# vector since the cov is processed to remove any zero-weighted
# observations before sampling
pst.observation_data.loc[pst.nnz_obs_names[1:],"weight] = 0.0
oe3 = pyemu.ObservationEnsemble.from_gaussian_draw(pst,cov=cov)
get_phi_vector(noise_obs_filename=None, noise_obs_flag=False)
property nonzero

get a new ObservationEnsemble of just non-zero weighted observations

Returns:

non-zero weighted observation ensemble.

Return type:

ObservationEnsemble

Note

The pst attribute of the returned ObservationEnsemble also only includes non-zero weighted observations (and is therefore not valid for running with PEST or PEST++)

property phi_vector

vector of L2 norm (phi) for the realizations (rows) of Ensemble.

Returns:

series of realization name (Ensemble.index) and phi values

Return type:

pandas.Series

Note

The ObservationEnsemble.pst.weights can be updated prior to calling this method to evaluate new weighting strategies

class pyemu.en.ParameterEnsemble(pst, df, istransformed=False)

Bases: Ensemble

Parameter ensembles in the PEST(++) realm

Parameters:
  • pst (pyemu.Pst) – a control file instance

  • df (pandas.DataFrame) – a pandas dataframe. Columns should be parameter names. Index is treated as realization names

  • istransformed (bool) – flag to indicate parameter values are in log space (if partrans is “log” in pst)

Example:

pst = pyemu.Pst("my.pst")
pe = pyemu.ParameterEnsemble.from_gaussian_draw(pst)
add_base()

add the control file obsval values as a realization

Note

replaces the last realization with the current ParameterEnsemble.pst.parameter_data.parval1 values as a new realization named “base”

The PEST++ ensemble tools will add this realization also if you dont wanna fool with it here…

property adj_names

the names of adjustable parameters in ParameterEnsemble

Returns:

adjustable parameter names

Return type:

[str]

back_transform()

back transform parameters with respect to partrans value.

Note

operates in place (None is returned).

Parameter transform is only related to log_{10} and does not include the effects of scale and/or offset

enforce(how='reset', bound_tol=0.0)

entry point for bounds enforcement.

Parameters:

enforce_bounds (str) – can be ‘reset’ to reset offending values or ‘drop’ to drop offending realizations. Default is “reset”

Note

In very high dimensions, the “drop” and “scale” how types will result in either very few realizations or very short realizations.

Example:

pst = pyemu.Pst("my.pst")
pe = pyemu.ParameterEnsemble.from_gaussian_draw()
pe.enforce(how="scale")
pe.to_csv("par.csv")
property fixed_indexer

boolean indexer for non-adjustable parameters

Returns:

boolean array indicating which parameters have partrans equal to “log” or “fixed”

Return type:

numpy.ndarray(bool)

classmethod from_gaussian_draw(pst, cov=None, num_reals=100, by_groups=True, fill=True, factor='eigen')

generate a ParameterEnsemble from a (multivariate) (log) gaussian distribution

Parameters:
  • pst (pyemu.Pst) – a control file instance.

  • cov (pyemu.Cov) – a covariance matrix describing the second moment of the gaussian distribution. If None, cov is generated from the bounds of the adjustable parameters in pst. the (log) width of the bounds is assumed to represent a multiple of the parameter standard deviation (this is the sigma_range argument that can be passed to pyemu.Cov.from_parameter_data).

  • num_reals (int) – number of stochastic realizations to generate. Default is 100

  • by_groups (bool) – flag to generate realizations be parameter group. This assumes no correlation (covariates) between parameter groups. For large numbers of parameters, this help prevent memories but is slower.

  • fill (bool) – flag to fill in fixed and/or tied parameters with control file values. Default is True.

  • factor (str) – how to factorize cov to form the projection matrix. Can be “eigen” or “svd”. The “eigen” option is default and is faster. But for (nearly) singular cov matrices (such as those generated empirically from ensembles), “svd” is the only way. Ignored for diagonal cov.

Returns:

the parameter ensemble realized from the gaussian distribution

Return type:

ParameterEnsemble

Note

Only parameters named in cov are sampled. Missing parameters are assigned values of pst.parameter_data.parval1 along the corresponding columns of ParameterEnsemble according to the value of fill.

The default cov is generated from pyemu.Cov.from_observation_data, which assumes parameter bounds in ParameterEnsemble.pst represent some multiple of parameter standard deviations. Additionally, the default Cov only includes adjustable parameters (partrans not “tied” or “fixed”).

“tied” parameters are not sampled.

Example:

pst = pyemu.Pst("my.pst")
# the easiest way - just relying on weights in pst
pe1 = pyemu.ParameterEnsemble.from_gaussian_draw(pst)

# generate the cov explicitly with a sigma_range
cov = pyemu.Cov.from_parameter_data(pst,sigma_range=6)
[e2 = pyemu.ParameterEnsemble.from_gaussian_draw(pst,cov=cov)
classmethod from_mixed_draws(pst, how_dict, default='gaussian', num_reals=100, cov=None, sigma_range=6, enforce_bounds=True, partial=False, fill=True)

generate a ParameterEnsemble using a mixture of distributions. Available distributions include (log) “uniform”, (log) “triangular”, and (log) “gaussian”. log transformation is respected.

Parameters:
  • pst (pyemu.Pst) – a control file

  • how_dict (dict) – a dictionary of parameter name keys and “how” values, where “how” can be “uniform”,”triangular”, or “gaussian”.

  • default (str) – the default distribution to use for parameter not listed in how_dict. Default is “gaussian”.

  • num_reals (int) – number of realizations to draw. Default is 100.

  • cov (pyemu.Cov) – an optional Cov instance to use for drawing from gaussian distribution. If None, and “gaussian” is listed in how_dict (and/or default), then a diagonal covariance matrix is constructed from the parameter bounds in pst (with sigma_range). Default is None.

  • sigma_range (float) – the number of standard deviations implied by the parameter bounds in the pst. Only used if “gaussian” is in how_dict (and/or default) and cov is None. Default is 6.

  • enforce_bounds (bool) – flag to enforce parameter bounds in resulting ParameterEnsemble. Only matters if “gaussian” is in values of how_dict. Default is True.

  • partial (bool) – flag to allow a partial ensemble (not all pars included). If True, parameters not name in how_dict will be sampled using the distribution named as default. Default is False.

  • fill (bool) – flag to fill in fixed and/or tied parameters with control file values. Default is True.

Example:

pst = pyemu.Pst("pest.pst")
# uniform for the fist 10 pars
how_dict = {p:"uniform" for p in pst.adj_par_names[:10]}
pe = pyemu.ParameterEnsemble(pst,how_dict=how_dict)
pe.to_csv("my_mixed_pe.csv")
classmethod from_parfiles(pst, parfile_names, real_names=None)

create a parameter ensemble from PEST-style parameter value files. Accepts parfiles with less than the parameters in the control (get NaNs in the ensemble) or extra parameters in the parfiles (get dropped)

Parameters:
  • pst (pyemu.Pst) – control file instance

  • parfile_names ([str]) – par file names

  • real_names (str) – optional list of realization names. If None, a single integer counter is used

Returns:

parameter ensemble loaded from par files

Return type:

ParameterEnsemble

classmethod from_triangular_draw(pst, num_reals=100, fill=True)

generate a ParameterEnsemble from a (multivariate) (log) triangular distribution

Parameters:
  • pst (pyemu.Pst) – a control file instance

  • num_reals (int, optional) – number of realizations to generate. Default is 100

  • fill (bool) – flag to fill in fixed and/or tied parameters with control file values. Default is True.

Returns:

a parameter ensemble drawn from the multivariate (log) triangular distribution defined by the parameter upper and lower bounds and initial parameter values in pst

Return type:

ParameterEnsemble

Note

respects transformation status in pst: fixed and tied parameters are not realized, log-transformed parameters are drawn in log space. The returned ParameterEnsemble is back transformed (not in log space)

uses numpy.random.triangular

Example:

pst = pyemu.Pst("my.pst")
pe = pyemu.ParameterEnsemble.from_triangular_draw(pst)
pe.to_csv("my_tri_pe.csv")
classmethod from_uniform_draw(pst, num_reals, fill=True)

generate a ParameterEnsemble from a (multivariate) (log) uniform distribution

Parameters:
  • pst (pyemu.Pst) – a control file instance

  • num_reals (int, optional) – number of realizations to generate. Default is 100

  • fill (bool) – flag to fill in fixed and/or tied parameters with control file values. Default is True.

Returns:

a parameter ensemble drawn from the multivariate (log) uniform distribution defined by the parameter upper and lower bounds pst

Return type:

ParameterEnsemble

Note

respects transformation status in pst: fixed and tied parameters are not realized, log-transformed parameters are drawn in log space. The returned ParameterEnsemble is back transformed (not in log space)

uses numpy.random.uniform

Example:

pst = pyemu.Pst("my.pst")
pe = pyemu.ParameterEnsemble.from_uniform_draw(pst)
pe.to_csv("my_uni_pe.csv")
property lbnd

the lower bound vector while respecting current log transform status

Returns:

(log-transformed) lower parameter bounds listed in ParameterEnsemble.pst.parameter_data.parlbnd

Return type:

pandas.Series

property log_indexer

boolean indexer for log transform

Returns:

boolean array indicating which parameters are log transformed

Return type:

numpy.ndarray(bool)

project(projection_matrix, center_on=None, log=None, enforce_bounds='reset')

project the ensemble using the null-space Monte Carlo method

Parameters:
  • projection_matrix (pyemu.Matrix) – null-space projection operator.

  • center_on (str) – the name of the realization to use as the centering point for the null-space differening operation. If center_on is None, the ParameterEnsemble mean vector is used. Default is None

  • log (pyemu.Logger, optional) – for logging progress

  • enforce_bounds (str) – parameter bound enforcement option to pass to ParameterEnsemble.enforce(). Valid options are reset, drop, scale or None. Default is reset.

Returns:

untransformed, null-space projected ensemble.

Return type:

ParameterEnsemble

Example:

ev = pyemu.ErrVar(jco="my.jco") #assumes my.pst exists
pe = pyemu.ParameterEnsemble.from_gaussian_draw(ev.pst)
pe_proj = pe.project(ev.get_null_proj(maxsing=25))
pe_proj.to_csv("proj_par.csv")
transform()

transform parameters with respect to partrans value.

Note

operates in place (None is returned).

Parameter transform is only related to log_{10} and does not include the effects of scale and/or offset

property ubnd

the upper bound vector while respecting current log transform status

Returns:

(log-transformed) upper parameter bounds listed in ParameterEnsemble.pst.parameter_data.parubnd

Return type:

pandas.Series

pyemu.ev module

class pyemu.ev.ErrVar(jco, **kwargs)

Bases: LinearAnalysis

FOSM-based error variance analysis

Parameters:
  • jco (varies, optional) – something that can be cast or loaded into a pyemu.Jco. Can be a str for a filename or pyemu.Matrix/pyemu.Jco object.

  • pst (varies, optional) – something that can be cast into a pyemu.Pst. Can be an str for a filename or an existing pyemu.Pst. If None, a pst filename is sought with the same base name as the jco argument (if passed)

  • parcov (varies, optional) – prior parameter covariance matrix. If str, a filename is assumed and the prior parameter covariance matrix is loaded from a file using the file extension (“.jcb”/”.jco” for binary, “.cov”/”.mat” for PEST-style ASCII matrix, or “.unc” for uncertainty files). If None, the prior parameter covariance matrix is constructed from the parameter bounds in LinearAnalysis.pst. Can also be a pyemu.Cov instance

  • obscov (varies, optional) – observation noise covariance matrix. If str, a filename is assumed and the noise covariance matrix is loaded from a file using the file extension (“.jcb”/”.jco” for binary, “.cov”/”.mat” for PEST-style ASCII matrix, or “.unc” for uncertainty files). If None, the noise covariance matrix is constructed from the obsevation weights in LinearAnalysis.pst. Can also be a pyemu.Cov instance

  • forecasts (varies, optional) – forecast sensitivity vectors. If str, first an observation name is assumed (a row in LinearAnalysis.jco). If that is not found, a filename is assumed and predictions are loaded from a file using the file extension. If [str], a list of observation names is assumed. Can also be a pyemu.Matrix instance, a numpy.ndarray or a collection. Note if the PEST++ option “++forecasts()” is set in the pest control file (under the pyemu.Pst.pestpp_options dictionary), then there is no need to pass this argument (unless you want to analyze different forecasts) of pyemu.Matrix or numpy.ndarray.

  • ref_var (float, optional) – reference variance. Default is 1.0

  • verbose (bool) – controls screen output. If str, a filename is assumed and and log file is written.

  • sigma_range (float, optional) – defines range of upper bound - lower bound in terms of standard deviation (sigma). For example, if sigma_range = 4, the bounds represent 4 * sigma. Default is 4.0, representing approximately 95% confidence of implied normal distribution. This arg is only used if constructing parcov from parameter bounds.

  • scale_offset (bool, optional) – flag to apply parameter scale and offset to parameter bounds when calculating prior parameter covariance matrix from bounds. This arg is onlyused if constructing parcov from parameter bounds.Default is True.

  • omitted_parameters ([str]) – list of parameters to treat as “omitted”. Passing this argument activates 3-term error variance analysis.

  • omitted_parcov (varies) – an argument that can be cast to a parcov for the omitted parameters. If None, omitted_parcov will be formed by extracting a sub-matrix from the LinearAnalsis.parcov attribute.

  • omitted_predictions (varies) – an argument that can be cast to a “predictions” (e.g. “forecasts”) attribute to form prediction sensitivity vectors with respec to the omitted parameters. If None, these vectors will be extracted from the pyemu.LinearAnalysis.predictions attribute

  • kl (bool, optional) – flag to perform Karhunen-Loeve scaling on the jacobian before error variance calculations. If True, the pyemu.ErrVar.jco and pyemu.ErrVar.parcov are altered in place. Default is False.

Example:

ev = pyemu.ErrVar(jco="my.jco",omitted_parameters=["wel1","wel2"])
df = ev.get_errvar_dataframe()
G(singular_value)

get the parameter solution Matrix at a given singular value

Parameters:

singular_value (int) – singular value to calc G at

Returns:

parameter solution matrix (V_1 * S_1^(_1) * U_1^T) at singular_value

Return type:

pyemu.Matrix

I_minus_R(singular_value)

get I - R at a given singular value

Parameters:

singular_value (int) – singular value to calculate I - R at

Returns:

identity matrix minus resolution matrix at singular_value

Return type:

pyemu.Matrix

R(singular_value)

get resolution Matrix (V_1 * V_1^T) at a given singular value

Args: singular_value (int): singular value to calculate R at

Returns:

resolution matrix at singular_value

Return type:

pyemu.Matrix

first_forecast(singular_value)
get the null space term (first term) contribution to forecast (e.g. prediction)

error variance at a given singular value.

Parameters:

singular_value (int) – singular value to calc first term at

Note

This method is used to construct the error variance dataframe

Just a wrapper around ErrVar.first_forecast

Returns:

dictionary of (“first”,prediction_names),error variance pairs at singular_value

Return type:

dict

first_parameter(singular_value)
get the null space term (first term) contribution to parameter error variance

at a given singular value

Parameters:

singular_value (int) – singular value to calc first term at

Returns:

first term contribution to parameter error variance

Return type:

pyemu.Cov

first_prediction(singular_value)
get the null space term (first term) contribution to prediction error variance

at a given singular value.

Parameters:

singular_value (int) – singular value to calc first term at

Note

This method is used to construct the error variance dataframe

Returns:

dictionary of (“first”,prediction_names),error variance pairs at singular_value

Return type:

dict

get_errvar_dataframe(singular_values=None)

primary entry point for error variance analysis.

Parameters:

singular_values ([int], optional) – a list singular values to test. If None, defaults to range(0,min(nnz_obs,nadj_par) + 1).

Returns:

a multi-indexed pandas dataframe summarizing each of the error variance terms for each nominated forecast. Rows are the singluar values tested, columns are a multi-index of forecast name and error variance term number (e.g. 1,2 or (optionally) 3).

Return type:

pandas.DataFrame

Example:

ev = pyemu.ErrVar(jco="my.jco",omitted_parameters=["wel1","wel2"])
df = ev.get_errvar_dataframe()
get_identifiability_dataframe(singular_value=None, precondition=False)

primary entry point for identifiability analysis

Parameters:
  • singular_value (int) – the singular spectrum truncation point. Defaults to minimum of non-zero-weighted observations and adjustable parameters

  • precondition (bool) – flag to use the preconditioned hessian with the prior parameter covariance matrix (xtqt + sigma_theta^-1). This should be used KL scaling. Default is False.

Returns:

A pandas dataframe of the right solution-space singular vectors and identifiability (identifiabiity is in the column labeled “ident”)

Return type:

pandas.DataFrame

Examples:

ev = pyemu.ErrVar(jco="my.jco")
df = ev.get_identifiability_dataframe(singular_value=20)
df.ident.plot(kind="bar")
get_null_proj(maxsing=None, eigthresh=1e-06)

get a null-space projection matrix of XTQX

Parameters:
  • maxsing (int, optional) – number of singular components to use (the truncation point). If None, pyemu.Matrx.get_maxsing() is used to determine the truncation point with `eigthresh. Default is None

  • eigthresh (float, optional) – the ratio of smallest to largest singular value to keep in the range (solution) space of XtQX. Not used if maxsing is not None. Default is 1.0e-6

Note

used for null-space monte carlo operations.

Returns:

pyemu.Matrix the null-space projection matrix (V2V2^T)

property omitted_jco

the omitted-parameters jacobian matrix

Returns:

the jacobian matrix instance of non-zero-weighted observations and omitted parameters

Return type:

pyemu.Jco

property omitted_parcov

the prior omitted-parameter covariance matrix

Returns:

the prior parameter covariance matrix of the omitted parameters

Return type:

pyemu.Cov

property omitted_predictions

omitted prediction sensitivity vectors

Returns:

a matrix of prediction sensitivity vectors (column wise) to omitted parameters

Return type:

pyemu.Matrix

second_forecast(singular_value)

get the solution space contribution to forecast (e.g. “prediction”) error variance at a given singular value

Parameters:

singular_value (int) – singular value to calc second term at

Note

This method is used to construct error variance dataframe

Just a thin wrapper around ErrVar.second_prediction

Returns:

dictionary of (“second”,prediction_names), error variance arising from the solution space contribution (y^t * G * obscov * G^T * y)

Return type:

dict

second_parameter(singular_value)
get the solution space contribution to parameter error variance

at a given singular value (G * obscov * G^T)

Parameters:

singular_value (int) – singular value to calc second term at

Returns:

the second term contribution to parameter error variance (G * obscov * G^T)

Return type:

pyemu.Cov

second_prediction(singular_value)

get the solution space contribution to predictive error variance at a given singular value

Parameters:

singular_value (int) – singular value to calc second term at

Note

This method is used to construct error variance dataframe

Returns: dict: dictionary of (“second”,prediction_names), error variance

arising from the solution space contribution (y^t * G * obscov * G^T * y)

third_forecast(singular_value)
get the omitted parameter contribution to forecast (prediction) error variance

at a given singular value.

Parameters:

singular_value (int) – singular value to calc third term at

Note

used to construct error variance dataframe just a thin wrapper around ErrVar.third_prediction()

Returns:

a dictionary of (“third”,prediction_names),error variance

Return type:

dict

third_parameter(singular_value)
get the omitted parameter contribution to parameter error variance

at a given singular value

Parameters:

singular_value (int) – singular value to calc third term at

Returns:

the third term contribution to parameter error variance calculated at singular_value (G * omitted_jco * Sigma_(omitted_pars) * omitted_jco^T * G^T). Returns 0.0 if third term calculations are not being used.

Return type:

pyemu.Cov

third_prediction(singular_value)
get the omitted parameter contribution to prediction error variance

at a given singular value.

Parameters:

singular_value (int) – singular value to calc third term at

Note

used to construct error variance dataframe

Returns:

a dictionary of (“third”,prediction_names),error variance

Return type:

dict

variance_at(singular_value)
get the error variance of all three error variance terms at a

given singluar value

Parameters:

singular_value (int) – singular value to test

Returns:

dictionary of (err var term,prediction_name), variance pairs

Return type:

dict

pyemu.la module

LinearAnalysis object, which is the base class for other pyemu analysis objects (Schur, ErrVar, MonteCarlo, and EnsembleSmoother). This class is usually not used directly.

class pyemu.la.LinearAnalysis(jco=None, pst=None, parcov=None, obscov=None, predictions=None, ref_var=1.0, verbose=False, resfile=False, forecasts=None, sigma_range=4.0, scale_offset=True, **kwargs)

Bases: object

The base/parent class for linear analysis.

Parameters:
  • jco (varies, optional) – something that can be cast or loaded into a pyemu.Jco. Can be a str for a filename or pyemu.Matrix/pyemu.Jco object.

  • pst (varies, optional) – something that can be cast into a pyemu.Pst. Can be an str for a filename or an existing pyemu.Pst. If None, a pst filename is sought with the same base name as the jco argument (if passed)

  • parcov (varies, optional) – prior parameter covariance matrix. If str, a filename is assumed and the prior parameter covariance matrix is loaded from a file using the file extension (“.jcb”/”.jco” for binary, “.cov”/”.mat” for PEST-style ASCII matrix, or “.unc” for uncertainty files). If None, the prior parameter covariance matrix is constructed from the parameter bounds in LinearAnalysis.pst. Can also be a pyemu.Cov instance

  • obscov (varies, optional) – observation noise covariance matrix. If str, a filename is assumed and the noise covariance matrix is loaded from a file using the file extension (“.jcb”/”.jco” for binary, “.cov”/”.mat” for PEST-style ASCII matrix, or “.unc” for uncertainty files). If None, the noise covariance matrix is constructed from the obsevation weights in LinearAnalysis.pst. Can also be a pyemu.Cov instance

  • forecasts (varies, optional) – forecast sensitivity vectors. If str, first an observation name is assumed (a row in LinearAnalysis.jco). If that is not found, a filename is assumed and predictions are loaded from a file using the file extension. If [str], a list of observation names is assumed. Can also be a pyemu.Matrix instance, a numpy.ndarray or a collection of pyemu.Matrix or numpy.ndarray.

  • ref_var (float, optional) – reference variance. Default is 1.0

  • verbose (bool) – controls screen output. If str, a filename is assumed and and log file is written.

  • sigma_range (float, optional) – defines range of upper bound - lower bound in terms of standard deviation (sigma). For example, if sigma_range = 4, the bounds represent 4 * sigma. Default is 4.0, representing approximately 95% confidence of implied normal distribution. This arg is only used if constructing parcov from parameter bounds.

  • scale_offset (bool, optional) – flag to apply parameter scale and offset to parameter bounds when calculating prior parameter covariance matrix from bounds. This arg is onlyused if constructing parcov from parameter bounds.Default is True.

Note

Can be used directly, but for prior uncertainty analyses only.

The derived types (pyemu.Schur, pyemu.ErrVar) are for different forms of FOMS-based posterior uncertainty analyses.

This class tries hard to not load items until they are needed; all arguments are optional.

The class makes heavy use of property decorator to encapsulated private attributes

Example:

#assumes "my.pst" exists
la = pyemu.LinearAnalysis(jco="my.jco",forecasts=["fore1","fore2"])
print(la.prior_forecast)
property adj_par_names

adjustable parameter names

Returns:

list of adjustable parameter names

Return type:

[‘str`]

Note

if LinearAnalysis.pst is None, returns LinearAnalysis.jco.col_names

adjust_obscov_resfile(resfile=None)

reset the elements of obscov by scaling the implied weights based on the phi components in res_file so that the total phi is equal to the number of non-zero weights.

Parameters:

resfile (str) – residual file to use. If None, residual file with case name is sought. default is None

Note

calls pyemu.Pst.adjust_weights_resfile()

apply_karhunen_loeve_scaling()

apply karhuene-loeve scaling to the jacobian matrix.

Note

This scaling is not necessary for analyses using Schur’s complement, but can be very important for error variance analyses. This operation effectively transfers prior knowledge specified in the parcov to the jacobian and reset parcov to the identity matrix.

clean()

drop regularization and prior information observation from the jco

drop_prior_information()

drop the prior information from the jco and pst attributes

property fehalf

Karhunen-Loeve scaling matrix attribute.

Returns:

the Karhunen-Loeve scaling matrix based on the prior parameter covariance matrix

Return type:

pyemu.Matrix

property forecast_names

get the forecast (aka prediction) names

Returns:

list of forecast names

Return type:

([str])

property forecasts

the forecast sentivity vectors attribute

Returns:

a matrix of forecast (prediction) sensitivity vectors (column wise)

Return type:

pyemu.Matrix

property forecasts_iter

forecast (e.g. prediction) sensitivity vectors iterator

Returns:

iterator on forecasts (e.g. predictions) sensitivity vectors (matrix)

Return type:

iterator

Note

This is used for processing huge numbers of predictions

This is a synonym for LinearAnalysis.predictions_iter()

get(par_names=None, obs_names=None, astype=None)

method to get a new LinearAnalysis class using a subset of parameters and/or observations

Parameters:
  • par_names ([‘str]) – par names for new object

  • obs_names ([‘str]) – obs names for new object

  • astype (pyemu.Schur or pyemu.ErrVar) – type to cast the new object. If None, return type is same as self

Returns:

new instance

Return type:

LinearAnalysis

get_cso_dataframe()

get a dataframe of composite observation sensitivity, as returned by PEST in the seo file.

Returns:

dataframe of observation names and composite observation sensitivity

Return type:

pandas.DataFrame

Note

That this formulation deviates slightly from the PEST documentation in that the values are divided by (npar-1) rather than by (npar).

The equation is cso_j = ((Q^1/2*J*J^T*Q^1/2)^1/2)_jj/(NPAR-1)

get_obs_competition_dataframe()

get the observation competition stat a la PEST utility

Returns:

a dataframe of observation names by observation names with values equal to the PEST competition statistic

Return type:

pandas.DataFrame

get_par_css_dataframe()

get a dataframe of composite scaled sensitivities. Includes both PEST-style and Hill-style.

Returns:

a dataframe of parameter names, PEST-style and Hill-style composite scaled sensitivity

Return type:

pandas.DataFrame

property jco

the jacobian matrix attribute

Returns:

the jacobian matrix attribute

Return type:

pyemu.Jco

property mle_covariance

maximum likelihood parameter covariance matrix.

Returns:

maximum likelihood parameter covariance matrix

Return type:

pyemu.Matrix

property mle_parameter_estimate

maximum likelihood parameter estimate.

Returns:

the maximum likelihood parameter estimates

Return type:

pandas.Series

property nnz_obs_names

non-zero-weighted observation names

Returns:

list of non-zero-weighted observation names

Return type:

[‘str`]

Note

if LinearAnalysis.pst is None, returns LinearAnalysis.jco.row_names

property obscov

get the observation noise covariance matrix attribute

Returns:

a reference to the LinearAnalysis.obscov attribute

Return type:

pyemu.Cov

property parcov

get the prior parameter covariance matrix attribute

Returns:

a reference to the LinearAnalysis.parcov attribute

Return type:

pyemu.Cov

property predictions

the prediction (aka forecast) sentivity vectors attribute

Returns:

a matrix of prediction sensitivity vectors (column wise)

Return type:

pyemu.Matrix

property predictions_iter

prediction sensitivity vectors iterator

Returns:

iterator on prediction sensitivity vectors (matrix)

Return type:

iterator

Note

this is used for processing huge numbers of predictions

property prior_forecast

prior forecast (e.g. prediction) variances

Returns:

a dictionary of forecast name, prior variance pairs

Return type:

dict

property prior_parameter

prior parameter covariance matrix

Returns:

prior parameter covariance matrix

Return type:

pyemu.Cov

property prior_prediction

prior prediction (e.g. forecast) variances

Returns:

a dictionary of prediction name, prior variance pairs

Return type:

dict

property pst

the pst attribute

Returns:

the pst attribute

Return type:

pyemu.Pst

property qhalf

square root of the cofactor matrix attribute. Create the attribute if it has not yet been created

Returns:

square root of the cofactor matrix

Return type:

pyemu.Matrix

property qhalfx

half normal matrix attribute.

Returns:

half normal matrix attribute

Return type:

pyemu.Matrix

reset_obscov(arg=None)

reset the obscov attribute to None

Parameters:

arg (str or pyemu.Matrix) – the value to assign to the obscov attribute. If None, the private __obscov attribute is cleared but not reset

reset_parcov(arg=None)

reset the parcov attribute to None

Parameters:

arg (str or pyemu.Matrix) – the value to assign to the parcov attribute. If None, the private __parcov attribute is cleared but not reset

reset_pst(arg)

reset the LinearAnalysis.pst attribute

Parameters:

arg (str or pyemu.Pst) – the value to assign to the pst attribute

property xtqx

normal matrix attribute.

Returns:

normal matrix attribute

Return type:

pyemu.Matrix

pyemu.logger module

module for logging pyemu progress

class pyemu.logger.Logger(filename, echo=False)

Bases: object

a basic class for logging events during the linear analysis calculations

if filename is passed, then a file handle is opened.

Parameters:
  • filename (str) – Filename to write logged events to. If False, no file will be created, and logged events will be displayed on standard out.

  • echo (bool) – Flag to cause logged events to be echoed to the screen.

log(phrase)

log something that happened.

Arg:

phrase (str): statement to log

Notes

The first time phrase is passed the start time is saved.

The second time the phrase is logged, the elapsed time is written

lraise(message)

log an exception, close the log file, then raise the exception

Arg:

phrase (str): exception statement to log and raise

statement(phrase)

log a one-time statement

Arg:

phrase (str): statement to log

warn(message)

write a warning to the log file.

Arg:

phrase (str): warning statement to log

pyemu.mc module

pyEMU Monte Carlo module. Supports easy Monte Carlo and GLUE analyses. The MonteCarlo class inherits from pyemu.LinearAnalysis

class pyemu.mc.MonteCarlo(**kwargs)

Bases: LinearAnalysis

LinearAnalysis derived type for monte carlo analysis

Parameters:

**kwargs (dict) – dictionary of keyword arguments. See pyemu.LinearAnalysis for complete definitions

parensemble

pyemu object derived from a pandas dataframe, the ensemble of parameters from the PEST control file with associated starting value and bounds. Object also exposes methods relevant to the dataframe and parameters– see documentation.

Type:

pyemu.ParameterEnsemble

obsensemble

pyemu object derived from a pandas dataframe, the ensemble of observations from the PEST control file with associated starting weights. Object also exposes methods relevant to the dataframe and observations– see documentation.

Type:

pyemu.ObservationEnsemble

Returns:

pyEMU MonteCarlo object

Return type:

MonteCarlo

Example

>>>import pyemu

>>>mc = pyemu.MonteCarlo(pst="pest.pst")

draw(num_reals=1, par_file=None, obs=False, enforce_bounds=None, cov=None, how='gaussian')
draw stochastic realizations of parameters and

optionally observations, filling MonteCarlo.parensemble and optionally MonteCarlo.obsensemble.

Parameters:
  • num_reals (int) – number of realization to generate

  • par_file (str) – parameter file to use as mean values. If None, use MonteCarlo.pst.parameter_data.parval1. Default is None

  • obs (bool) – add a realization of measurement noise to observation values, forming MonteCarlo.obsensemble.Default is False

  • enforce_bounds (str) – enforce parameter bounds based on control file information. options are ‘reset’, ‘drop’ or None. Default is None

  • how (str) – type of distribution to draw from. Must be in [“gaussian”,”uniform”] default is “gaussian”.

Example

>>>import pyemu

>>>mc = pyemu.MonteCarlo(pst="pest.pst")

>>>mc.draw(1000)

get_nsing(epsilon=0.0001)

get the number of solution space dimensions given a ratio between the largest and smallest singular values

Parameters:

epsilon (float) – singular value ratio

Returns:

nsing – number of singular components above the epsilon ratio threshold

Return type:

float

Note

If nsing == nadj_par, then None is returned

get_null_proj(nsing=None)

get a null-space projection matrix of XTQX

Parameters:

nsing (int) – optional number of singular components to use If Nonte, then nsing is determined from call to MonteCarlo.get_nsing()

Returns:

v2_proj – the null-space projection matrix (V2V2^T)

Return type:

pyemu.Matrix

property num_reals

get the number of realizations in the parameter ensemble

Returns:

num_real

Return type:

int

project_parensemble(par_file=None, nsing=None, inplace=True, enforce_bounds='reset')

perform the null-space projection operations for null-space monte carlo

Parameters:
  • par_file (str) – an optional file of parameter values to use

  • nsing (int) – number of singular values to in forming null subspace matrix

  • inplace (bool) – overwrite the existing parameter ensemble with the projected values

  • enforce_bounds (str) – how to enforce parameter bounds. can be None, ‘reset’, or ‘drop’. Default is None

Returns:

par_en – if inplace is False, otherwise None

Return type:

pyemu.ParameterEnsemble

Note

to use this method, the MonteCarlo instance must have been constructed with the jco argument.

Example

>>>import pyemu

>>>mc = pyemu.MonteCarlo(jco="pest.jcb")

>>>mc.draw(1000)

>>>mc.project_parensemble(par_file="final.par",nsing=100)

write_psts(prefix, existing_jco=None, noptmax=None)
write parameter and optionally observation realizations

to a series of pest control files

Parameters:
  • prefix (str) – pest control file prefix

  • existing_jco (str) – filename of an existing jacobian matrix to add to the pest++ options in the control file. This is useful for NSMC since this jco can be used to get the first set of parameter upgrades for free! Needs to be the path the jco file as seen from the location where pest++ will be run

  • noptmax (int) – value of NOPTMAX to set in new pest control files

Example

>>>import pyemu

>>>mc = pyemu.MonteCarlo(jco="pest.jcb")

>>>mc.draw(1000, obs=True)

>>>mc.write_psts("mc_", existing_jco="pest.jcb", noptmax=1)

pyemu.pyemu_warnings module

exception pyemu.pyemu_warnings.PyemuWarning

Bases: Warning

pyemu.pyemu_warnings.warning_on_one_line(message, category, filename, lineno, file=None, line=None)

pyemu.sc module

module for FOSM-based uncertainty analysis using a linearized form of Bayes equation known as the Schur compliment

class pyemu.sc.Schur(jco, **kwargs)

Bases: LinearAnalysis

FOSM-based uncertainty and data-worth analysis

Parameters:
  • jco (varies, optional) – something that can be cast or loaded into a pyemu.Jco. Can be a str for a filename or pyemu.Matrix/pyemu.Jco object.

  • pst (varies, optional) – something that can be cast into a pyemu.Pst. Can be an str for a filename or an existing pyemu.Pst. If None, a pst filename is sought with the same base name as the jco argument (if passed)

  • parcov (varies, optional) – prior parameter covariance matrix. If str, a filename is assumed and the prior parameter covariance matrix is loaded from a file using the file extension (“.jcb”/”.jco” for binary, “.cov”/”.mat” for PEST-style ASCII matrix, or “.unc” for uncertainty files). If None, the prior parameter covariance matrix is constructed from the parameter bounds in LinearAnalysis.pst. Can also be a pyemu.Cov instance

  • obscov (varies, optional) – observation noise covariance matrix. If str, a filename is assumed and the noise covariance matrix is loaded from a file using the file extension (“.jcb”/”.jco” for binary, “.cov”/”.mat” for PEST-style ASCII matrix, or “.unc” for uncertainty files). If None, the noise covariance matrix is constructed from the obsevation weights in LinearAnalysis.pst. Can also be a pyemu.Cov instance

  • forecasts (varies, optional) – forecast sensitivity vectors. If str, first an observation name is assumed (a row in LinearAnalysis.jco). If that is not found, a filename is assumed and predictions are loaded from a file using the file extension. If [str], a list of observation names is assumed. Can also be a pyemu.Matrix instance, a numpy.ndarray or a collection. Note if the PEST++ option “++forecasts()” is set in the pest control file (under the pyemu.Pst.pestpp_options dictionary), then there is no need to pass this argument (unless you want to analyze different forecasts) of pyemu.Matrix or numpy.ndarray.

  • ref_var (float, optional) – reference variance. Default is 1.0

  • verbose (bool) – controls screen output. If str, a filename is assumed and and log file is written.

  • sigma_range (float, optional) – defines range of upper bound - lower bound in terms of standard deviation (sigma). For example, if sigma_range = 4, the bounds represent 4 * sigma. Default is 4.0, representing approximately 95% confidence of implied normal distribution. This arg is only used if constructing parcov from parameter bounds.

  • scale_offset (bool, optional) – flag to apply parameter scale and offset to parameter bounds when calculating prior parameter covariance matrix from bounds. This arg is onlyused if constructing parcov from parameter bounds.Default is True.

Note

This class is the primary entry point for FOSM-based uncertainty and dataworth analyses

This class replicates and extends the behavior of the PEST PREDUNC utilities.

Example:

#assumes "my.pst" exists
sc = pyemu.Schur(jco="my.jco",forecasts=["fore1","fore2"])
print(sc.get_forecast_summary())
print(sc.get_parameter_contribution())
get_added_obs_group_importance(reset_zero_weight=1.0)
A dataworth method to analyze the posterior uncertainty as a result of gaining

existing observations, tested by observation groups

Parameters:

reset_zero_weight (float, optional) – a flag to reset observations with zero weight in obslist_dict If reset_zero_weights passed as 0.0, no weights adjustments are made. Default is 1.0.

Returns:

A dataframe with index of observation group names and columns of forecast names. The values in the dataframe are the posterior variances of the forecasts resulting from gaining the information contained in each observation group. One row in the dataframe is labeled “base” - this is posterior forecast variance resulting from the notional calibration with the non-zero-weighed observations in Schur.pst. Conceptually, the forecast variance should either not change or decrease as a result of gaining new observations. The magnitude of the decrease represents the worth of the potential new observation(s) in each group being tested.

Return type:

pandas.DataFrame

Note

Observations in Schur.pst with zero weights are not included in the analysis unless reset_zero_weight is a float greater than zero. In most cases, users will want to reset zero-weighted observations as part dataworth testing process.

Example:

sc = pyemu.Schur("my.jco")
df = sc.get_added_obs_group_importance(reset_zero_weight=1.0)
get_added_obs_importance(obslist_dict=None, base_obslist=None, reset_zero_weight=1.0)
A dataworth method to analyze the posterior uncertainty as a result of gathering

some additional observations

Parameters:
  • obslist_dict (dict, optional) – a nested dictionary-list of groups of observations that are to be treated as gained/collected. key values become row labels in returned dataframe. If None, then every zero-weighted observation is tested sequentially. Default is None

  • base_obslist ([str], optional) – observation names to treat as the “existing” observations. The values of obslist_dict will be added to this list during each test. If None, then the values in each obslist_dict entry will be treated as the entire calibration dataset. That is, there are no existing observations. Default is None. Standard practice would be to pass this argument as pyemu.Schur.pst.nnz_obs_names so that existing, non-zero-weighted observations are accounted for in evaluating the worth of new yet-to-be-collected observations.

  • reset_zero_weight (float, optional) – a flag to reset observations with zero weight in obslist_dict If reset_zero_weights passed as 0.0, no weights adjustments are made. Default is 1.0.

Returns:

a dataframe with row labels (index) of obslist_dict.keys() and columns of forecast names. The values in the dataframe are the posterior variance of the forecasts resulting from notional inversion using the observations in obslist_dict[key value] plus the observations in base_obslist (if any). One row in the dataframe is labeled “base” - this is posterior forecast variance resulting from the notional calibration with the observations in base_obslist (if base_obslist is None, then the “base” row is the prior forecast variance). Conceptually, the forecast variance should either not change or decrease as a result of gaining additional observations. The magnitude of the decrease represents the worth of the potential new observation(s) being tested.

Return type:

pandas.DataFrame

Note

Observations listed in base_obslist is required to only include observations with weight not equal to zero. If zero-weighted observations are in base_obslist an exception will be thrown. In most cases, users will want to reset zero-weighted observations as part dataworth testing process. If reset_zero_weights == 0, no weights adjustments will be made - this is most appropriate if different weights are assigned to the added observation values in Schur.pst

Example:

sc = pyemu.Schur("my.jco")
obslist_dict = {"hds":["head1","head2"],"flux":["flux1","flux2"]}
df = sc.get_added_obs_importance(obslist_dict=obslist_dict,
                                 base_obslist=sc.pst.nnz_obs_names,
                                 reset_zero_weight=1.0)
get_conditional_instance(parameter_names)

get a new pyemu.Schur instance that includes conditional update from some parameters becoming known perfectly

Parameters:

parameter_names ([str]) – list of parameters that are to be treated as notionally perfectly known

Returns:

a new Schur instance conditional on perfect knowledge of some parameters. The new instance has an updated parcov that is less the names listed in parameter_names.

Return type:

pyemu.Schur

Note

This method is primarily for use by the LinearAnalysis.get_parameter_contribution() dataworth method.

get_forecast_summary()

summary of the FOSM-based forecast uncertainty (variance) estimate(s)

Returns:

dataframe of prior,posterior variances and percent uncertainty reduction of each forecast (e.g. prediction)

Return type:

pandas.DataFrame

Note

This is the primary entry point for accessing forecast uncertainty estimates “precent_reduction” column in dataframe is calculated as 100.0 * (1.0 - (posterior variance / prior variance)

Example:

sc = pyemu.Schur(jco="my.jcb",forecasts=["fore1","fore2"])
df = sc.get_parameter_summary()
df.loc[:,["prior","posterior"]].plot(kind="bar")
plt.show()
df.percent_reduction.plot(kind="bar")
plt.show()
get_obs_group_dict()

get a dictionary of observations grouped by observation group name

Returns:

a dictionary of observations grouped by observation group name. Useful for dataworth processing in pyemu.Schur

Return type:

dict

Note

only includes observations that are listed in Schur.jco.row_names

Example:

sc = pyemu.Schur("my.jco")
obsgrp_dict = sc.get_obs_group_dict()
df = sc.get_removed_obs_importance(obsgrp_dict=obsgrp_dict)
get_par_contribution(parlist_dict=None, include_prior_results=False)

A dataworth method to get a dataframe the prior and posterior uncertainty reduction as a result of some parameter becoming perfectly known

Parameters:
  • parlist_dict – (dict, optional): a nested dictionary-list of groups of parameters that are to be treated as perfectly known. key values become row labels in returned dataframe. If None, each adjustable parameter is sequentially treated as known and the returned dataframe has row labels for each adjustable parameter

  • include_prior_results (bool, optional) – flag to return a multi-indexed dataframe with both conditional prior and posterior forecast uncertainty estimates. This is because the notional learning about parameters potentially effects both the prior and posterior forecast uncertainty estimates. If False, only posterior results are returned. Default is False

Returns:

a dataframe that summarizes the parameter contribution dataworth analysis. The dataframe has index (row labels) of the keys in parlist_dict and a column labels of forecast names. The values in the dataframe are the posterior variance of the forecast conditional on perfect knowledge of the parameters in the values of parlist_dict. One row in the dataframe will be labeled base - this is the forecast uncertainty estimates that include the effects of all adjustable parameters. Percent decreases in forecast uncertainty can be calculated by differencing all rows against the “base” row. Varies depending on include_prior_results.

Return type:

pandas.DataFrame

Note

This is the primary dataworth method for assessing the contribution of one or more parameters to forecast uncertainty.

Example:

sc = pyemu.Schur(jco="my.jco")
parlist_dict = {"hk":["hk1","hk2"],"rech"["rech1","rech2"]}
df = sc.get_par_contribution(parlist_dict=parlist_dict)
get_par_group_contribution(include_prior_results=False)

A dataworth method to get the forecast uncertainty contribution from each parameter group

Parameters:

include_prior_results (bool, optional) – flag to return a multi-indexed dataframe with both conditional prior and posterior forecast uncertainty estimates. This is because the notional learning about parameters potentially effects both the prior and posterior forecast uncertainty estimates. If False, only posterior results are returned. Default is False

Returns:

a dataframe that summarizes the parameter contribution analysis. The dataframe has index (row labels) that are the parameter group names and a column labels of forecast names. The values in the dataframe are the posterior variance of the forecast conditional on perfect knowledge of the adjustable parameters in each parameter group. One row is labelled “base” - this is the variance of the forecasts that includes the effects of all adjustable parameters. Varies depending on include_prior_results.

Return type:

pandas.DataFrame

Note

This method is just a thin wrapper around get_contribution_dataframe() - this method automatically constructs the parlist_dict argument where the keys are the group names and the values are the adjustable parameters in the groups

Example:

sc = pyemu.Schur(jco="my.jco")
df = sc.get_par_group_contribution()
get_parameter_summary()

summary of the FOSM-based parameter uncertainty (variance) estimate(s)

Returns:

dataframe of prior,posterior variances and percent uncertainty reduction of each parameter

Return type:

pandas.DataFrame

Note

This is the primary entry point for accessing parameter uncertainty estimates

The “Prior” column in dataframe is the diagonal of LinearAnalysis.parcov “precent_reduction” column in dataframe is calculated as 100.0 * (1.0 - (posterior variance / prior variance)

Example:

sc = pyemu.Schur(jco="my.jcb",forecasts=["fore1","fore2"])
df = sc.get_parameter_summary()
df.loc[:,["prior","posterior"]].plot(kind="bar")
plt.show()
df.percent_reduction.plot(kind="bar")
plt.show()
get_removed_obs_group_importance(reset_zero_weight=None)
A dataworth method to analyze the posterior uncertainty as a result of losing

existing observations, tested by observation groups

Parameters:

DEPRECATED (reset_zero_weight) –

Returns:

A dataframe with index of observation group names and columns of forecast names. The values in the dataframe are the posterior variances of the forecasts resulting from losing the information contained in each observation group. One row in the dataframe is labeled “base” - this is posterior forecast variance resulting from the notional calibration with the non-zero-weighed observations in Schur.pst. Conceptually, the forecast variance should either not change or increase as a result of losing existing observations. The magnitude of the increase represents the worth of the existing observation(s) in each group being tested.

Return type:

pandas.DataFrame

Example:

sc = pyemu.Schur("my.jco")
df = sc.get_removed_obs_group_importance()
get_removed_obs_importance(obslist_dict=None, reset_zero_weight=None)
A dataworth method to analyze the posterior uncertainty as a result of losing

some existing observations

Parameters:
  • obslist_dict (dict, optional) – a nested dictionary-list of groups of observations that are to be treated as lost. key values become row labels in returned dataframe. If None, then every zero-weighted observation is tested sequentially. Default is None

  • DEPRECATED (reset_zero_weight) –

Returns:

A dataframe with index of obslist_dict.keys() and columns of forecast names. The values in the dataframe are the posterior variances of the forecasts resulting from losing the information contained in obslist_dict[key value]. One row in the dataframe is labeled “base” - this is posterior forecast variance resulting from the notional calibration with the non-zero-weighed observations in Schur.pst. Conceptually, the forecast variance should either not change or increase as a result of losing existing observations. The magnitude of the increase represents the worth of the existing observation(s) being tested.

Note: All observations that may be evaluated as removed must have non-zero weight

Return type:

pandas.DataFrame

Example:

sc = pyemu.Schur("my.jco")
df = sc.get_removed_obs_importance()
next_most_important_added_obs(forecast=None, niter=3, obslist_dict=None, base_obslist=None, reset_zero_weight=1.0)

find the most important observation(s) by sequentially evaluating the importance of the observations in obslist_dict.

Parameters:
  • forecast (str, optional) – name of the forecast to use in the ranking process. If more than one forecast has been listed, this argument is required. This is because the data worth must be ranked with respect to the variance reduction for a single forecast

  • niter (int, optional) – number of sequential dataworth testing iterations. Default is 3

  • obslist_dict (dict, optional) – a nested dictionary-list of groups of observations that are to be treated as gained/collected. key values become row labels in returned dataframe. If None, then every zero-weighted observation is tested sequentially. Default is None

  • base_obslist ([str], optional) – observation names to treat as the “existing” observations. The values of obslist_dict will be added to this list during each test. If None, then the values in each obslist_dict entry will be treated as the entire calibration dataset. That is, there are no existing observations. Default is None. Standard practice would be to pass this argument as pyemu.Schur.pst.nnz_obs_names so that existing, non-zero-weighted observations are accounted for in evaluating the worth of new yet-to-be-collected observations.

  • reset_zero_weight (float, optional) – a flag to reset observations with zero weight in obslist_dict If reset_zero_weights passed as 0.0, no weights adjustments are made. Default is 1.0.

Returns:

a dataFrame with columns of obslist_dict key for each iteration the yields the largest variance reduction for the named forecast. Columns are forecast variance percent reduction for each iteration (percent reduction compared to initial “base” case with all non-zero weighted observations included in the notional calibration)

Return type:

pandas.DataFrame

Note

The most important observations from each iteration is added to base_obslist and removed obslist_dict for the next iteration. In this way, the added observation importance values include the conditional information from the last iteration.

Example:

sc = pyemu.Schur(jco="my.jco")
df = sc.next_most_important_added_obs(forecast="fore1",base_obslist=sc.pst.nnz_obs_names)
next_most_par_contribution(niter=3, forecast=None, parlist_dict=None)

find the parameter(s) contributing most to posterior forecast by sequentially evaluating the contribution of parameters in parlist_dict.

Parameters:
  • forecast (str, optional) – name of the forecast to use in the ranking process. If more than one forecast has been listed, this argument is required. This is because the data worth must be ranked with respect to the variance reduction for a single forecast

  • niter (int, optional) – number of sequential dataworth testing iterations. Default is 3

  • parlist_dict (dict, optional) – dict a nested dictionary-list of groups of parameters that are to be treated as perfectly known. key values become row labels in dataframe

  • parlist_dict – a nested dictionary-list of groups of parameters that are to be treated as perfectly known (zero uncertainty). key values become row labels in returned dataframe. If None, then every adustable parameter is tested sequentially. Default is None. Conceptually, the forecast variance should either not change or decrease as a result of knowing parameter perfectly. The magnitude of the decrease represents the worth of gathering information about the parameter(s) being tested.

Note

The largest contributing parameters from each iteration are treated as known perfectly for the remaining iterations. In this way, the next iteration seeks the next most influential group of parameters.

Returns:

a dataframe with index of iteration number and columns of parlist_dict.keys(). The values are the results of the knowing each parlist_dict entry expressed as posterior variance reduction

Return type:

pandas.DataFrame

property posterior_forecast

posterior forecast (e.g. prediction) variance(s)

Returns:

dictionary of forecast names and FOSM-estimated posterior variances

Return type:

dict

Note

Sames as LinearAnalysis.posterior_prediction

See Schur.get_forecast_summary() for a dataframe-based container of prior and posterior variances

property posterior_parameter

posterior parameter covariance matrix.

Returns:

the posterior parameter covariance matrix

Return type:

pyemu.Cov

Example:

sc = pyemu.Schur(jco="my.jcb")
post_cov = sc.posterior_parameter
post_cov.to_ascii("post.cov")
property posterior_prediction

posterior prediction (e.g. forecast) variance estimate(s)

Returns:

dictionary of forecast names and FOSM-estimated posterior variances

Return type:

dict

Note:

sames as LinearAnalysis.posterior_forecast

See Schur.get_forecast_summary() for a dataframe-based container of prior and posterior variances

Module contents

pyEMU: python modules for Environmental Model Uncertainty analyses. These modules are designed to work directly and seamlessly with PEST and PEST++ model independent interface. pyEMU can also be used to setup this interface.

Several forms of uncertainty analyses are support including FOSM-based analyses (pyemu.Schur and pyemu.ErrVar), data worth analyses and high-dimensional ensemble generation.

class pyemu.Cov(x=None, names=[], row_names=[], col_names=[], isdiagonal=False, autoalign=True)

Bases: Matrix

Diagonal and/or dense Covariance matrices

Parameters:
  • x (numpy.ndarray) – numeric values

  • names ([str]) – list of row and column names

  • isdigonal (bool) – flag if the Matrix is diagonal

  • autoalign (bool) – flag to control the autoalignment of Matrix during linear algebra operations

Example:

data = np.random.random((10,10))
names = ["par_{0}".format(i) for i in range(10)]
mat = pyemu.Cov(x=data,names=names)
mat.to_binary("mat.jco")

Note

row_names and col_names args are supported in the contructor so support inheritance. However, users should only pass names

condition_on(conditioning_elements)

get a new Covariance object that is conditional on knowing some elements. uses Schur’s complement for conditional Covariance propagation

Parameters:

conditioning_elements (['str']) – list of names of elements to condition on

Returns:

new conditional Cov that assumes conditioning_elements have become known

Return type:

Cov

Example:

prior_cov = pyemu.Cov.from_parameter_data(pst)
now_known_pars = pst.adj_par_names[:5]
post_cov = prior_cov.condition_on(now_known_pars)
classmethod from_observation_data(pst)

instantiates a Cov from pyemu.Pst.observation_data

Parameters:

pst (pyemu.Pst) – control file instance

Returns:

a diagonal observation noise covariance matrix derived from the weights in the pest control file. Zero-weighted observations are included with a weight of 1.0e-30

Return type:

Cov

Example:

obscov = pyemu.Cov.from_observation_data(pst)
classmethod from_obsweights(pst_file)

instantiates a Cov instance from observation weights in a PEST control file.

Parameters:

pst_file (str) – pest control file name

Returns:

a diagonal observation noise covariance matrix derived from the weights in the pest control file. Zero-weighted observations are included with a weight of 1.0e-30

Return type:

Cov

Note

Calls Cov.from_observation_data()

Example:

obscov = pyemu.Cov.from_obsweights("my.pst")
classmethod from_parameter_data(pst, sigma_range=4.0, scale_offset=True, subset=None)

Instantiates a Cov from a pest control file parameter data section using parameter bounds as a proxy for uncertainty.

Parameters:
  • pst_file (str) – pest control file name

  • sigma_range (float) – defines range of upper bound - lower bound in terms of standard deviation (sigma). For example, if sigma_range = 4, the bounds represent 4 * sigma. Default is 4.0, representing approximately 95% confidence of implied normal distribution

  • scale_offset (bool) – flag to apply scale and offset to parameter upper and lower bounds before calculating varaince. In some cases, not applying scale and offset can result in undefined (log) variance. Default is True.

  • subset (list-like, optional) – Subset of parameters to draw

Returns:

diagonal parameter Cov matrix created from parameter bounds

Return type:

Cov

Note

Calls Cov.from_parameter_data()

classmethod from_parbounds(pst_file, sigma_range=4.0, scale_offset=True)

Instantiates a Cov from a pest control file parameter data section using parameter bounds as a proxy for uncertainty.

Parameters:
  • pst_file (str) – pest control file name

  • sigma_range (float) – defines range of upper bound - lower bound in terms of standard deviation (sigma). For example, if sigma_range = 4, the bounds represent 4 * sigma. Default is 4.0, representing approximately 95% confidence of implied normal distribution

  • scale_offset (bool) – flag to apply scale and offset to parameter upper and lower bounds before calculating varaince. In some cases, not applying scale and offset can result in undefined (log) variance. Default is True.

Returns:

diagonal parameter Cov matrix created from parameter bounds

Return type:

Cov

Note

Calls Cov.from_parameter_data()

classmethod from_uncfile(filename, pst=None)

instaniates a Cov from a PEST-compatible uncertainty file

Parameters:
  • filename (str) – uncertainty file name

  • pst ('pyemu.Pst`) – a control file instance. this is needed if “first_parameter” and “last_parameter” keywords. Default is None

Returns:

Cov instance from uncertainty file

Return type:

Cov

Example:

cov = pyemu.Cov.from_uncfile("my.unc")
property identity

get an identity Cov of the same shape

Returns:

new Cov instance with identity matrix

Return type:

Cov

Note

the returned identity matrix has the same row-col names as self

classmethod identity_like(other)

Get an identity matrix Cov instance like other Cov

Parameters:

other (Matrix) – other matrix - must be square

Returns:

new identity matrix Cov with shape of other

Return type:

Cov

Note

the returned identity cov matrix is treated as non-diagonal

property names

wrapper for getting row_names. row_names == col_names for Cov

Returns:

list of names

Return type:

[str]

replace(other)

replace elements in the covariance matrix with elements from other. if other is not diagonal, then this Cov becomes non diagonal

Parameters:

Cov – the Cov to replace elements in this Cov with

Note

operates in place. Other must have the same row-col names as self

to_pearson()

Convert Cov instance to Pearson correlation coefficient matrix

Returns:

A Matrix of correlation coefs. Return type is Matrix on purpose so that it is clear the returned instance is not a Cov

Return type:

Matrix

Example:

# plot the posterior parameter correlation matrix
import matplotlib.pyplot as plt
cov = pyemu.Cov.from_ascii("pest.post.cov")
cc = cov.to_pearson()
cc.x[cc.x==1.0] = np.NaN
plt.imshow(cc)
to_uncfile(unc_file, covmat_file='cov.mat', var_mult=1.0, include_path=False)

write a PEST-compatible uncertainty file

Parameters:
  • unc_file (str) – filename of the uncertainty file

  • covmat_file (str) – covariance matrix filename. Default is “Cov.mat”. If None, and Cov.isdiaonal, then a standard deviation form of the uncertainty file is written. Exception raised if covmat_file is None and not Cov.isdiagonal

  • var_mult (float) – variance multiplier for the covmat_file entry

  • include_path (bool) – flag to include the path of unc_file in the name of covmat_file. Default is False - not sure why you would ever make this True…

Example:

cov = pyemu.Cov.from_parameter_data(pst)
cov.to_uncfile("my.unc")
property zero

get a diagonal instance of Cov with all zeros on the diagonal

Returns:

new Cov instance with zeros

Return type:

Cov

class pyemu.Ensemble(pst, df, istransformed=False)

Bases: object

based class for handling ensembles of numeric values

Parameters:
  • pst (pyemu.Pst) – a control file instance

  • df (pandas.DataFrame) – a pandas dataframe. Columns should be parameter/observation names. Index is treated as realization names

  • istransformed (bool) – flag to indicate parameter values are in log space. Not used for ObservationEnsemble

Example:

pst = pyemu.Pst("my.pst")
pe = pyemu.ParameterEnsemble.from_gaussian_draw(pst)
as_pyemu_matrix(typ=None)

get a pyemu.Matrix instance of Ensemble

Parameters:

typ (pyemu.Matrix or pyemu.Cov) – the type of matrix to return. Default is pyemu.Matrix

Returns:

a matrix instance

Return type:

pyemu.Matrix

Example:

oe = pyemu.ObservationEnsemble.from_gaussian_draw(pst=pst,num_reals=100)
dev_mat = oe.get_deviations().as_pyemu_matrix(typ=pyemu.Cov)
obscov = dev_mat.T * dev_mat
back_transform()

back transform parameters with respect to partrans value.

Note

operates in place (None is returned).

Parameter transform is only related to log_{10} and does not include the effects of scale and/or offset

Ensemble.back_transform() is only provided for inheritance purposes. It only changes the `Ensemble._transformed flag

copy()

get a copy of Ensemble

Returns:

copy of this Ensemble

Return type:

Ensemble

Note

copies both Ensemble.pst and Ensemble._df: can be expensive

covariance_matrix(localizer=None, center_on=None)

get a empirical covariance matrix implied by the correlations between realizations

Parameters:
  • localizer (pyemu.Matrix, optional) – a matrix to localize covariates in the resulting covariance matrix. Default is None

  • center_on (str, optional) – a realization name to use as the centering point in ensemble space. If None, the mean vector is treated as the centering point. Default is None

Returns:

the empirical (and optionally localized) covariance matrix

Return type:

pyemu.Cov

dropna(*args, **kwargs)

override of pandas.DataFrame.dropna()

Parameters:
  • ([object] (*args) – positional arguments to pass to pandas.DataFrame.dropna().

  • ({str (**kwargs) – object}): keyword arguments to pass to pandas.DataFrame.dropna().

classmethod from_binary(pst, filename)

create an Ensemble from a PEST-style binary file

Parameters:
  • pst (pyemu.Pst) – a control file instance

  • filename (str) – filename containing binary ensemble

Returns:

the ensemble loaded from the binary file

Return type:

Ensemble

Example:

pst = pyemu.Pst("my.pst")
oe = pyemu.ObservationEnsemble.from_binary("obs.jcb")
classmethod from_csv(pst, filename, *args, **kwargs)

create an Ensemble from a CSV file

Parameters:
  • pst (pyemu.Pst) – a control file instance

  • filename (str) – filename containing CSV ensemble

  • ([object] (*args) – positional arguments to pass to pandas.read_csv().

  • ({str (**kwargs) – object}): keyword arguments to pass to pandas.read_csv().

Returns:

Ensemble

Note

uses pandas.read_csv() to load numeric values from CSV file

Example:

pst = pyemu.Pst("my.pst")
oe = pyemu.ObservationEnsemble.from_csv("obs.csv")
classmethod from_dataframe(pst, df, istransformed=False)
get_deviations(center_on=None)

get the deviations of the realizations around a certain point in ensemble space

Parameters:

center_on (str, optional) – a realization name to use as the centering point in ensemble space. If None, the mean vector is treated as the centering point. Default is None

Returns:

an ensemble of deviations around the centering point

Return type:

Ensemble

Note

deviations are the Euclidean distances from the center_on value to realized values for each column

center_on=None yields the classic ensemble smoother/ensemble Kalman filter deviations from the mean vector

Deviations respect log-transformation status.

Example:

pst = pyemu.Pst("my.pst")
oe = pyemu.ObservationEnsemble.from_gaussian_draw(pst)
oe.add_base()
oe_dev = oe.get_deviations(center_on="base")
oe.to_csv("obs_base_devs.csv")
property istransformed

the parameter transformation status

Returns:

flag to indicate whether or not the ParameterEnsemble is transformed with respect to log_{10}. Not used for (and has no effect on) ObservationEnsemble.

Return type:

bool

Note

parameter transformation status is only related to log_{10} and does not include the effects of scale and/or offset

plot(bins=10, facecolor='0.5', plot_cols=None, filename='ensemble.pdf', func_dict=None, **kwargs)

plot ensemble histograms to multipage pdf

Parameters:
  • bins (int) – number of bins for the histograms

  • facecolor (str) – matplotlib color (e.g. r,`g`, etc)

  • plot_cols ([str]) – list of subset of ensemble columns to plot. If None, all are plotted. Default is None

  • filename (str) – multipage pdf filename. Default is “ensemble.pdf”

  • func_dict (dict) – a dict of functions to apply to specific columns. For example: {“par1”: np.log10}

  • **kwargs (dict) – addkeyword args to pass to pyemu.plot_utils.ensemble_helper()

Example:

pst = pyemu.Pst("my.pst")
pe = pyemu.ParameterEnsemble.from_gaussian_draw(pst)
pe.transform() # plot log space (if needed)
pe.plot(bins=30)
pst

control file instance

Type:

pyemu.Pst

static reseed()

reset the numpy.random.seed

Note

reseeds using the pyemu.en.SEED global variable

The pyemu.en.SEED value is set as the numpy.random.seed on import, so make sure you know what you are doing if you call this method…

to_binary(filename)

write Ensemble to a PEST-style binary file

Parameters:

filename (str) – file to write

Example:

pst = pyemu.Pst("my.pst")
oe = pyemu.ObservationEnsemble.from_gaussian_draw(pst)
oe.to_binary("obs.jcb")

Note

back transforms ParameterEnsemble before writing so that values are in arithmetic space

to_csv(filename, *args, **kwargs)

write Ensemble to a CSV file

Parameters:
  • filename (str) – file to write

  • ([object] (*args) – positional arguments to pass to pandas.DataFrame.to_csv().

  • ({str (**kwargs) – object}): keyword arguments to pass to pandas.DataFrame.to_csv().

Example:

pst = pyemu.Pst("my.pst")
oe = pyemu.ObservationEnsemble.from_gaussian_draw(pst)
oe.to_csv("obs.csv")

Note

back transforms ParameterEnsemble before writing so that values are in arithmetic space

to_dense(filename)

write Ensemble to a dense-format binary file

Parameters:

filename (str) – file to write

Example:

pst = pyemu.Pst("my.pst")
oe = pyemu.ObservationEnsemble.from_gaussian_draw(pst)
oe.to_dense("obs.bin")

Note

back transforms ParameterEnsemble before writing so that values are in arithmatic space

transform()

transform parameters with respect to partrans value.

Note

operates in place (None is returned).

Parameter transform is only related to log_{10} and does not include the effects of scale and/or offset

Ensemble.transform() is only provided for inheritance purposes. It only changes the `Ensemble._transformed flag

class pyemu.ErrVar(jco, **kwargs)

Bases: LinearAnalysis

FOSM-based error variance analysis

Parameters:
  • jco (varies, optional) – something that can be cast or loaded into a pyemu.Jco. Can be a str for a filename or pyemu.Matrix/pyemu.Jco object.

  • pst (varies, optional) – something that can be cast into a pyemu.Pst. Can be an str for a filename or an existing pyemu.Pst. If None, a pst filename is sought with the same base name as the jco argument (if passed)

  • parcov (varies, optional) – prior parameter covariance matrix. If str, a filename is assumed and the prior parameter covariance matrix is loaded from a file using the file extension (“.jcb”/”.jco” for binary, “.cov”/”.mat” for PEST-style ASCII matrix, or “.unc” for uncertainty files). If None, the prior parameter covariance matrix is constructed from the parameter bounds in LinearAnalysis.pst. Can also be a pyemu.Cov instance

  • obscov (varies, optional) – observation noise covariance matrix. If str, a filename is assumed and the noise covariance matrix is loaded from a file using the file extension (“.jcb”/”.jco” for binary, “.cov”/”.mat” for PEST-style ASCII matrix, or “.unc” for uncertainty files). If None, the noise covariance matrix is constructed from the obsevation weights in LinearAnalysis.pst. Can also be a pyemu.Cov instance

  • forecasts (varies, optional) – forecast sensitivity vectors. If str, first an observation name is assumed (a row in LinearAnalysis.jco). If that is not found, a filename is assumed and predictions are loaded from a file using the file extension. If [str], a list of observation names is assumed. Can also be a pyemu.Matrix instance, a numpy.ndarray or a collection. Note if the PEST++ option “++forecasts()” is set in the pest control file (under the pyemu.Pst.pestpp_options dictionary), then there is no need to pass this argument (unless you want to analyze different forecasts) of pyemu.Matrix or numpy.ndarray.

  • ref_var (float, optional) – reference variance. Default is 1.0

  • verbose (bool) – controls screen output. If str, a filename is assumed and and log file is written.

  • sigma_range (float, optional) – defines range of upper bound - lower bound in terms of standard deviation (sigma). For example, if sigma_range = 4, the bounds represent 4 * sigma. Default is 4.0, representing approximately 95% confidence of implied normal distribution. This arg is only used if constructing parcov from parameter bounds.

  • scale_offset (bool, optional) – flag to apply parameter scale and offset to parameter bounds when calculating prior parameter covariance matrix from bounds. This arg is onlyused if constructing parcov from parameter bounds.Default is True.

  • omitted_parameters ([str]) – list of parameters to treat as “omitted”. Passing this argument activates 3-term error variance analysis.

  • omitted_parcov (varies) – an argument that can be cast to a parcov for the omitted parameters. If None, omitted_parcov will be formed by extracting a sub-matrix from the LinearAnalsis.parcov attribute.

  • omitted_predictions (varies) – an argument that can be cast to a “predictions” (e.g. “forecasts”) attribute to form prediction sensitivity vectors with respec to the omitted parameters. If None, these vectors will be extracted from the pyemu.LinearAnalysis.predictions attribute

  • kl (bool, optional) – flag to perform Karhunen-Loeve scaling on the jacobian before error variance calculations. If True, the pyemu.ErrVar.jco and pyemu.ErrVar.parcov are altered in place. Default is False.

Example:

ev = pyemu.ErrVar(jco="my.jco",omitted_parameters=["wel1","wel2"])
df = ev.get_errvar_dataframe()
G(singular_value)

get the parameter solution Matrix at a given singular value

Parameters:

singular_value (int) – singular value to calc G at

Returns:

parameter solution matrix (V_1 * S_1^(_1) * U_1^T) at singular_value

Return type:

pyemu.Matrix

I_minus_R(singular_value)

get I - R at a given singular value

Parameters:

singular_value (int) – singular value to calculate I - R at

Returns:

identity matrix minus resolution matrix at singular_value

Return type:

pyemu.Matrix

R(singular_value)

get resolution Matrix (V_1 * V_1^T) at a given singular value

Args: singular_value (int): singular value to calculate R at

Returns:

resolution matrix at singular_value

Return type:

pyemu.Matrix

first_forecast(singular_value)
get the null space term (first term) contribution to forecast (e.g. prediction)

error variance at a given singular value.

Parameters:

singular_value (int) – singular value to calc first term at

Note

This method is used to construct the error variance dataframe

Just a wrapper around ErrVar.first_forecast

Returns:

dictionary of (“first”,prediction_names),error variance pairs at singular_value

Return type:

dict

first_parameter(singular_value)
get the null space term (first term) contribution to parameter error variance

at a given singular value

Parameters:

singular_value (int) – singular value to calc first term at

Returns:

first term contribution to parameter error variance

Return type:

pyemu.Cov

first_prediction(singular_value)
get the null space term (first term) contribution to prediction error variance

at a given singular value.

Parameters:

singular_value (int) – singular value to calc first term at

Note

This method is used to construct the error variance dataframe

Returns:

dictionary of (“first”,prediction_names),error variance pairs at singular_value

Return type:

dict

get_errvar_dataframe(singular_values=None)

primary entry point for error variance analysis.

Parameters:

singular_values ([int], optional) – a list singular values to test. If None, defaults to range(0,min(nnz_obs,nadj_par) + 1).

Returns:

a multi-indexed pandas dataframe summarizing each of the error variance terms for each nominated forecast. Rows are the singluar values tested, columns are a multi-index of forecast name and error variance term number (e.g. 1,2 or (optionally) 3).

Return type:

pandas.DataFrame

Example:

ev = pyemu.ErrVar(jco="my.jco",omitted_parameters=["wel1","wel2"])
df = ev.get_errvar_dataframe()
get_identifiability_dataframe(singular_value=None, precondition=False)

primary entry point for identifiability analysis

Parameters:
  • singular_value (int) – the singular spectrum truncation point. Defaults to minimum of non-zero-weighted observations and adjustable parameters

  • precondition (bool) – flag to use the preconditioned hessian with the prior parameter covariance matrix (xtqt + sigma_theta^-1). This should be used KL scaling. Default is False.

Returns:

A pandas dataframe of the right solution-space singular vectors and identifiability (identifiabiity is in the column labeled “ident”)

Return type:

pandas.DataFrame

Examples:

ev = pyemu.ErrVar(jco="my.jco")
df = ev.get_identifiability_dataframe(singular_value=20)
df.ident.plot(kind="bar")
get_null_proj(maxsing=None, eigthresh=1e-06)

get a null-space projection matrix of XTQX

Parameters:
  • maxsing (int, optional) – number of singular components to use (the truncation point). If None, pyemu.Matrx.get_maxsing() is used to determine the truncation point with `eigthresh. Default is None

  • eigthresh (float, optional) – the ratio of smallest to largest singular value to keep in the range (solution) space of XtQX. Not used if maxsing is not None. Default is 1.0e-6

Note

used for null-space monte carlo operations.

Returns:

pyemu.Matrix the null-space projection matrix (V2V2^T)

property omitted_jco

the omitted-parameters jacobian matrix

Returns:

the jacobian matrix instance of non-zero-weighted observations and omitted parameters

Return type:

pyemu.Jco

property omitted_parcov

the prior omitted-parameter covariance matrix

Returns:

the prior parameter covariance matrix of the omitted parameters

Return type:

pyemu.Cov

property omitted_predictions

omitted prediction sensitivity vectors

Returns:

a matrix of prediction sensitivity vectors (column wise) to omitted parameters

Return type:

pyemu.Matrix

second_forecast(singular_value)

get the solution space contribution to forecast (e.g. “prediction”) error variance at a given singular value

Parameters:

singular_value (int) – singular value to calc second term at

Note

This method is used to construct error variance dataframe

Just a thin wrapper around ErrVar.second_prediction

Returns:

dictionary of (“second”,prediction_names), error variance arising from the solution space contribution (y^t * G * obscov * G^T * y)

Return type:

dict

second_parameter(singular_value)
get the solution space contribution to parameter error variance

at a given singular value (G * obscov * G^T)

Parameters:

singular_value (int) – singular value to calc second term at

Returns:

the second term contribution to parameter error variance (G * obscov * G^T)

Return type:

pyemu.Cov

second_prediction(singular_value)

get the solution space contribution to predictive error variance at a given singular value

Parameters:

singular_value (int) – singular value to calc second term at

Note

This method is used to construct error variance dataframe

Returns: dict: dictionary of (“second”,prediction_names), error variance

arising from the solution space contribution (y^t * G * obscov * G^T * y)

third_forecast(singular_value)
get the omitted parameter contribution to forecast (prediction) error variance

at a given singular value.

Parameters:

singular_value (int) – singular value to calc third term at

Note

used to construct error variance dataframe just a thin wrapper around ErrVar.third_prediction()

Returns:

a dictionary of (“third”,prediction_names),error variance

Return type:

dict

third_parameter(singular_value)
get the omitted parameter contribution to parameter error variance

at a given singular value

Parameters:

singular_value (int) – singular value to calc third term at

Returns:

the third term contribution to parameter error variance calculated at singular_value (G * omitted_jco * Sigma_(omitted_pars) * omitted_jco^T * G^T). Returns 0.0 if third term calculations are not being used.

Return type:

pyemu.Cov

third_prediction(singular_value)
get the omitted parameter contribution to prediction error variance

at a given singular value.

Parameters:

singular_value (int) – singular value to calc third term at

Note

used to construct error variance dataframe

Returns:

a dictionary of (“third”,prediction_names),error variance

Return type:

dict

variance_at(singular_value)
get the error variance of all three error variance terms at a

given singluar value

Parameters:

singular_value (int) – singular value to test

Returns:

dictionary of (err var term,prediction_name), variance pairs

Return type:

dict

class pyemu.Jco(x=None, row_names=[], col_names=[], isdiagonal=False, autoalign=True)

Bases: Matrix

a thin wrapper class to get more intuitive attribute names. Functions exactly like Matrix

classmethod from_pst(pst, random=False)

construct a new empty Jco from a control file optionally filled with trash

Parameters:
  • pst (pyemu.Pst) – a pest control file instance. If type is ‘str’, pst is loaded from filename

  • random (bool) – flag for contents of the trash matrix. If True, fill with random numbers, if False, fill with zeros Default is False

Returns:

the new Jco instance

Return type:

Jco

property nobs

number of observations in the Jco

Returns:

number of observations (rows)

Return type:

int

property npar

number of parameters in the Jco

Returns:

number of parameters (columns)

Return type:

int

property obs_names

thin wrapper around Matrix.row_names

Returns:

a list of observation names

Return type:

[‘str’]

property par_names

thin wrapper around Matrix.col_names

Returns:

a list of parameter names

Return type:

[str]

class pyemu.LinearAnalysis(jco=None, pst=None, parcov=None, obscov=None, predictions=None, ref_var=1.0, verbose=False, resfile=False, forecasts=None, sigma_range=4.0, scale_offset=True, **kwargs)

Bases: object

The base/parent class for linear analysis.

Parameters:
  • jco (varies, optional) – something that can be cast or loaded into a pyemu.Jco. Can be a str for a filename or pyemu.Matrix/pyemu.Jco object.

  • pst (varies, optional) – something that can be cast into a pyemu.Pst. Can be an str for a filename or an existing pyemu.Pst. If None, a pst filename is sought with the same base name as the jco argument (if passed)

  • parcov (varies, optional) – prior parameter covariance matrix. If str, a filename is assumed and the prior parameter covariance matrix is loaded from a file using the file extension (“.jcb”/”.jco” for binary, “.cov”/”.mat” for PEST-style ASCII matrix, or “.unc” for uncertainty files). If None, the prior parameter covariance matrix is constructed from the parameter bounds in LinearAnalysis.pst. Can also be a pyemu.Cov instance

  • obscov (varies, optional) – observation noise covariance matrix. If str, a filename is assumed and the noise covariance matrix is loaded from a file using the file extension (“.jcb”/”.jco” for binary, “.cov”/”.mat” for PEST-style ASCII matrix, or “.unc” for uncertainty files). If None, the noise covariance matrix is constructed from the obsevation weights in LinearAnalysis.pst. Can also be a pyemu.Cov instance

  • forecasts (varies, optional) – forecast sensitivity vectors. If str, first an observation name is assumed (a row in LinearAnalysis.jco). If that is not found, a filename is assumed and predictions are loaded from a file using the file extension. If [str], a list of observation names is assumed. Can also be a pyemu.Matrix instance, a numpy.ndarray or a collection of pyemu.Matrix or numpy.ndarray.

  • ref_var (float, optional) – reference variance. Default is 1.0

  • verbose (bool) – controls screen output. If str, a filename is assumed and and log file is written.

  • sigma_range (float, optional) – defines range of upper bound - lower bound in terms of standard deviation (sigma). For example, if sigma_range = 4, the bounds represent 4 * sigma. Default is 4.0, representing approximately 95% confidence of implied normal distribution. This arg is only used if constructing parcov from parameter bounds.

  • scale_offset (bool, optional) – flag to apply parameter scale and offset to parameter bounds when calculating prior parameter covariance matrix from bounds. This arg is onlyused if constructing parcov from parameter bounds.Default is True.

Note

Can be used directly, but for prior uncertainty analyses only.

The derived types (pyemu.Schur, pyemu.ErrVar) are for different forms of FOMS-based posterior uncertainty analyses.

This class tries hard to not load items until they are needed; all arguments are optional.

The class makes heavy use of property decorator to encapsulated private attributes

Example:

#assumes "my.pst" exists
la = pyemu.LinearAnalysis(jco="my.jco",forecasts=["fore1","fore2"])
print(la.prior_forecast)
property adj_par_names

adjustable parameter names

Returns:

list of adjustable parameter names

Return type:

[‘str`]

Note

if LinearAnalysis.pst is None, returns LinearAnalysis.jco.col_names

adjust_obscov_resfile(resfile=None)

reset the elements of obscov by scaling the implied weights based on the phi components in res_file so that the total phi is equal to the number of non-zero weights.

Parameters:

resfile (str) – residual file to use. If None, residual file with case name is sought. default is None

Note

calls pyemu.Pst.adjust_weights_resfile()

apply_karhunen_loeve_scaling()

apply karhuene-loeve scaling to the jacobian matrix.

Note

This scaling is not necessary for analyses using Schur’s complement, but can be very important for error variance analyses. This operation effectively transfers prior knowledge specified in the parcov to the jacobian and reset parcov to the identity matrix.

clean()

drop regularization and prior information observation from the jco

drop_prior_information()

drop the prior information from the jco and pst attributes

property fehalf

Karhunen-Loeve scaling matrix attribute.

Returns:

the Karhunen-Loeve scaling matrix based on the prior parameter covariance matrix

Return type:

pyemu.Matrix

property forecast_names

get the forecast (aka prediction) names

Returns:

list of forecast names

Return type:

([str])

property forecasts

the forecast sentivity vectors attribute

Returns:

a matrix of forecast (prediction) sensitivity vectors (column wise)

Return type:

pyemu.Matrix

property forecasts_iter

forecast (e.g. prediction) sensitivity vectors iterator

Returns:

iterator on forecasts (e.g. predictions) sensitivity vectors (matrix)

Return type:

iterator

Note

This is used for processing huge numbers of predictions

This is a synonym for LinearAnalysis.predictions_iter()

get(par_names=None, obs_names=None, astype=None)

method to get a new LinearAnalysis class using a subset of parameters and/or observations

Parameters:
  • par_names ([‘str]) – par names for new object

  • obs_names ([‘str]) – obs names for new object

  • astype (pyemu.Schur or pyemu.ErrVar) – type to cast the new object. If None, return type is same as self

Returns:

new instance

Return type:

LinearAnalysis

get_cso_dataframe()

get a dataframe of composite observation sensitivity, as returned by PEST in the seo file.

Returns:

dataframe of observation names and composite observation sensitivity

Return type:

pandas.DataFrame

Note

That this formulation deviates slightly from the PEST documentation in that the values are divided by (npar-1) rather than by (npar).

The equation is cso_j = ((Q^1/2*J*J^T*Q^1/2)^1/2)_jj/(NPAR-1)

get_obs_competition_dataframe()

get the observation competition stat a la PEST utility

Returns:

a dataframe of observation names by observation names with values equal to the PEST competition statistic

Return type:

pandas.DataFrame

get_par_css_dataframe()

get a dataframe of composite scaled sensitivities. Includes both PEST-style and Hill-style.

Returns:

a dataframe of parameter names, PEST-style and Hill-style composite scaled sensitivity

Return type:

pandas.DataFrame

property jco

the jacobian matrix attribute

Returns:

the jacobian matrix attribute

Return type:

pyemu.Jco

property mle_covariance

maximum likelihood parameter covariance matrix.

Returns:

maximum likelihood parameter covariance matrix

Return type:

pyemu.Matrix

property mle_parameter_estimate

maximum likelihood parameter estimate.

Returns:

the maximum likelihood parameter estimates

Return type:

pandas.Series

property nnz_obs_names

non-zero-weighted observation names

Returns:

list of non-zero-weighted observation names

Return type:

[‘str`]

Note

if LinearAnalysis.pst is None, returns LinearAnalysis.jco.row_names

property obscov

get the observation noise covariance matrix attribute

Returns:

a reference to the LinearAnalysis.obscov attribute

Return type:

pyemu.Cov

property parcov

get the prior parameter covariance matrix attribute

Returns:

a reference to the LinearAnalysis.parcov attribute

Return type:

pyemu.Cov

property predictions

the prediction (aka forecast) sentivity vectors attribute

Returns:

a matrix of prediction sensitivity vectors (column wise)

Return type:

pyemu.Matrix

property predictions_iter

prediction sensitivity vectors iterator

Returns:

iterator on prediction sensitivity vectors (matrix)

Return type:

iterator

Note

this is used for processing huge numbers of predictions

property prior_forecast

prior forecast (e.g. prediction) variances

Returns:

a dictionary of forecast name, prior variance pairs

Return type:

dict

property prior_parameter

prior parameter covariance matrix

Returns:

prior parameter covariance matrix

Return type:

pyemu.Cov

property prior_prediction

prior prediction (e.g. forecast) variances

Returns:

a dictionary of prediction name, prior variance pairs

Return type:

dict

property pst

the pst attribute

Returns:

the pst attribute

Return type:

pyemu.Pst

property qhalf

square root of the cofactor matrix attribute. Create the attribute if it has not yet been created

Returns:

square root of the cofactor matrix

Return type:

pyemu.Matrix

property qhalfx

half normal matrix attribute.

Returns:

half normal matrix attribute

Return type:

pyemu.Matrix

reset_obscov(arg=None)

reset the obscov attribute to None

Parameters:

arg (str or pyemu.Matrix) – the value to assign to the obscov attribute. If None, the private __obscov attribute is cleared but not reset

reset_parcov(arg=None)

reset the parcov attribute to None

Parameters:

arg (str or pyemu.Matrix) – the value to assign to the parcov attribute. If None, the private __parcov attribute is cleared but not reset

reset_pst(arg)

reset the LinearAnalysis.pst attribute

Parameters:

arg (str or pyemu.Pst) – the value to assign to the pst attribute

property xtqx

normal matrix attribute.

Returns:

normal matrix attribute

Return type:

pyemu.Matrix

class pyemu.Matrix(x=None, row_names=[], col_names=[], isdiagonal=False, autoalign=True)

Bases: object

Easy linear algebra in the PEST(++) realm

Parameters:
  • x (numpy.ndarray) – numeric values

  • row_names ([str]) – list of row names

  • col_names (['str']) – list of column names

  • isdigonal (bool) – flag if the Matrix is diagonal

  • autoalign (bool) – flag to control the autoalignment of Matrix during linear algebra operations

Example:

data = np.random.random((10,10))
row_names = ["row_{0}".format(i) for i in range(10)]
col_names = ["col_{0}".format(j) for j in range(10)]
mat = pyemu.Matrix(x=data,row_names=row_names,col_names=col_names)
mat.to_binary("mat.jco")

# load an existing jacobian matrix
jco = pyemu.Jco.from_binary("pest.jco")
# form an observation noise covariance matrix from weights
obscov = pyemu.Cov.from_observation_data(pst)
# form the normal matrix, aligning rows and cols on-the-fly
xtqx = jco * obscov.inv * jco.T

Note

this class makes heavy use of property decorators to encapsulate private attributes

property T

wrapper function for Matrix.transpose() method

Returns:

transpose of Matrix

Return type:

Matrix

Note

returns a copy of self

A syntatic-sugar overload of Matrix.transpose()

Example:

jcb = pyemu.Jco.from_binary("pest.jcb")
jcbt = jcb.T
align(names, axis=None)

reorder Matrix by names in place. If axis is None, reorder both indices

Parameters:
  • names (['str']) – names in Matrix.row_names and or Matrix.col_names

  • axis (int, optional) – the axis to reorder. if None, reorder both axes

Note

Works in place. Is called programmatically during linear algebra operations

Example:

# load a jco that has more obs (rows) than non-zero weighted obs
# in the control file
jco = pyemu.Jco.from_binary("pest.jcb")
# get an obs noise cov matrix
obscov = pyemu.Cov.from_observation_data(pst)
jco.align(obscov.row_names,axis=0)
property as_2d

get a 2D numeric representation of Matrix.x. If not isdiagonal, simply return reference to Matrix.x, otherwise, constructs and returns a 2D, diagonal ndarray

Returns:

numpy.ndarray

Return type:

numpy.ndarray

Exmaple:

# A diagonal cov
cov = pyemu.Cov.from_parameter_data
x2d = cov.as_2d # a numpy 2d array
print(cov.shape,cov.x.shape,x2d.shape)
binary_header_dt = dtype([('itemp1', '<i4'), ('itemp2', '<i4'), ('icount', '<i4')])
binary_rec_dt = dtype([('j', '<i4'), ('dtemp', '<f8')])
char

alias of uint8

coo_rec_dt = dtype([('i', '<i4'), ('j', '<i4'), ('dtemp', '<f8')])
copy()

get a copy of Matrix

Returns:

copy of this Matrix

Return type:

Matrix

df()

wrapper of Matrix.to_dataframe()

double

alias of float64

drop(names, axis)

drop elements from Matrix in place

Parameters:
  • names (['str']) – list of names to drop

  • axis (int) – the axis to drop from. must be in [0,1]

element_isaligned(other)

check if matrices are aligned for element-wise operations

Parameters:

other (Matrix) – the other matrix to check for alignment with

Returns:

True if aligned, False if not aligned

Return type:

bool

extend(other)

extend Matrix with the elements of other, returning a new matrix.

Args: other (Matrix): the Matrix to extend self by

Returns:

new, extended Matrix

Return type:

Matrix

Note

No row or column names can be shared between self and other

Example:

jco1 = pyemu.Jco.from_binary("pest_history.jco")
jco2 = pyemu.Jco.from_binary("pest_forecast.jco")

jco_ext = jco1.extend(jco2)
extract(row_names=None, col_names=None)

wrapper method that Matrix.gets() then Matrix.drops() elements. one of row_names or col_names must be not None.

Parameters:
  • row_names (['str'], optional) – row_names to extract. If None, all row_names are retained.

  • col_names (['str'], optional) – col_names to extract. If None, all col_names are retained.

Returns:

the extract sub-matrix defined by row_names and/or col_names

Return type:

Matrix

Example:

cov = pyemu.Cov.from_parameter_data(pst)
hk_cov = cov.extract(row_names=["hk1","hk2","hk3"])
static find_rowcol_indices(names, row_names, col_names, axis=None)

fast(er) look of row and colum names indices

Parameters:
  • names ([str]) – list of names to look for in row_names and/or col_names names

  • row_names ([str]) – list of row names

  • col_names ([str]) – list of column names

  • axis (int, optional) – axis to search along. If None, search both. Default is None

Returns:

array of (integer) index locations. If axis is None, a 2 numpy.ndarrays of both row and column name indices is returned

Return type:

numpy.ndarray

classmethod from_ascii(filename)

load a PEST-compatible ASCII matrix/vector file into a Matrix instance

Parameters:

filename (str) – name of the file to read

Returns:

Matrix loaded from ASCII file

Return type:

Matrix

Example:

mat = pyemu.Matrix.from_ascii("my.mat")
cov = pyemi.Cov.from_ascii("my.cov")
classmethod from_binary(filename, forgive=False)

class method load from PEST-compatible binary file into a Matrix instance

Parameters:
  • filename (str) – filename to read

  • forgive (bool) – flag to forgive incomplete data records. Only applicable to dense binary format. Default is False

Returns:

Matrix loaded from binary file

Return type:

Matrix

Example:

mat = pyemu.Matrix.from_binary("my.jco")
cov = pyemu.Cov.from_binary("large_cov.jcb")
classmethod from_dataframe(df)
class method to create a new Matrix instance from a

pandas.DataFrame

Parameters:

df (pandas.DataFrame) – dataframe

Returns:

Matrix instance derived from df.

Return type:

Matrix

Example:

df = pd.read_csv("my.csv")
mat = pyemu.Matrix.from_dataframe(df)
static from_fortranfile(filename)
a binary load method to accommodate one of the many

bizarre fortran binary writing formats

Parameters:

filename (str) – name of the binary matrix file

Returns:

tuple containing

  • numpy.ndarray: the numeric values in the file

  • [‘str’]: list of row names

  • [`str`]: list of col_names

classmethod from_names(row_names, col_names, isdiagonal=False, autoalign=True, random=False)

class method to create a new Matrix instance from row names and column names, filled with trash

Parameters:
  • row_names (['str']) – row names for the new Matrix

  • col_names (['str']) – col_names for the new matrix

  • isdiagonal (bool, optional) – flag for diagonal matrix. Default is False

  • autoalign (bool, optional) – flag for autoaligning new matrix during linear algebra calcs. Default is True

  • random (bool) – flag for contents of the trash matrix. If True, fill with random numbers, if False, fill with zeros Default is False

Returns:

the new Matrix instance

Return type:

Matrix

Example:

row_names = ["row_1","row_2"]
col_names = ["col_1,"col_2"]
m = pyemu.Matrix.from_names(row_names,col_names)
property full_s

Get the full singular value matrix

Returns:

full singular value matrix. Shape is (max(Matrix.shape),max(Matrix.shape)) with zeros along the diagonal from min(Matrix.shape) to max(Matrix.shape)

Return type:

Matrix

Example:

jco = pyemu.Jco.from_binary("pest.jcb")
jco.full_s.to_ascii("full_sing_val_matrix.mat")
get(row_names=None, col_names=None, drop=False)

get a new Matrix instance ordered on row_names or col_names

Parameters:
  • row_names (['str'], optional) – row_names for new Matrix. If None, all row_names are used.

  • col_names (['str'], optional) – col_names for new Matrix. If None, all col_names are used.

  • drop (bool) – flag to remove row_names and/or col_names from this Matrix

Returns:

a new Matrix

Return type:

Matrix

Example::

# load a jco that has more obs (rows) than non-zero weighted obs # in the control file jco = pyemu.Jco.from_binary(“pest.jcb”) # get an obs noise cov matrix obscov = pyemu.Cov.from_observation_data(pst) nnz_jco = jco.get(row_names = obscov.row_names)

get_diagonal_vector(col_name='diag')

Get a new Matrix instance that is the diagonal of self. The shape of the new matrix is (self.shape[0],1). Self must be square

Parameters:

col_name (str) – the name of the single column in the new Matrix

Returns:

vector-shaped Matrix instance of the diagonal of this Matrix

Return type:

Matrix

Example:

cov = pyemu.Cov.from_unc_file("my.unc")
cov_diag = cov.get_diagonal_vector()
print(cov_diag.col_names)
get_maxsing(eigthresh=1e-05)

Get the number of singular components with a singular value ratio greater than or equal to eigthresh

Args:
eigthresh (float): the ratio of smallest to largest

singular value to retain. Since it is assumed that s is sorted from largest to smallest, once a singular value is reached that yields a ratio with the first (largest) singular value, the index of this singular is returned.

Returns:

the index of the singular value whos ratio with the first singular value is less than or equal to eigthresh

Return type:

int

Note

this method calls the static method Matrix.get_maxsing_from_s() with Matrix.s.x

Example:

jco = pyemu.Jco.from_binary("pest.jco")
max_sing = jco.get_maxsing(eigthresh=pst.svd_data.eigthresh)
static get_maxsing_from_s(s, eigthresh=1e-05)

static method to work out the maxsing for a given singular spectrum

Parameters:
  • s (numpy.ndarray) – 1-D array of singular values. This array should come from calling either numpy.linalg.svd or from the pyemu.Matrix.s.x attribute

  • eigthresh (float) – the ratio of smallest to largest singular value to retain. Since it is assumed that s is sorted from largest to smallest, once a singular value is reached that yields a ratio with the first (largest) singular value, the index of this singular is returned.

Returns:

the index of the singular value whos ratio with the first singular value is less than or equal to eigthresh

Return type:

int

Example:

jco = pyemu.Jco.from_binary("pest.jco")
max_sing = pyemu.Matrix.get_maxsing_from_s(jco.s,eigthresh=pst.svd_data.eigthresh)
hadamard_product(other)

Overload of numpy.ndarray.__mult__(): element-wise multiplication. Tries to speedup by checking for scalars of diagonal matrices on either side of operator

Parameters:

other – (int,`float`,`numpy.ndarray`,`Matrix`): the thing to multiply

Returns:

the result of multiplication

Return type:

Matrix

Note

if Matrix and other (if applicable) have autoalign set to True, both Matrix and other are aligned based on row and column names. If names are not common between the two, this may result in a smaller returned Matrix. If not common elements are shared, an excetion is raised

Example:

cov = pyemu.Cov.from_parameter_data(pst)
cov2 = cov * 10
cov3 = cov * cov2
indices(names, axis=None)
get the row and col indices of names. If axis is None, two ndarrays

are returned, corresponding the indices of names for each axis

Parameters:
  • names ([str]) – list of names to look for in row_names and/or col_names names

  • row_names ([str]) – list of row names

  • col_names ([str]) – list of column names

  • axis (int, optional) – axis to search along. If None, search both. Default is None

Returns:

array of (integer) index locations. If axis is None, a 2 numpy.ndarrays of both row and column name indices is returned

Return type:

numpy.ndarray

Note

thin wrapper around Matrix.find_rowcol_indices static method

integer

alias of int32

property inv

inversion operation of Matrix

Returns:

inverse of Matrix

Return type:

Matrix

Note

uses numpy.linalg.inv for the inversion

Example:

mat = pyemu.Matrix.from_binary("my.jco")
mat_inv = mat.inv
mat_inv.to_binary("my_inv.jco")
mult_isaligned(other)

check if matrices are aligned for dot product multiplication

Parameters:

other (Matrix) – the other matrix to check for alignment with

Returns:

True if aligned, False if not aligned

Return type:

bool

property ncol

length of second dimension

Returns:

number of columns

Return type:

int

new_obs_length = 200
new_par_length = 200
property newx

return a copy of Matrix.x attribute

Returns:

a copy Matrix.x

Return type:

numpy.ndarray

property nrow

length of first dimension

Returns:

number of rows

Return type:

int

obs_length = 20
par_length = 12
pseudo_inv(maxsing=None, eigthresh=1e-05)

The pseudo inverse of self. Formed using truncated singular value decomposition and Matrix.pseudo_inv_components

Parameters:
  • maxsing (int, optional) – the number of singular components to use. If None, maxsing is calculated using Matrix.get_maxsing() and eigthresh

  • eigthresh – (float, optional): the ratio of largest to smallest singular components to use for truncation. Ignored if maxsing is not None. Default is 1.0e-5

Returns:

the truncated-SVD pseudo inverse of Matrix (V_1 * s_1^-1 * U^T)

Return type:

Matrix

Example:

jco = pyemu.Jco.from_binary("pest.jcb")
jco_psinv = jco.pseudo_inv(pst.svd_data.maxsing,pst.svd_data.eigthresh)
jco_psinv.to_binary("pseudo_inv.jcb")
pseudo_inv_components(maxsing=None, eigthresh=1e-05, truncate=True)

Get the (optionally) truncated SVD components

Parameters:
  • maxsing (int, optional) – the number of singular components to use. If None, maxsing is calculated using Matrix.get_maxsing() and eigthresh

  • eigthresh – (float, optional): the ratio of largest to smallest singular components to use for truncation. Ignored if maxsing is not None. Default is 1.0e-5

  • truncate (bool) – flag to truncate components. If False, U, s, and V will be zeroed out at locations greater than maxsing instead of truncated. Default is True

Returns:

tuple containing

  • Matrix: (optionally truncated) left singular vectors

  • Matrix: (optionally truncated) singular value matrix

  • Matrix: (optionally truncated) right singular vectors

Example:

mat = pyemu.Matrix.from_binary("my.jco")
u1,s1,v1 = mat.pseudo_inv_components(maxsing=10)
resolution_matrix = v1 * v1.T
resolution_matrix.to_ascii("resol.mat")
static read_ascii(filename)

read a PEST-compatible ASCII matrix/vector file

Parameters:

filename (str) – file to read from

Returns:

tuple containing

  • numpy.ndarray: numeric values

  • [‘str’]: list of row names

  • [`str`]: list of column names

  • bool: diagonal flag

static read_binary(filename, forgive=False)

static method to read PEST-format binary files

Parameters:
  • filename (str) – filename to read

  • forgive (bool) – flag to forgive incomplete data records. Only applicable to dense binary format. Default is False

Returns:

tuple containing

  • numpy.ndarray: the numeric values in the file

  • [‘str’]: list of row names

  • [`str`]: list of col_names

static read_binary_header(filename)

read the first elements of a PEST(++)-style binary file to get format and dimensioning information.

Parameters:

filename (str) – the filename of the binary file

Returns:

tuple containing

  • int: the itemp1 value

  • int: the itemp2 value

  • int: the icount value

static read_dense(filename, forgive=False, close=True)

read a dense-format binary file.

Parameters:
  • filename (str) – the filename or the open filehandle

  • forgive (bool) – flag to forgive incomplete records. If True and an incomplete record is encountered, only the previously read records are returned. If False, an exception is raised for an incomplete record

  • close (bool) – flag to close the filehandle. Default is True

Returns:

tuple containing

  • numpy.ndarray: the numeric values in the file

  • [‘str’]: list of row names

  • [`str`]: list of col_names

reset_x(x, copy=True)

reset self.__x private attribute

Parameters:
  • x (numpy.ndarray) – the new numeric data

  • copy (bool) – flag to make a copy of ‘x’. Default is True

Returns:

None

Note

operates in place

property s

the singular value (diagonal) Matrix

Returns:

singular value matrix. shape is (min(Matrix.shape),min(Matrix.shape))

Return type:

Matrix

Example:

# plot the singular spectrum of the jacobian
import matplotlib.pyplot as plt
jco = pyemu.Jco.from_binary("pest.jcb")
plt.plot(jco.s.x)
plt.show()
property shape

get the implied, 2D shape of Matrix

Returns:

length of 2 tuple

Return type:

int

Exmaple:

jco = pyemu.Jco.from_binary("pest.jcb")
shape = jco.shape
nrow,ncol = shape #unpack to ints
property sqrt

element-wise square root operation

Returns:

element-wise square root of Matrix

Return type:

Matrix

Note

uses numpy.sqrt

Example:

cov = pyemu.Cov.from_uncfile("my.unc")
sqcov = cov.sqrt
sqcov.to_uncfile("sqrt_cov.unc")
to_2d()
get a 2D Matrix representation of Matrix. If not Matrix.isdiagonal, simply

return a copy of Matrix, otherwise, constructs and returns a new Matrix instance that is stored as diagonal

Returns:

non-diagonal form of Matrix

Return type:

Martrix

Exmaple:

# A diagonal cov
cov = pyemu.Cov.from_parameter_data
cov2d = cov.as_2d # a numpy 2d array
print(cov.shape,cov.x.shape,cov2d.shape,cov2d.x.shape)
to_ascii(filename, icode=2)

write a PEST-compatible ASCII Matrix/vector file

Parameters:
  • filename (str) – filename to write to

  • icode (int, optional) – PEST-style info code for matrix style. Default is 2.

Note

if icode == -1, a 1-d vector is written that represents a diagonal matrix. An exception is raised if icode == -1 and isdiagonal is False

to_binary(filename, droptol=None, chunk=None)

write a PEST-compatible binary file. The format is the same as the format used to storage a PEST Jacobian matrix

Parameters:
  • filename (str) – filename to save binary file

  • droptol (float) – absolute value tolerance to make values smaller droptol than zero. Default is None (no dropping)

  • chunk (int) – number of elements to write in a single pass. Default is None, which writes the entire numeric part of the Matrix at once. This is faster but requires more memory.

to_coo(filename, droptol=None, chunk=None)

write an extended PEST-format binary file. The data format is [int,int,float] for i,j,value. It is autodetected during the read with Matrix.from_binary().

Parameters:
  • filename (str) – filename to save binary file

  • droptol (float) – absolute value tolerance to make values smaller droptol than zero. Default is None (no dropping)

  • chunk (int) – number of elements to write in a single pass. Default is None, which writes the entire numeric part of the Matrix at once. This is faster but requires more memory.

Note

This method is needed when the number of dimensions times 2 is larger than the max value for a 32-bit integer. happens! This method is used by pyemu.Ensemble.to_binary()

to_dataframe()

return a pandas.DataFrame representation of Matrix

Returns:

a dataframe derived from Matrix

Return type:

pandas.DataFrame

Note

if self.isdiagonal is True, the full matrix is used to fill the dataframe - lots of zeros.

to_dense(filename, close=True)

experimental new dense matrix storage format to support faster I/O with ensembles

Parameters:
  • filename (str) – the filename to save to

  • close (bool) – flag to close the filehandle after saving

Returns:

the file handle. Only returned if close is False

Return type:

f (file)

Note

calls Matrix.write_dense()

property transpose

transpose operation of self

Returns:

transpose of Matrix

Return type:

Matrix

Example:

jcb = pyemu.Jco.from_binary("pest.jcb")
jcbt = jcb.T
property u

the left singular vector Matrix

Returns:

left singular vectors. Shape is (Matrix.shape[0], Matrix.shape[0])

Return type:

Matrix

Example:

jco = pyemu.Jco.from_binary("pest.jcb")
jco.u.to_binary("u.jcb")
property v

the right singular vector Matrix

Returns:

right singular vectors. Shape is (Matrix.shape[1], Matrix.shape[1])

Return type:

Matrix

Example:

jco = pyemu.Jco.from_binary("pest.jcb")
jco.v.to_binary("v.jcb")
static write_dense(filename, row_names, col_names, data, close=False)

experimental new dense matrix storage format to support faster I/O with ensembles

Parameters:
  • filename (str or file) – the file to write to

  • row_names ([str]) – row names of the matrix

  • col_names ([str]) – col names of the matrix

  • data (np.ndarray) – matrix elements

  • close (bool) – flag to close the file after writing

Returns:

the file handle. Only returned if close is False

Return type:

f (file)

property x

return a reference to Matrix.x

Returns:

reference to Matrix.x

Return type:

numpy.ndarray

property zero2d

get an 2D instance of self with all zeros

Returns:

Matrix of zeros

Return type:

Matrix

Example:

jco = pyemu.Jco.from_binary("pest.jcb")
zero_jco = jco.zero2d
class pyemu.ObservationEnsemble(pst, df, istransformed=False)

Bases: Ensemble

Observation noise ensemble in the PEST(++) realm

Parameters:
  • pst (pyemu.Pst) – a control file instance

  • df (pandas.DataFrame) – a pandas dataframe. Columns should be observation names. Index is treated as realization names

  • istransformed (bool) – flag to indicate parameter values are in log space. Not used for ObservationEnsemble

Example:

pst = pyemu.Pst("my.pst")
oe = pyemu.ObservationEnsemble.from_gaussian_draw(pst)
add_base()

add the control file obsval values as a realization

Note

replaces the last realization with the current ObservationEnsemble.pst.observation_data.obsval values as a new realization named “base”

the PEST++ enemble tools will add this realization also if you dont wanna fool with it here…

classmethod from_gaussian_draw(pst, cov=None, num_reals=100, by_groups=True, fill=False, factor='eigen')

generate an ObservationEnsemble from a (multivariate) gaussian distribution

Parameters:
  • pst (pyemu.Pst) – a control file instance.

  • cov (pyemu.Cov) – a covariance matrix describing the second moment of the gaussian distribution. If None, cov is generated from the non-zero-weighted observation weights in pst. Only observations listed in cov are sampled. Other observations are assigned the obsval value from pst.

  • num_reals (int) – number of stochastic realizations to generate. Default is 100

  • by_groups (bool) – flag to generate realzations be observation group. This assumes no correlation (covariates) between observation groups.

  • fill (bool) – flag to fill in zero-weighted observations with control file values. Default is False.

  • factor (str) – how to factorize cov to form the projectin matrix. Can be “eigen” or “svd”. The “eigen” option is default and is faster. But for (nearly) singular cov matrices (such as those generated empirically from ensembles), “svd” is the only way. Ignored for diagonal cov.

Returns:

the realized ObservationEnsemble instance

Return type:

ObservationEnsemble

Note

Only observations named in cov are sampled. Additional, cov is processed prior to sampling to only include non-zero-weighted observations depending on the value of fill. So users must take care to make sure observations have been assigned non-zero weights even if cov is being passed

The default cov is generated from pyemu.Cov.from_observation_data, which assumes observation noise standard deviations are the inverse of the weights listed in pst

Example:

pst = pyemu.Pst("my.pst")
# the easiest way - just relying on weights in pst
oe1 = pyemu.ObservationEnsemble.from_gaussian_draw(pst)

# generate the cov explicitly
cov = pyemu.Cov.from_observation_data(pst)
oe2 = pyemu.ObservationEnsemble.from_gaussian_draw(pst,cov=cov)

# give all but one observation zero weight.  This will
# result in an oe with only one randomly sampled observation noise
# vector since the cov is processed to remove any zero-weighted
# observations before sampling
pst.observation_data.loc[pst.nnz_obs_names[1:],"weight] = 0.0
oe3 = pyemu.ObservationEnsemble.from_gaussian_draw(pst,cov=cov)
get_phi_vector(noise_obs_filename=None, noise_obs_flag=False)
property nonzero

get a new ObservationEnsemble of just non-zero weighted observations

Returns:

non-zero weighted observation ensemble.

Return type:

ObservationEnsemble

Note

The pst attribute of the returned ObservationEnsemble also only includes non-zero weighted observations (and is therefore not valid for running with PEST or PEST++)

property phi_vector

vector of L2 norm (phi) for the realizations (rows) of Ensemble.

Returns:

series of realization name (Ensemble.index) and phi values

Return type:

pandas.Series

Note

The ObservationEnsemble.pst.weights can be updated prior to calling this method to evaluate new weighting strategies

class pyemu.ParameterEnsemble(pst, df, istransformed=False)

Bases: Ensemble

Parameter ensembles in the PEST(++) realm

Parameters:
  • pst (pyemu.Pst) – a control file instance

  • df (pandas.DataFrame) – a pandas dataframe. Columns should be parameter names. Index is treated as realization names

  • istransformed (bool) – flag to indicate parameter values are in log space (if partrans is “log” in pst)

Example:

pst = pyemu.Pst("my.pst")
pe = pyemu.ParameterEnsemble.from_gaussian_draw(pst)
add_base()

add the control file obsval values as a realization

Note

replaces the last realization with the current ParameterEnsemble.pst.parameter_data.parval1 values as a new realization named “base”

The PEST++ ensemble tools will add this realization also if you dont wanna fool with it here…

property adj_names

the names of adjustable parameters in ParameterEnsemble

Returns:

adjustable parameter names

Return type:

[str]

back_transform()

back transform parameters with respect to partrans value.

Note

operates in place (None is returned).

Parameter transform is only related to log_{10} and does not include the effects of scale and/or offset

enforce(how='reset', bound_tol=0.0)

entry point for bounds enforcement.

Parameters:

enforce_bounds (str) – can be ‘reset’ to reset offending values or ‘drop’ to drop offending realizations. Default is “reset”

Note

In very high dimensions, the “drop” and “scale” how types will result in either very few realizations or very short realizations.

Example:

pst = pyemu.Pst("my.pst")
pe = pyemu.ParameterEnsemble.from_gaussian_draw()
pe.enforce(how="scale")
pe.to_csv("par.csv")
property fixed_indexer

boolean indexer for non-adjustable parameters

Returns:

boolean array indicating which parameters have partrans equal to “log” or “fixed”

Return type:

numpy.ndarray(bool)

classmethod from_gaussian_draw(pst, cov=None, num_reals=100, by_groups=True, fill=True, factor='eigen')

generate a ParameterEnsemble from a (multivariate) (log) gaussian distribution

Parameters:
  • pst (pyemu.Pst) – a control file instance.

  • cov (pyemu.Cov) – a covariance matrix describing the second moment of the gaussian distribution. If None, cov is generated from the bounds of the adjustable parameters in pst. the (log) width of the bounds is assumed to represent a multiple of the parameter standard deviation (this is the sigma_range argument that can be passed to pyemu.Cov.from_parameter_data).

  • num_reals (int) – number of stochastic realizations to generate. Default is 100

  • by_groups (bool) – flag to generate realizations be parameter group. This assumes no correlation (covariates) between parameter groups. For large numbers of parameters, this help prevent memories but is slower.

  • fill (bool) – flag to fill in fixed and/or tied parameters with control file values. Default is True.

  • factor (str) – how to factorize cov to form the projection matrix. Can be “eigen” or “svd”. The “eigen” option is default and is faster. But for (nearly) singular cov matrices (such as those generated empirically from ensembles), “svd” is the only way. Ignored for diagonal cov.

Returns:

the parameter ensemble realized from the gaussian distribution

Return type:

ParameterEnsemble

Note

Only parameters named in cov are sampled. Missing parameters are assigned values of pst.parameter_data.parval1 along the corresponding columns of ParameterEnsemble according to the value of fill.

The default cov is generated from pyemu.Cov.from_observation_data, which assumes parameter bounds in ParameterEnsemble.pst represent some multiple of parameter standard deviations. Additionally, the default Cov only includes adjustable parameters (partrans not “tied” or “fixed”).

“tied” parameters are not sampled.

Example:

pst = pyemu.Pst("my.pst")
# the easiest way - just relying on weights in pst
pe1 = pyemu.ParameterEnsemble.from_gaussian_draw(pst)

# generate the cov explicitly with a sigma_range
cov = pyemu.Cov.from_parameter_data(pst,sigma_range=6)
[e2 = pyemu.ParameterEnsemble.from_gaussian_draw(pst,cov=cov)
classmethod from_mixed_draws(pst, how_dict, default='gaussian', num_reals=100, cov=None, sigma_range=6, enforce_bounds=True, partial=False, fill=True)

generate a ParameterEnsemble using a mixture of distributions. Available distributions include (log) “uniform”, (log) “triangular”, and (log) “gaussian”. log transformation is respected.

Parameters:
  • pst (pyemu.Pst) – a control file

  • how_dict (dict) – a dictionary of parameter name keys and “how” values, where “how” can be “uniform”,”triangular”, or “gaussian”.

  • default (str) – the default distribution to use for parameter not listed in how_dict. Default is “gaussian”.

  • num_reals (int) – number of realizations to draw. Default is 100.

  • cov (pyemu.Cov) – an optional Cov instance to use for drawing from gaussian distribution. If None, and “gaussian” is listed in how_dict (and/or default), then a diagonal covariance matrix is constructed from the parameter bounds in pst (with sigma_range). Default is None.

  • sigma_range (float) – the number of standard deviations implied by the parameter bounds in the pst. Only used if “gaussian” is in how_dict (and/or default) and cov is None. Default is 6.

  • enforce_bounds (bool) – flag to enforce parameter bounds in resulting ParameterEnsemble. Only matters if “gaussian” is in values of how_dict. Default is True.

  • partial (bool) – flag to allow a partial ensemble (not all pars included). If True, parameters not name in how_dict will be sampled using the distribution named as default. Default is False.

  • fill (bool) – flag to fill in fixed and/or tied parameters with control file values. Default is True.

Example:

pst = pyemu.Pst("pest.pst")
# uniform for the fist 10 pars
how_dict = {p:"uniform" for p in pst.adj_par_names[:10]}
pe = pyemu.ParameterEnsemble(pst,how_dict=how_dict)
pe.to_csv("my_mixed_pe.csv")
classmethod from_parfiles(pst, parfile_names, real_names=None)

create a parameter ensemble from PEST-style parameter value files. Accepts parfiles with less than the parameters in the control (get NaNs in the ensemble) or extra parameters in the parfiles (get dropped)

Parameters:
  • pst (pyemu.Pst) – control file instance

  • parfile_names ([str]) – par file names

  • real_names (str) – optional list of realization names. If None, a single integer counter is used

Returns:

parameter ensemble loaded from par files

Return type:

ParameterEnsemble

classmethod from_triangular_draw(pst, num_reals=100, fill=True)

generate a ParameterEnsemble from a (multivariate) (log) triangular distribution

Parameters:
  • pst (pyemu.Pst) – a control file instance

  • num_reals (int, optional) – number of realizations to generate. Default is 100

  • fill (bool) – flag to fill in fixed and/or tied parameters with control file values. Default is True.

Returns:

a parameter ensemble drawn from the multivariate (log) triangular distribution defined by the parameter upper and lower bounds and initial parameter values in pst

Return type:

ParameterEnsemble

Note

respects transformation status in pst: fixed and tied parameters are not realized, log-transformed parameters are drawn in log space. The returned ParameterEnsemble is back transformed (not in log space)

uses numpy.random.triangular

Example:

pst = pyemu.Pst("my.pst")
pe = pyemu.ParameterEnsemble.from_triangular_draw(pst)
pe.to_csv("my_tri_pe.csv")
classmethod from_uniform_draw(pst, num_reals, fill=True)

generate a ParameterEnsemble from a (multivariate) (log) uniform distribution

Parameters:
  • pst (pyemu.Pst) – a control file instance

  • num_reals (int, optional) – number of realizations to generate. Default is 100

  • fill (bool) – flag to fill in fixed and/or tied parameters with control file values. Default is True.

Returns:

a parameter ensemble drawn from the multivariate (log) uniform distribution defined by the parameter upper and lower bounds pst

Return type:

ParameterEnsemble

Note

respects transformation status in pst: fixed and tied parameters are not realized, log-transformed parameters are drawn in log space. The returned ParameterEnsemble is back transformed (not in log space)

uses numpy.random.uniform

Example:

pst = pyemu.Pst("my.pst")
pe = pyemu.ParameterEnsemble.from_uniform_draw(pst)
pe.to_csv("my_uni_pe.csv")
property lbnd

the lower bound vector while respecting current log transform status

Returns:

(log-transformed) lower parameter bounds listed in ParameterEnsemble.pst.parameter_data.parlbnd

Return type:

pandas.Series

property log_indexer

boolean indexer for log transform

Returns:

boolean array indicating which parameters are log transformed

Return type:

numpy.ndarray(bool)

project(projection_matrix, center_on=None, log=None, enforce_bounds='reset')

project the ensemble using the null-space Monte Carlo method

Parameters:
  • projection_matrix (pyemu.Matrix) – null-space projection operator.

  • center_on (str) – the name of the realization to use as the centering point for the null-space differening operation. If center_on is None, the ParameterEnsemble mean vector is used. Default is None

  • log (pyemu.Logger, optional) – for logging progress

  • enforce_bounds (str) – parameter bound enforcement option to pass to ParameterEnsemble.enforce(). Valid options are reset, drop, scale or None. Default is reset.

Returns:

untransformed, null-space projected ensemble.

Return type:

ParameterEnsemble

Example:

ev = pyemu.ErrVar(jco="my.jco") #assumes my.pst exists
pe = pyemu.ParameterEnsemble.from_gaussian_draw(ev.pst)
pe_proj = pe.project(ev.get_null_proj(maxsing=25))
pe_proj.to_csv("proj_par.csv")
transform()

transform parameters with respect to partrans value.

Note

operates in place (None is returned).

Parameter transform is only related to log_{10} and does not include the effects of scale and/or offset

property ubnd

the upper bound vector while respecting current log transform status

Returns:

(log-transformed) upper parameter bounds listed in ParameterEnsemble.pst.parameter_data.parubnd

Return type:

pandas.Series

class pyemu.Pst(filename, load=True, resfile=None)

Bases: object

All things PEST(++) control file

Parameters:
  • filename (str) – the name of the control file

  • load (bool, optional) – flag to load the control file. Default is True

  • resfile (str, optional) – corresponding residual file. If None, a residual file with the control file base name is sought. Default is None

Note

This class is the primary mechanism for dealing with PEST control files. Support is provided for constructing new control files as well as manipulating existing control files.

Example:

pst = pyemu.Pst("my.pst")
pst.control_data.noptmax = -1
pst.write("my_new.pst")
add_observations(ins_file, out_file=None, pst_path=None, inschek=True)

add new observations to a control file

Parameters:
  • ins_file (str) – instruction file with exclusively new observation names

  • out_file (str) – model output file. If None, then ins_file.replace(“.ins”,””) is used. Default is None

  • pst_path (str) – the path to append to the instruction file and out file in the control file. If not None, then any existing path in front of the template or in file is split off and pst_path is prepended. If python is being run in a directory other than where the control file will reside, it is useful to pass pst_path as .. Default is None

  • inschek (bool) – flag to try to process the existing output file using the pyemu.InstructionFile class. If successful, processed outputs are used as obsvals

Returns:

the data for the new observations that were added

Return type:

pandas.DataFrame

Note

populates the new observation information with default values

Example:

pst = pyemu.Pst(os.path.join("template", "my.pst"))
pst.add_observations(os.path.join("template","new_obs.dat.ins"), pst_path=".")
pst.write(os.path.join("template", "my_new.pst")
add_parameters(template_file, in_file=None, pst_path=None)

add new parameters to an existing control file

Parameters:
  • template_file (str) – template file with (possibly) some new parameters

  • in_file (str) – model input file. If None, template_file.replace(‘.tpl’,’’) is used. Default is None.

  • pst_path (str) – the path to append to the template_file and in_file in the control file. If not None, then any existing path in front of the template or in file is split off and pst_path is prepended. If python is being run in a directory other than where the control file will reside, it is useful to pass pst_path as .. Default is None

Returns:

the data for the new parameters that were added. If no new parameters are in the new template file, returns None

Return type:

pandas.DataFrame

Note

populates the new parameter information with default values

Example:

pst = pyemu.Pst(os.path.join("template","my.pst"))
pst.add_parameters(os.path.join("template","new_pars.dat.tpl",pst_path=".")
pst.write(os.path.join("template","my_new.pst")
add_pi_equation(par_names, pilbl=None, rhs=0.0, weight=1.0, obs_group='pi_obgnme', coef_dict={})

a helper to construct a new prior information equation.

Parameters:
  • par_names ([str]) – parameter names in the equation

  • pilbl (str) – name to assign the prior information equation. If None, a generic equation name is formed. Default is None

  • rhs (float) – the right-hand side of the pi equation

  • weight (float) – the weight of the equation

  • obs_group (str) – the observation group for the equation. Default is ‘pi_obgnme’

  • coef_dict (dict) – a dictionary of parameter name, coefficient pairs to assign leading coefficients for one or more parameters in the equation. If a parameter is not listed, 1.0 is used for its coefficients. Default is {}

Example:

pst = pyemu.Pst("pest.pst")
# add a pi equation for the first adjustable parameter
pst.add_pi_equation(pst.adj_par_names[0],pilbl="pi1",rhs=1.0)
# add a pi equation for 1.5 times the 2nd and 3 times the 3rd adj pars to sum together to 2.0
names = pst.adj_par_names[[1,2]]
pst.add_pi_equation(names,coef_dict={names[0]:1.5,names[1]:3})
add_transform_columns()

add transformed values to the Pst.parameter_data attribute

Note

adds parval1_trans, parlbnd_trans and parubnd_trans to Pst.parameter_data

Example:

pst = pyemu.Pst("pest.pst")
pst.add_transform_columns()
print(pst.parameter_data.parval1_trans
property adj_par_groups

get the parameter groups with atleast one adjustable parameter

Returns:

a list of parameter groups with at least one adjustable parameter

Return type:

[str]

property adj_par_names

get the adjustable (not fixed or tied) parameter names

Returns:

list of adjustable (not fixed or tied) parameter names

Return type:

[str]

adjust_weights(obs_dict=None, obsgrp_dict=None)

reset the weights of observations or observation groups to contribute a specified amount to the composite objective function

Parameters:
  • obs_dict (dict, optional) – dictionary of observation name,new contribution pairs

  • obsgrp_dict (dict, optional) – dictionary of obs group name,contribution pairs

Notes

If a group is assigned a contribution of 0, all observations in that group will be assigned zero weight.

If a group is assigned a nonzero contribution AND all observations in that group start with zero weight, the observations will be assigned weight of 1.0 to allow balancing.

If groups obsgrp_dict is not passed, all nonzero

Example:

pst = pyemu.Pst("my.pst")

# adjust a single observation
pst.adjust_weights(obs_dict={"obs1":10})

# adjust a single observation group
pst.adjust_weights(obsgrp_dict={"group1":100.0})

# make all non-zero weighted groups have a contribution of 100.0
balanced_groups = {grp:100 for grp in pst.nnz_obs_groups}
pst.adjust_weights(obsgrp_dict=balanced_groups)
adjust_weights_discrepancy(resfile=None, original_ceiling=True, bygroups=False)

adjusts the weights of each non-zero weight observation based on the residual in the pest residual file so each observations contribution to phi is 1.0 (e.g. Mozorov’s discrepancy principal)

Parameters:
  • resfile (str) – residual file name. If None, try to use a residual file with the Pst case name. Default is None

  • original_ceiling (bool) – flag to keep weights from increasing - this is generally a good idea. Default is True

  • bygroups (bool) – flag to adjust weights by groups. If False, the weight of each non-zero weighted observation is adjusted individually. If True, intergroup weighting is preserved (the contribution to each group is used) but this may result in some strangeness if some observations in a group have a really low phi already.

Example:

pst = pyemu.Pst("my.pst")
print(pst.phi) #assumes "my.res" is colocated with "my.pst"
pst.adjust_weights_discrepancy()
print(pst.phi) # phi should equal number of non-zero observations
bounds_report(iterations=None)

report how many parameters are at bounds. If ensemble, the base enbsemble member is evaluated

Parameters:

iterations ([int]) – a list of iterations for which a bounds report is requested If None, all iterations for which par files are located are reported. Default is None

Returns:

a pandas DataFrame object with rows being parameter groups and columns

<iter>_num_at_ub, <iter>_num_at_lb, and <iter>_total_at_bounds row 0 is total at bounds, subsequent rows correspond with groups

Return type:

df

Example

pst = pyemu.Pst(“my.pst”) df = pst.bound_report(iterations=[0,2,3])

build_increments()

experimental method to calculate parameter increments for use in the finite difference pertubation calculations

Note

user beware!

calculate_pertubations()

experimental method to calculate finite difference parameter pertubations.

Note

The pertubation values are added to the Pst.parameter_data attribute - user beware!

control_data

‘* control data’ information. Access with standard PEST variable names

Example:

pst.control_data.noptmax = 2
pst.control_data.pestmode = "estimation"
Type:

pyemu.pst.pst_controldata.ControlData

drop_observations(ins_file, pst_path=None)

remove observations in an instruction file from the control file

Parameters:
  • ins_file (str) – instruction file to remove

  • pst_path (str) – the path to append to the instruction file in the control file. If not None, then any existing path in front of the instruction is split off and pst_path is prepended. If python is being run in a directory other than where the control file will reside, it is useful to pass pst_path as .. Default is None

Returns:

the observation data for the observations that were removed.

Return type:

pandas.DataFrame

Example:

pst = pyemu.Pst(os.path.join("template", "my.pst"))
pst.remove_observations(os.path.join("template","some_obs.dat.ins"), pst_path=".")
pst.write(os.path.join("template", "my_new_with_less_obs.pst")
drop_parameters(tpl_file, pst_path=None)

remove parameters in a template file from the control file

Parameters:
  • tpl_file (str) – template file to remove

  • pst_path (str) – the path to append to the template file in the control file. If not None, then any existing path in front of the template or in file is split off and pst_path is prepended. If python is being run in a directory other than where the control file will reside, it is useful to pass pst_path as .. Default is None

Returns:

the parameter data for the parameters that were removed.

Return type:

pandas.DataFrame

Note

This method does not check for multiple occurences of the same parameter name(s) in across template files so if you have the same parameter in multiple template files, this is not the method you are looking for

Example:

pst = pyemu.Pst(os.path.join("template", "my.pst"))
pst.remove_parameters(os.path.join("template","boring_zone_pars.dat.tpl"), pst_path=".")
pst.write(os.path.join("template", "my_new_with_less_pars.pst")
enforce_bounds()

enforce bounds violation

Note

cheap enforcement of simply bringing violators back in bounds

Example:

pst = pyemu.Pst("pest.pst")
pst.parrep("random.par")
pst.enforce_bounds()
pst.write("pest_rando.pst")
property estimation

check if the control_data.pestmode is set to estimation

Returns:

True if control_data.pestmode is estmation, False otherwise

Return type:

bool

property forecast_names

get the forecast names from the pestpp options (if any). Returns None if no forecasts are named

Returns:

a list of forecast names.

Return type:

[str]

classmethod from_io_files(tpl_files, in_files, ins_files, out_files, pst_filename=None, pst_path=None)

create a Pst instance from model interface files.

Parameters:
  • tpl_files ([str]) – list of template file names

  • in_files ([str]) – list of model input file names (pairs with template files)

  • ins_files ([str]) – list of instruction file names

  • out_files ([str]) – list of model output file names (pairs with instruction files)

  • pst_filename (str) – name of control file to write. If None, no file is written. Default is None

  • pst_path ('str') – the path from the control file to the IO files. For example, if the control will be in the same directory as the IO files, then pst_path should be ‘.’. Default is None, which doesnt do any path manipulation on the I/O file names

Returns:

new control file instance with parameter and observation names found in tpl_files and ins_files, repsectively.

Return type:

Pst

Note

calls pyemu.helpers.pst_from_io_files()

Assigns generic values for parameter info. Tries to use INSCHEK to set somewhat meaningful observation values

all file paths are relatively to where python is running.

Example:

tpl_files = ["my.tpl"]
in_files = ["my.in"]
ins_files = ["my.ins"]
out_files = ["my.out"]
pst = pyemu.Pst.from_io_files(tpl_files,in_files,ins_files,out_files)
pst.control_data.noptmax = 0
pst.write("my.pst)
classmethod from_par_obs_names(par_names=['par1'], obs_names=['obs1'])

construct a shell Pst instance from parameter and observation names

Parameters:
  • par_names ([str]) – list of parameter names. Default is [par1]

  • obs_names ([str]) – list of observation names. Default is [obs1]

Note

While this method works, it does not make template or instruction files. Users are encouraged to use Pst.from_io_files() for more usefulness

Example:

par_names = ["par1","par2"]
obs_names = ["obs1","obs2"]
pst = pyemu.Pst.from_par_obs_names(par_names,obs_names)
get(par_names=None, obs_names=None)

get a new pst object with subset of parameters and/or observations

Parameters:
  • par_names ([str]) – a list of parameter names to have in the new Pst instance. If None, all parameters are in the new Pst instance. Default is None

  • obs_names ([str]) – a list of observation names to have in the new Pst instance. If None, all observations are in teh new Pst instance. Default is None

Returns:

a new Pst instance

Return type:

Pst

Note

passing par_names as None and obs_names as None effectively generates a copy of the current Pst

Does not modify model i/o files - this is just a method for performing pyemu operations

Example:

pst = pyemu.Pst("pest.pst")
new_pst = pst.get(pst.adj_par_names[0],pst.obs_names[:10])
get_adj_pars_at_bounds(frac_tol=0.01)

get list of adjustable parameter at/near bounds

Parameters:

frac_tol ('float`) – fractional tolerance of distance to bound. For upper bound, the value parubnd * (1-frac_tol) is used, lower bound uses parlbnd * (1.0 + frac_tol)

Returns:

  • [`str`]: list of parameters at/near lower bound

  • [`str`]: list of parameters at/near upper bound

Return type:

tuple containing

Example:

pst = pyemu.Pst("pest.pst")
at_lb,at_ub = pst.get_adj_pars_at_bounds()
print("pars at lower bound",at_lb)
get_par_change_limits()

calculate the various parameter change limits used in pest.

Returns:

a copy of Pst.parameter_data with columns for relative and factor change limits

Return type:

pandas.DataFrame

Note

does not yet support absolute parameter change limits!

Works in control file values space (not log transformed space). Also adds columns for effective upper and lower which account for par bounds and the value of parchglim

example:

pst = pyemu.Pst("pest.pst")
df = pst.get_par_change_limits()
print(df.chg_lower)
static get_phi_components(ogroups, res, obs, pi_ogroups=None, pi_df=None)
get_res_stats(nonzero=True)

get some common residual stats by observation group.

Parameters:

nonzero (bool) – calculate stats using only nonzero-weighted observations. This may seem obsvious to most users, but you never know….

Returns:

a dataframe with columns for groups names and indices of statistic name.

Return type:

pd.DataFrame

Note

Stats are derived from the current obsvals, weights and grouping in Pst.observation_data and the modelled values in Pst.res. The key here is ‘current’ because if obsval, weights and/or groupings have changed in Pst.observation_data since the residuals file was generated then the current values for obsval, weight and group are used

the normalized RMSE is normalized against the obsval range (max - min)

Example:

pst = pyemu.Pst("pest.pst")
stats_df = pst.get_res_stats()
print(stats_df.loc["mae",:])
property greater_than_obs_constraints

get the names of the observations that are listed as active (non-zero weight) greater than inequality constraints.

Returns:

names obseravtions that are non-zero weighted greater than constraints (obgnme startsiwth “g_” or “greater”)

Return type:

pandas.Series

Note

Zero-weighted obs are skipped

property greater_than_pi_constraints

get the names of the prior information eqs that are listed as active (non-zero weight) greater than inequality constraints.

Returns:

pandas.Series names of prior information that are non-zero weighted greater than constraints (obgnme startsiwth “g_” or “greater”)

Note

Zero-weighted pi are skipped

property input_files

list of model input file names

Returns:

a list of model input file names, extracted from

Pst.model_input_data.model_file. Returns None if this attribute is None

Return type:

[str]

Note

Use Pst.model_input_data to access the template-input file information for writing/modification

property instruction_files

list of instruction file names :returns:

a list of instruction file names, extracted from
Pst.model_output_data.pest_file. Returns None if this

attribute is None

Return type:

[str]

Note

Use Pst.model_output_data to access the instruction-output file information for writing/modification

property less_than_obs_constraints

get the names of the observations that are listed as active (non-zero weight) less than inequality constraints.

Returns:

names of observations that are non-zero weighted less than constraints (obgnme starts with ‘l_’ or “less”)

Return type:

pandas.Series

Note

Zero-weighted obs are skipped

property less_than_pi_constraints

get the names of the prior information eqs that are listed as active (non-zero weight) less than inequality constraints.

Returns:

names of prior information that are non-zero weighted less than constraints (obgnme starts with “l_” or “less”)

Return type:

pandas.Series

Note

Zero-weighted pi are skipped

load(filename)

entry point load the pest control file.

Parameters:

filename (str) – pst filename

Note

This method is called from the Pst construtor unless the load arg is False.

property nnz_obs

get the number of non-zero weighted observations

Returns:

the number of non-zeros weighted observations

Return type:

int

property nnz_obs_groups
get the observation groups that contain at least one non-zero weighted

observation

Returns:

a list of observation groups that contain at least one non-zero weighted observation

Return type:

[str]

property nnz_obs_names

get the non-zero weight observation names

Returns:

a list of non-zero weighted observation names

Return type:

[str]

property nobs

get the number of observations

Returns:

the number of observations

Return type:

int

property npar

get number of parameters

Returns:

the number of parameters

Return type:

int

property npar_adj

get the number of adjustable parameters (not fixed or tied)

Returns:

the number of adjustable parameters

Return type:

int

property nprior

number of prior information equations

Returns:

the number of prior info equations

Return type:

int

property obs_groups

get the observation groups

Returns:

a list of unique observation groups

Return type:

[str]

property obs_names

get the observation names

Returns:

a list of observation names

Return type:

[str]

observation_data

‘* observation data’ information. Columns are standard PEST variable names

Example:

pst.observation_data.loc[:,"weight"] = 1.0
pst.observation_data.loc[:,"obgnme"] = "obs_group"
Type:

pandas.DataFrame

property output_files

list of model output file names :returns:

a list of model output file names, extracted from

Pst.model_output_data.model_file. Returns None if this attribute is None

Return type:

[str]

Note

Use Pst.model_output_data to access the instruction-output file information for writing/modification

property par_groups

get the parameter groups

Returns:

a list of parameter groups

Return type:

[str]

property par_names

get the parameter names

Returns:

a list of parameter names

Return type:

[str]

parameter_data

‘* parameter data’ information. Columns are standard PEST variable names

Example:

pst.parameter_data.loc[:,"partrans"] = "log"
pst.parameter_data.loc[:,"parubnd"] = 10.0
Type:

pandas.DataFrame

parrep(parfile=None, enforce_bounds=True, real_name=None, noptmax=0, binary_ens_file=False)
replicates the pest parrep util. replaces the parval1 field in the

parameter data section dataframe with values in a PEST parameter file or a single realization from an ensemble parameter csv file

Parameters:
  • parfile (str, optional) – parameter file to use. If None, try to find and use a parameter file that corresponds to the case name. If parfile has extension ‘.par’ a single realization parameter file is used If parfile has extention ‘.csv’ an ensemble parameter file is used which invokes real_name If parfile has extention ‘.jcb’ a binary ensemble parameter file is used which invokes real_name Default is None

  • enforce_bounds (bool, optional) – flag to enforce parameter bounds after parameter values are updated. This is useful because PEST and PEST++ round the parameter values in the par file, which may cause slight bound violations. Default is True

  • real_name (str or int, optional) – name of the ensemble realization to use for updating the parval1 value in the parameter data section dataframe. If None, try using “base”. If “base” not present, use the real_name with smallest index number. Ignored if parfile is of the PEST parameter file format (e.g. not en ensemble)

  • noptmax (int, optional) – Value with which to update the pst.control_data.noptmax value Default is 0.

  • binary_ens_file (bool) – If True, use binary format to load ensemble file, else assume it’s a CSV file

Example:

pst = pyemu.Pst("pest.pst")
pst.parrep("pest.1.base.par")
pst.control_data.noptmax = 0
pst.write("pest_1.pst")
property phi

get the weighted total objective function.

Returns:

sum of squared residuals

Return type:

float

Note

Requires Pst.res (the residuals file) to be available

property phi_components

get the individual components of the total objective function

Returns:

dictionary of observation group, contribution to total phi

Return type:

dict

Note

Requires Pst.res (the residuals file) to be available

property phi_components_normalized
get the individual components of the total objective function

normalized to the total PHI being 1.0

Returns:

dictionary of observation group, normalized contribution to total phi

Return type:

dict

Note

Requires Pst.res (the residuals file) to be available

plot(kind=None, **kwargs)

method to plot various parts of the control. This is sweet as!

Parameters:
  • kind (str) – options are ‘prior’ (prior parameter histograms, ‘1to1’ (line of equality and sim vs res), ‘obs_v_sim’ (time series using datetime suffix), ‘phi_pie’ (pie chart of phi components)

  • kwargs (dict) – optional args for plots that are passed to pyemu plot helpers and ultimately to matplotlib

Note

Depending on ‘kind’ argument, a multipage pdf is written

Example:

pst = pyemu.Pst("my.pst")
pst.plot(kind="1to1") # requires Pst.res
pst.plot(kind="prior")
pst.plot(kind="phi_pie")
property prior_groups

get the prior info groups

Returns:

a list of prior information groups

Return type:

[str]

prior_information

‘* prior information’ data. Columns are standard PEST variable names

Type:

pandas.DataFrame

property prior_names

get the prior information names

Returns:

a list of prior information names

Return type:

[str]

process_output_files(pst_path='.')

processing the model output files using the instruction files and existing model output files.

Parameters:

pst_path (str) – relative path from where python is running to where the control file, instruction files and model output files are located. Default is “.” (current python directory)

Returns:

model output values

Return type:

pandas.Series

Note

requires a complete set of model input files at relative path from where python is running to pst_path

Example:

pst = pyemu.Pst("pest.pst")
obsvals = pst.process_output_files()
print(obsvals)
proportional_weights(fraction_stdev=1.0, wmax=100.0, leave_zero=True)

setup weights inversely proportional to the observation value

Parameters:
  • fraction_stdev (float, optional) – the fraction portion of the observation val to treat as the standard deviation. set to 1.0 for inversely proportional. Default is 1.0

  • wmax (float, optional) – maximum weight to allow. Default is 100.0

  • leave_zero (bool, optional) – flag to leave existing zero weights. Default is True

Example:

pst = pyemu.Pst("pest.pst")
# set the weights of the observations to 20% of the observed value
pst.proportional_weights(fraction_stdev=0.2,wmax=10)
pst.write("pest_propo.pst")
rectify_pgroups()

synchronize parameter groups section with the parameter data section

Note

This method is called during Pst.write() to make sure all parameter groups named in * parameter data are included. This is so users don’t have to manually keep this section up. This method can also be called during control file modifications to see what parameter groups are present and prepare for modifying the default values in the * parameter group section

Example:

pst = pyemu.Pst("my.pst")
pst.parameter_data.loc["par1","pargp"] = "new_group"
pst.rectify_groups()
pst.parameter_groups.loc["new_group","derinc"] = 1.0
rectify_pi()

rectify the prior information equation with the current state of the parameter_data dataframe.

Note

Equations that list fixed, tied or missing parameters are removed completely even if adjustable parameters are also listed in the equation. This method is called during Pst.write()

reg_data

‘* regularization’ section information. Access with standard PEST variable names.

Example:

pst.reg_data.phimlim = 1.00 #yeah right!
Type:

pyemu.pst.pst_controldata.RegData

rename_observations(name_dict, pst_path='.', insmap=None)

rename observations in the control and instruction files

Parameters:
  • name_dict (dict) – mapping of current to new names.

  • pst_path (str) – the path to the control file from where python is running. Default is “.” (python is running in the same directory as the control file)

Note

This does a lot of string compare, so its gonna be slow as…

Example:

pst = pyemu.Pst(os.path.join("template","pest.pst"))
name_dict = {"obs1":"obs1_better_name"}
pst.rename_observations(name_dict,pst_path="template")
rename_parameters(name_dict, pst_path='.', tplmap=None)

rename parameters in the control and template files

Parameters:
  • name_dict (dict) – mapping of current to new names.

  • pst_path (str) – the path to the control file from where python is running. Default is “.” (python is running in the same directory as the control file)

Note

no attempt is made to maintain the length of the marker strings in the template files, so if your model is sensitive to changes in spacing in the template file(s), this is not a method for you

This does a lot of string compare, so its gonna be slow as…

Example:

pst = pyemu.Pst(os.path.join("template","pest.pst"))
name_dict = {"par1":"par1_better_name"}
pst.rename_parameters(name_dict,pst_path="template")
property res

get the residuals dataframe attribute

Returns:

a dataframe containing the residuals information.

Return type:

pandas.DataFrame

Note

if the Pst.__res attribute has not been loaded, this call loads the res dataframe from a file

Example:

# print the observed and simulated values for non-zero weighted obs
print(pst.res.loc[pst.nnz_obs_names,["modelled","measured"]])
sanity_checks(forgive=False)

some basic check for strangeness

Parameters:

forgive (bool) – flag to forgive (warn) for issues. Default is False

Note

checks for duplicate names, atleast 1 adjustable parameter and at least 1 non-zero-weighted observation

Not nearly as comprehensive as pestchek

Example:

pst = pyemu.Pst("pest.pst")
pst.sanity_checks()
set_res(res)

reset the private Pst.res attribute.

Parameters:

res – (pandas.DataFrame or str): something to use as Pst.res attribute. If res is str, a dataframe is read from file res

svd_data

‘* singular value decomposition’ section information. Access with standard PEST variable names

Example:

pst.svd_data.maxsing = 100
Type:

pyemu.pst.pst_controldata.SvdData

property template_files

list of template file names

Returns:

a list of template file names, extracted from

Pst.model_input_data.pest_file. Returns None if this attribute is None

Return type:

[str]

Note

Use Pst.model_input_data to access the template-input file information for writing/modification

property tied

list of tied parameter names

Returns:

a dataframe of tied parameter information. Columns of tied are parnme and partied. Returns None if no tied parameters are found.

Return type:

pandas.DataFrame

try_parse_name_metadata()

try to add meta data columns to parameter and observation data based on item names. Used with the PstFrom process.

Note

metadata is identified in key-value pairs that are separated by a colon. each key-value pair is separated from others by underscore

This works with PstFrom style long names

This method is called programmtically during Pst.load()

write(new_filename, version=None, check_interface=False)

main entry point to write a pest control file.

Parameters:
  • new_filename (str) – name of the new pest control file

  • version (int) – flag for which version of control file to write (must be 1 or 2). if None, uses the number of pars to decide: if number of pars iis greater than 10,000, version 2 is used.

  • check_interface (bool) – flag to check the control file par and obs names against the names found in the template and instruction files. Default is False

Example:

pst = pyemu.Pst("my.pst")
pst.parrep("my.par")
pst.write(my_new.pst")
#write a version 2 control file
pst.write("my_new_v2.pst",version=2)
write_input_files(pst_path='.')

writes model input files using template files and current parval1 values.

Parameters:

pst_path (str) – the path to where control file and template files reside. Default is ‘.’

Note

adds “parval1_trans” column to Pst.parameter_data that includes the effect of scale and offset

Example:

pst = pyemu.Pst("my.pst")

# load final parameter values
pst.parrep("my.par")

# write new model input files with final parameter values
pst.write_input_files()
write_obs_summary_table(filename=None, group_names=None)
write a stand alone observation summary latex table or Excel shet
filename (str): filename. If None, use <case>.par.tex to write as LaTeX. If filename extention is ‘.xls’ or ‘.xlsx’,

tries to write as an Excel file. If filename is “none”, no table is written Default is None

Parameters:
  • filename (str) – filename. If filename is “none”, no table is written. If None, use <case>.obs.tex. If filename extention is ‘.xls’ or ‘.xlsx’, tries to write as an Excel file. Default is None

  • group_names (dict) – obs group names : table names. For example {“hds”:”simulated groundwater level”}. Default is None

Returns:

the summary observation group dataframe

Return type:

pandas.DataFrame

Example:

pst = pyemu.Pst("my.pst")
pst.write_obs_summary_table(filename="obs.tex")
write_par_summary_table(filename=None, group_names=None, sigma_range=4.0, report_in_linear_space=False)

write a stand alone parameter summary latex table or Excel sheet

Parameters:
  • filename (str) – filename. If None, use <case>.par.tex to write as LaTeX. If filename extention is ‘.xls’ or ‘.xlsx’, tries to write as an Excel file. If filename is “none”, no table is written Default is None

  • group_names (dict) – par group names : table names. For example {“w0”:”well stress period 1”}. Default is None

  • sigma_range (float) – number of standard deviations represented by parameter bounds. Default is 4.0, implying 95% confidence bounds

  • report_in_linear_space (bool) – flag, if True, that reports all logtransformed values in linear space. This renders standard deviation meaningless, so that column is skipped

Returns:

the summary parameter group dataframe

Return type:

pandas.DataFrame

Example:

pst = pyemu.Pst("my.pst")
pst.write_par_summary_table(filename="par.tex")
property zero_weight_obs_names

get the zero-weighted observation names

Returns:

a list of zero-weighted observation names

Return type:

[str]

class pyemu.Schur(jco, **kwargs)

Bases: LinearAnalysis

FOSM-based uncertainty and data-worth analysis

Parameters:
  • jco (varies, optional) – something that can be cast or loaded into a pyemu.Jco. Can be a str for a filename or pyemu.Matrix/pyemu.Jco object.

  • pst (varies, optional) – something that can be cast into a pyemu.Pst. Can be an str for a filename or an existing pyemu.Pst. If None, a pst filename is sought with the same base name as the jco argument (if passed)

  • parcov (varies, optional) – prior parameter covariance matrix. If str, a filename is assumed and the prior parameter covariance matrix is loaded from a file using the file extension (“.jcb”/”.jco” for binary, “.cov”/”.mat” for PEST-style ASCII matrix, or “.unc” for uncertainty files). If None, the prior parameter covariance matrix is constructed from the parameter bounds in LinearAnalysis.pst. Can also be a pyemu.Cov instance

  • obscov (varies, optional) – observation noise covariance matrix. If str, a filename is assumed and the noise covariance matrix is loaded from a file using the file extension (“.jcb”/”.jco” for binary, “.cov”/”.mat” for PEST-style ASCII matrix, or “.unc” for uncertainty files). If None, the noise covariance matrix is constructed from the obsevation weights in LinearAnalysis.pst. Can also be a pyemu.Cov instance

  • forecasts (varies, optional) – forecast sensitivity vectors. If str, first an observation name is assumed (a row in LinearAnalysis.jco). If that is not found, a filename is assumed and predictions are loaded from a file using the file extension. If [str], a list of observation names is assumed. Can also be a pyemu.Matrix instance, a numpy.ndarray or a collection. Note if the PEST++ option “++forecasts()” is set in the pest control file (under the pyemu.Pst.pestpp_options dictionary), then there is no need to pass this argument (unless you want to analyze different forecasts) of pyemu.Matrix or numpy.ndarray.

  • ref_var (float, optional) – reference variance. Default is 1.0

  • verbose (bool) – controls screen output. If str, a filename is assumed and and log file is written.

  • sigma_range (float, optional) – defines range of upper bound - lower bound in terms of standard deviation (sigma). For example, if sigma_range = 4, the bounds represent 4 * sigma. Default is 4.0, representing approximately 95% confidence of implied normal distribution. This arg is only used if constructing parcov from parameter bounds.

  • scale_offset (bool, optional) – flag to apply parameter scale and offset to parameter bounds when calculating prior parameter covariance matrix from bounds. This arg is onlyused if constructing parcov from parameter bounds.Default is True.

Note

This class is the primary entry point for FOSM-based uncertainty and dataworth analyses

This class replicates and extends the behavior of the PEST PREDUNC utilities.

Example:

#assumes "my.pst" exists
sc = pyemu.Schur(jco="my.jco",forecasts=["fore1","fore2"])
print(sc.get_forecast_summary())
print(sc.get_parameter_contribution())
get_added_obs_group_importance(reset_zero_weight=1.0)
A dataworth method to analyze the posterior uncertainty as a result of gaining

existing observations, tested by observation groups

Parameters:

reset_zero_weight (float, optional) – a flag to reset observations with zero weight in obslist_dict If reset_zero_weights passed as 0.0, no weights adjustments are made. Default is 1.0.

Returns:

A dataframe with index of observation group names and columns of forecast names. The values in the dataframe are the posterior variances of the forecasts resulting from gaining the information contained in each observation group. One row in the dataframe is labeled “base” - this is posterior forecast variance resulting from the notional calibration with the non-zero-weighed observations in Schur.pst. Conceptually, the forecast variance should either not change or decrease as a result of gaining new observations. The magnitude of the decrease represents the worth of the potential new observation(s) in each group being tested.

Return type:

pandas.DataFrame

Note

Observations in Schur.pst with zero weights are not included in the analysis unless reset_zero_weight is a float greater than zero. In most cases, users will want to reset zero-weighted observations as part dataworth testing process.

Example:

sc = pyemu.Schur("my.jco")
df = sc.get_added_obs_group_importance(reset_zero_weight=1.0)
get_added_obs_importance(obslist_dict=None, base_obslist=None, reset_zero_weight=1.0)
A dataworth method to analyze the posterior uncertainty as a result of gathering

some additional observations

Parameters:
  • obslist_dict (dict, optional) – a nested dictionary-list of groups of observations that are to be treated as gained/collected. key values become row labels in returned dataframe. If None, then every zero-weighted observation is tested sequentially. Default is None

  • base_obslist ([str], optional) – observation names to treat as the “existing” observations. The values of obslist_dict will be added to this list during each test. If None, then the values in each obslist_dict entry will be treated as the entire calibration dataset. That is, there are no existing observations. Default is None. Standard practice would be to pass this argument as pyemu.Schur.pst.nnz_obs_names so that existing, non-zero-weighted observations are accounted for in evaluating the worth of new yet-to-be-collected observations.

  • reset_zero_weight (float, optional) – a flag to reset observations with zero weight in obslist_dict If reset_zero_weights passed as 0.0, no weights adjustments are made. Default is 1.0.

Returns:

a dataframe with row labels (index) of obslist_dict.keys() and columns of forecast names. The values in the dataframe are the posterior variance of the forecasts resulting from notional inversion using the observations in obslist_dict[key value] plus the observations in base_obslist (if any). One row in the dataframe is labeled “base” - this is posterior forecast variance resulting from the notional calibration with the observations in base_obslist (if base_obslist is None, then the “base” row is the prior forecast variance). Conceptually, the forecast variance should either not change or decrease as a result of gaining additional observations. The magnitude of the decrease represents the worth of the potential new observation(s) being tested.

Return type:

pandas.DataFrame

Note

Observations listed in base_obslist is required to only include observations with weight not equal to zero. If zero-weighted observations are in base_obslist an exception will be thrown. In most cases, users will want to reset zero-weighted observations as part dataworth testing process. If reset_zero_weights == 0, no weights adjustments will be made - this is most appropriate if different weights are assigned to the added observation values in Schur.pst

Example:

sc = pyemu.Schur("my.jco")
obslist_dict = {"hds":["head1","head2"],"flux":["flux1","flux2"]}
df = sc.get_added_obs_importance(obslist_dict=obslist_dict,
                                 base_obslist=sc.pst.nnz_obs_names,
                                 reset_zero_weight=1.0)
get_conditional_instance(parameter_names)

get a new pyemu.Schur instance that includes conditional update from some parameters becoming known perfectly

Parameters:

parameter_names ([str]) – list of parameters that are to be treated as notionally perfectly known

Returns:

a new Schur instance conditional on perfect knowledge of some parameters. The new instance has an updated parcov that is less the names listed in parameter_names.

Return type:

pyemu.Schur

Note

This method is primarily for use by the LinearAnalysis.get_parameter_contribution() dataworth method.

get_forecast_summary()

summary of the FOSM-based forecast uncertainty (variance) estimate(s)

Returns:

dataframe of prior,posterior variances and percent uncertainty reduction of each forecast (e.g. prediction)

Return type:

pandas.DataFrame

Note

This is the primary entry point for accessing forecast uncertainty estimates “precent_reduction” column in dataframe is calculated as 100.0 * (1.0 - (posterior variance / prior variance)

Example:

sc = pyemu.Schur(jco="my.jcb",forecasts=["fore1","fore2"])
df = sc.get_parameter_summary()
df.loc[:,["prior","posterior"]].plot(kind="bar")
plt.show()
df.percent_reduction.plot(kind="bar")
plt.show()
get_obs_group_dict()

get a dictionary of observations grouped by observation group name

Returns:

a dictionary of observations grouped by observation group name. Useful for dataworth processing in pyemu.Schur

Return type:

dict

Note

only includes observations that are listed in Schur.jco.row_names

Example:

sc = pyemu.Schur("my.jco")
obsgrp_dict = sc.get_obs_group_dict()
df = sc.get_removed_obs_importance(obsgrp_dict=obsgrp_dict)
get_par_contribution(parlist_dict=None, include_prior_results=False)

A dataworth method to get a dataframe the prior and posterior uncertainty reduction as a result of some parameter becoming perfectly known

Parameters:
  • parlist_dict – (dict, optional): a nested dictionary-list of groups of parameters that are to be treated as perfectly known. key values become row labels in returned dataframe. If None, each adjustable parameter is sequentially treated as known and the returned dataframe has row labels for each adjustable parameter

  • include_prior_results (bool, optional) – flag to return a multi-indexed dataframe with both conditional prior and posterior forecast uncertainty estimates. This is because the notional learning about parameters potentially effects both the prior and posterior forecast uncertainty estimates. If False, only posterior results are returned. Default is False

Returns:

a dataframe that summarizes the parameter contribution dataworth analysis. The dataframe has index (row labels) of the keys in parlist_dict and a column labels of forecast names. The values in the dataframe are the posterior variance of the forecast conditional on perfect knowledge of the parameters in the values of parlist_dict. One row in the dataframe will be labeled base - this is the forecast uncertainty estimates that include the effects of all adjustable parameters. Percent decreases in forecast uncertainty can be calculated by differencing all rows against the “base” row. Varies depending on include_prior_results.

Return type:

pandas.DataFrame

Note

This is the primary dataworth method for assessing the contribution of one or more parameters to forecast uncertainty.

Example:

sc = pyemu.Schur(jco="my.jco")
parlist_dict = {"hk":["hk1","hk2"],"rech"["rech1","rech2"]}
df = sc.get_par_contribution(parlist_dict=parlist_dict)
get_par_group_contribution(include_prior_results=False)

A dataworth method to get the forecast uncertainty contribution from each parameter group

Parameters:

include_prior_results (bool, optional) – flag to return a multi-indexed dataframe with both conditional prior and posterior forecast uncertainty estimates. This is because the notional learning about parameters potentially effects both the prior and posterior forecast uncertainty estimates. If False, only posterior results are returned. Default is False

Returns:

a dataframe that summarizes the parameter contribution analysis. The dataframe has index (row labels) that are the parameter group names and a column labels of forecast names. The values in the dataframe are the posterior variance of the forecast conditional on perfect knowledge of the adjustable parameters in each parameter group. One row is labelled “base” - this is the variance of the forecasts that includes the effects of all adjustable parameters. Varies depending on include_prior_results.

Return type:

pandas.DataFrame

Note

This method is just a thin wrapper around get_contribution_dataframe() - this method automatically constructs the parlist_dict argument where the keys are the group names and the values are the adjustable parameters in the groups

Example:

sc = pyemu.Schur(jco="my.jco")
df = sc.get_par_group_contribution()
get_parameter_summary()

summary of the FOSM-based parameter uncertainty (variance) estimate(s)

Returns:

dataframe of prior,posterior variances and percent uncertainty reduction of each parameter

Return type:

pandas.DataFrame

Note

This is the primary entry point for accessing parameter uncertainty estimates

The “Prior” column in dataframe is the diagonal of LinearAnalysis.parcov “precent_reduction” column in dataframe is calculated as 100.0 * (1.0 - (posterior variance / prior variance)

Example:

sc = pyemu.Schur(jco="my.jcb",forecasts=["fore1","fore2"])
df = sc.get_parameter_summary()
df.loc[:,["prior","posterior"]].plot(kind="bar")
plt.show()
df.percent_reduction.plot(kind="bar")
plt.show()
get_removed_obs_group_importance(reset_zero_weight=None)
A dataworth method to analyze the posterior uncertainty as a result of losing

existing observations, tested by observation groups

Parameters:

DEPRECATED (reset_zero_weight) –

Returns:

A dataframe with index of observation group names and columns of forecast names. The values in the dataframe are the posterior variances of the forecasts resulting from losing the information contained in each observation group. One row in the dataframe is labeled “base” - this is posterior forecast variance resulting from the notional calibration with the non-zero-weighed observations in Schur.pst. Conceptually, the forecast variance should either not change or increase as a result of losing existing observations. The magnitude of the increase represents the worth of the existing observation(s) in each group being tested.

Return type:

pandas.DataFrame

Example:

sc = pyemu.Schur("my.jco")
df = sc.get_removed_obs_group_importance()
get_removed_obs_importance(obslist_dict=None, reset_zero_weight=None)
A dataworth method to analyze the posterior uncertainty as a result of losing

some existing observations

Parameters:
  • obslist_dict (dict, optional) – a nested dictionary-list of groups of observations that are to be treated as lost. key values become row labels in returned dataframe. If None, then every zero-weighted observation is tested sequentially. Default is None

  • DEPRECATED (reset_zero_weight) –

Returns:

A dataframe with index of obslist_dict.keys() and columns of forecast names. The values in the dataframe are the posterior variances of the forecasts resulting from losing the information contained in obslist_dict[key value]. One row in the dataframe is labeled “base” - this is posterior forecast variance resulting from the notional calibration with the non-zero-weighed observations in Schur.pst. Conceptually, the forecast variance should either not change or increase as a result of losing existing observations. The magnitude of the increase represents the worth of the existing observation(s) being tested.

Note: All observations that may be evaluated as removed must have non-zero weight

Return type:

pandas.DataFrame

Example:

sc = pyemu.Schur("my.jco")
df = sc.get_removed_obs_importance()
next_most_important_added_obs(forecast=None, niter=3, obslist_dict=None, base_obslist=None, reset_zero_weight=1.0)

find the most important observation(s) by sequentially evaluating the importance of the observations in obslist_dict.

Parameters:
  • forecast (str, optional) – name of the forecast to use in the ranking process. If more than one forecast has been listed, this argument is required. This is because the data worth must be ranked with respect to the variance reduction for a single forecast

  • niter (int, optional) – number of sequential dataworth testing iterations. Default is 3

  • obslist_dict (dict, optional) – a nested dictionary-list of groups of observations that are to be treated as gained/collected. key values become row labels in returned dataframe. If None, then every zero-weighted observation is tested sequentially. Default is None

  • base_obslist ([str], optional) – observation names to treat as the “existing” observations. The values of obslist_dict will be added to this list during each test. If None, then the values in each obslist_dict entry will be treated as the entire calibration dataset. That is, there are no existing observations. Default is None. Standard practice would be to pass this argument as pyemu.Schur.pst.nnz_obs_names so that existing, non-zero-weighted observations are accounted for in evaluating the worth of new yet-to-be-collected observations.

  • reset_zero_weight (float, optional) – a flag to reset observations with zero weight in obslist_dict If reset_zero_weights passed as 0.0, no weights adjustments are made. Default is 1.0.

Returns:

a dataFrame with columns of obslist_dict key for each iteration the yields the largest variance reduction for the named forecast. Columns are forecast variance percent reduction for each iteration (percent reduction compared to initial “base” case with all non-zero weighted observations included in the notional calibration)

Return type:

pandas.DataFrame

Note

The most important observations from each iteration is added to base_obslist and removed obslist_dict for the next iteration. In this way, the added observation importance values include the conditional information from the last iteration.

Example:

sc = pyemu.Schur(jco="my.jco")
df = sc.next_most_important_added_obs(forecast="fore1",base_obslist=sc.pst.nnz_obs_names)
next_most_par_contribution(niter=3, forecast=None, parlist_dict=None)

find the parameter(s) contributing most to posterior forecast by sequentially evaluating the contribution of parameters in parlist_dict.

Parameters:
  • forecast (str, optional) – name of the forecast to use in the ranking process. If more than one forecast has been listed, this argument is required. This is because the data worth must be ranked with respect to the variance reduction for a single forecast

  • niter (int, optional) – number of sequential dataworth testing iterations. Default is 3

  • parlist_dict (dict, optional) – dict a nested dictionary-list of groups of parameters that are to be treated as perfectly known. key values become row labels in dataframe

  • parlist_dict – a nested dictionary-list of groups of parameters that are to be treated as perfectly known (zero uncertainty). key values become row labels in returned dataframe. If None, then every adustable parameter is tested sequentially. Default is None. Conceptually, the forecast variance should either not change or decrease as a result of knowing parameter perfectly. The magnitude of the decrease represents the worth of gathering information about the parameter(s) being tested.

Note

The largest contributing parameters from each iteration are treated as known perfectly for the remaining iterations. In this way, the next iteration seeks the next most influential group of parameters.

Returns:

a dataframe with index of iteration number and columns of parlist_dict.keys(). The values are the results of the knowing each parlist_dict entry expressed as posterior variance reduction

Return type:

pandas.DataFrame

property posterior_forecast

posterior forecast (e.g. prediction) variance(s)

Returns:

dictionary of forecast names and FOSM-estimated posterior variances

Return type:

dict

Note

Sames as LinearAnalysis.posterior_prediction

See Schur.get_forecast_summary() for a dataframe-based container of prior and posterior variances

property posterior_parameter

posterior parameter covariance matrix.

Returns:

the posterior parameter covariance matrix

Return type:

pyemu.Cov

Example:

sc = pyemu.Schur(jco="my.jcb")
post_cov = sc.posterior_parameter
post_cov.to_ascii("post.cov")
property posterior_prediction

posterior prediction (e.g. forecast) variance estimate(s)

Returns:

dictionary of forecast names and FOSM-estimated posterior variances

Return type:

dict

Note:

sames as LinearAnalysis.posterior_forecast

See Schur.get_forecast_summary() for a dataframe-based container of prior and posterior variances