pyemu
pyEMU: python modules for Environmental Model Uncertainty analyses. These modules are designed to work directly and seamlessly with PEST and PEST++ model independent interface. pyEMU can also be used to setup this interface.
Several forms of uncertainty analyses are support including FOSM-based analyses (pyemu.Schur and pyemu.ErrVar), data worth analyses and high-dimensional ensemble generation.
Subpackages
Submodules
Package Contents
Classes
based class for handling ensembles of numeric values |
|
Observation noise ensemble in the PEST(++) realm |
|
Parameter ensembles in the PEST(++) realm |
|
FOSM-based error variance analysis |
|
The base/parent class for linear analysis. |
|
Diagonal and/or dense Covariance matrices |
|
a thin wrapper class to get more intuitive attribute names. Functions |
|
Easy linear algebra in the PEST(++) realm |
|
All things PEST(++) control file |
|
FOSM-based uncertainty and data-worth analysis |
- class pyemu.Ensemble(pst, df, istransformed=False)
Bases:
object
based class for handling ensembles of numeric values
- Parameters:
pst (pyemu.Pst) – a control file instance
df (pandas.DataFrame) – a pandas dataframe. Columns should be parameter/observation names. Index is treated as realization names
istransformed (bool) – flag to indicate parameter values are in log space. Not used for ObservationEnsemble
Example:
pst = pyemu.Pst("my.pst") pe = pyemu.ParameterEnsemble.from_gaussian_draw(pst)
- property istransformed
the parameter transformation status
- Returns:
flag to indicate whether or not the ParameterEnsemble is transformed with respect to log_{10}. Not used for (and has no effect on) ObservationEnsemble.
- Return type:
bool
Note
parameter transformation status is only related to log_{10} and does not include the effects of scale and/or offset
- pst
control file instance
- Type:
pyemu.Pst
- __repr__()
Return repr(self).
- __str__()
Return str(self).
- __sub__(other)
- __mul__(other)
- __truediv__(other)
- __add__(other)
- __pow__(pow)
- static reseed()
reset the numpy.random.seed
Note
reseeds using the pyemu.en.SEED global variable
The pyemu.en.SEED value is set as the numpy.random.seed on import, so make sure you know what you are doing if you call this method…
- copy()
get a copy of Ensemble
- Returns:
copy of this Ensemble
- Return type:
Ensemble
Note
copies both Ensemble.pst and Ensemble._df: can be expensive
- transform()
transform parameters with respect to partrans value.
Note
operates in place (None is returned).
Parameter transform is only related to log_{10} and does not include the effects of scale and/or offset
Ensemble.transform() is only provided for inheritance purposes. It only changes the `Ensemble._transformed flag
- back_transform()
back transform parameters with respect to partrans value.
Note
operates in place (None is returned).
Parameter transform is only related to log_{10} and does not include the effects of scale and/or offset
Ensemble.back_transform() is only provided for inheritance purposes. It only changes the `Ensemble._transformed flag
- __getattr__(item)
- plot(bins=10, facecolor='0.5', plot_cols=None, filename='ensemble.pdf', func_dict=None, **kwargs)
plot ensemble histograms to multipage pdf
- Parameters:
bins (int) – number of bins for the histograms
facecolor (str) – matplotlib color (e.g. r,`g`, etc)
plot_cols ([str]) – list of subset of ensemble columns to plot. If None, all are plotted. Default is None
filename (str) – multipage pdf filename. Default is “ensemble.pdf”
func_dict (dict) – a dict of functions to apply to specific columns. For example: {“par1”: np.log10}
**kwargs (dict) – addkeyword args to pass to pyemu.plot_utils.ensemble_helper()
Example:
pst = pyemu.Pst("my.pst") pe = pyemu.ParameterEnsemble.from_gaussian_draw(pst) pe.transform() # plot log space (if needed) pe.plot(bins=30)
- classmethod from_binary(pst, filename)
create an Ensemble from a PEST-style binary file
- Parameters:
pst (pyemu.Pst) – a control file instance
filename (str) – filename containing binary ensemble
- Returns:
the ensemble loaded from the binary file
- Return type:
Ensemble
Example:
pst = pyemu.Pst("my.pst") oe = pyemu.ObservationEnsemble.from_binary("obs.jcb")
- classmethod from_csv(pst, filename, *args, **kwargs)
create an Ensemble from a CSV file
- Parameters:
pst (pyemu.Pst) – a control file instance
filename (str) – filename containing CSV ensemble
([object] (*args) – positional arguments to pass to pandas.read_csv().
({str (**kwargs) – object}): keyword arguments to pass to pandas.read_csv().
- Returns:
Ensemble
Note
uses pandas.read_csv() to load numeric values from CSV file
Example:
pst = pyemu.Pst("my.pst") oe = pyemu.ObservationEnsemble.from_csv("obs.csv")
- to_csv(filename, *args, **kwargs)
write Ensemble to a CSV file
- Parameters:
filename (str) – file to write
([object] (*args) – positional arguments to pass to pandas.DataFrame.to_csv().
({str (**kwargs) – object}): keyword arguments to pass to pandas.DataFrame.to_csv().
Example:
pst = pyemu.Pst("my.pst") oe = pyemu.ObservationEnsemble.from_gaussian_draw(pst) oe.to_csv("obs.csv")
Note
back transforms ParameterEnsemble before writing so that values are in arithmetic space
- to_binary(filename)
write Ensemble to a PEST-style binary file
- Parameters:
filename (str) – file to write
Example:
pst = pyemu.Pst("my.pst") oe = pyemu.ObservationEnsemble.from_gaussian_draw(pst) oe.to_binary("obs.jcb")
Note
back transforms ParameterEnsemble before writing so that values are in arithmetic space
- to_dense(filename)
write Ensemble to a dense-format binary file
- Parameters:
filename (str) – file to write
Example:
pst = pyemu.Pst("my.pst") oe = pyemu.ObservationEnsemble.from_gaussian_draw(pst) oe.to_dense("obs.bin")
Note
back transforms ParameterEnsemble before writing so that values are in arithmatic space
- classmethod from_dataframe(pst, df, istransformed=False)
- static _gaussian_draw(cov, mean_values, num_reals, grouper=None, fill=True, factor='eigen')
- static _get_svd_projection_matrix(x, maxsing=None, eigthresh=1e-07)
- static _get_eigen_projection_matrix(x)
- get_deviations(center_on=None)
get the deviations of the realizations around a certain point in ensemble space
- Parameters:
center_on (str, optional) – a realization name to use as the centering point in ensemble space. If None, the mean vector is treated as the centering point. Default is None
- Returns:
an ensemble of deviations around the centering point
- Return type:
Ensemble
Note
deviations are the Euclidean distances from the center_on value to realized values for each column
center_on=None yields the classic ensemble smoother/ensemble Kalman filter deviations from the mean vector
Deviations respect log-transformation status.
Example:
pst = pyemu.Pst("my.pst") oe = pyemu.ObservationEnsemble.from_gaussian_draw(pst) oe.add_base() oe_dev = oe.get_deviations(center_on="base") oe.to_csv("obs_base_devs.csv")
- as_pyemu_matrix(typ=None)
get a pyemu.Matrix instance of Ensemble
- Parameters:
typ (pyemu.Matrix or pyemu.Cov) – the type of matrix to return. Default is pyemu.Matrix
- Returns:
a matrix instance
- Return type:
pyemu.Matrix
Example:
oe = pyemu.ObservationEnsemble.from_gaussian_draw(pst=pst,num_reals=100) dev_mat = oe.get_deviations().as_pyemu_matrix(typ=pyemu.Cov) obscov = dev_mat.T * dev_mat
- covariance_matrix(localizer=None, center_on=None)
get a empirical covariance matrix implied by the correlations between realizations
- Parameters:
localizer (pyemu.Matrix, optional) – a matrix to localize covariates in the resulting covariance matrix. Default is None
center_on (str, optional) – a realization name to use as the centering point in ensemble space. If None, the mean vector is treated as the centering point. Default is None
- Returns:
the empirical (and optionally localized) covariance matrix
- Return type:
pyemu.Cov
- dropna(*args, **kwargs)
override of pandas.DataFrame.dropna()
- Parameters:
([object] (*args) – positional arguments to pass to pandas.DataFrame.dropna().
({str (**kwargs) – object}): keyword arguments to pass to pandas.DataFrame.dropna().
- class pyemu.ObservationEnsemble(pst, df, istransformed=False)
Bases:
Ensemble
Observation noise ensemble in the PEST(++) realm
- Parameters:
pst (pyemu.Pst) – a control file instance
df (pandas.DataFrame) – a pandas dataframe. Columns should be observation names. Index is treated as realization names
istransformed (bool) – flag to indicate parameter values are in log space. Not used for ObservationEnsemble
Example:
pst = pyemu.Pst("my.pst") oe = pyemu.ObservationEnsemble.from_gaussian_draw(pst)
- property phi_vector
vector of L2 norm (phi) for the realizations (rows) of Ensemble.
- Returns:
series of realization name (Ensemble.index) and phi values
- Return type:
pandas.Series
Note
The ObservationEnsemble.pst.weights can be updated prior to calling this method to evaluate new weighting strategies
- property nonzero
get a new ObservationEnsemble of just non-zero weighted observations
- Returns:
non-zero weighted observation ensemble.
- Return type:
ObservationEnsemble
Note
The pst attribute of the returned ObservationEnsemble also only includes non-zero weighted observations (and is therefore not valid for running with PEST or PEST++)
- classmethod from_gaussian_draw(pst, cov=None, num_reals=100, by_groups=True, fill=False, factor='eigen')
generate an ObservationEnsemble from a (multivariate) gaussian distribution
- Parameters:
pst (pyemu.Pst) – a control file instance.
cov (pyemu.Cov) – a covariance matrix describing the second moment of the gaussian distribution. If None, cov is generated from the non-zero-weighted observation weights in pst. Only observations listed in cov are sampled. Other observations are assigned the obsval value from pst.
num_reals (int) – number of stochastic realizations to generate. Default is 100
by_groups (bool) – flag to generate realzations be observation group. This assumes no correlation (covariates) between observation groups.
fill (bool) – flag to fill in zero-weighted observations with control file values. Default is False.
factor (str) – how to factorize cov to form the projectin matrix. Can be “eigen” or “svd”. The “eigen” option is default and is faster. But for (nearly) singular cov matrices (such as those generated empirically from ensembles), “svd” is the only way. Ignored for diagonal cov.
- Returns:
the realized ObservationEnsemble instance
- Return type:
ObservationEnsemble
Note
Only observations named in cov are sampled. Additional, cov is processed prior to sampling to only include non-zero-weighted observations depending on the value of fill. So users must take care to make sure observations have been assigned non-zero weights even if cov is being passed
The default cov is generated from pyemu.Cov.from_observation_data, which assumes observation noise standard deviations are the inverse of the weights listed in pst
Example:
pst = pyemu.Pst("my.pst") # the easiest way - just relying on weights in pst oe1 = pyemu.ObservationEnsemble.from_gaussian_draw(pst) # generate the cov explicitly cov = pyemu.Cov.from_observation_data(pst) oe2 = pyemu.ObservationEnsemble.from_gaussian_draw(pst,cov=cov) # give all but one observation zero weight. This will # result in an oe with only one randomly sampled observation noise # vector since the cov is processed to remove any zero-weighted # observations before sampling pst.observation_data.loc[pst.nnz_obs_names[1:],"weight] = 0.0 oe3 = pyemu.ObservationEnsemble.from_gaussian_draw(pst,cov=cov)
- get_phi_vector(noise_obs_filename=None, noise_obs_flag=False)
- add_base()
add the control file obsval values as a realization
Note
replaces the last realization with the current ObservationEnsemble.pst.observation_data.obsval values as a new realization named “base”
the PEST++ enemble tools will add this realization also if you dont wanna fool with it here…
- class pyemu.ParameterEnsemble(pst, df, istransformed=False)
Bases:
Ensemble
Parameter ensembles in the PEST(++) realm
- Parameters:
pst (pyemu.Pst) – a control file instance
df (pandas.DataFrame) – a pandas dataframe. Columns should be parameter names. Index is treated as realization names
istransformed (bool) – flag to indicate parameter values are in log space (if partrans is “log” in pst)
Example:
pst = pyemu.Pst("my.pst") pe = pyemu.ParameterEnsemble.from_gaussian_draw(pst)
- property adj_names
the names of adjustable parameters in ParameterEnsemble
- Returns:
adjustable parameter names
- Return type:
[str]
- property ubnd
the upper bound vector while respecting current log transform status
- Returns:
(log-transformed) upper parameter bounds listed in ParameterEnsemble.pst.parameter_data.parubnd
- Return type:
pandas.Series
- property lbnd
the lower bound vector while respecting current log transform status
- Returns:
(log-transformed) lower parameter bounds listed in ParameterEnsemble.pst.parameter_data.parlbnd
- Return type:
pandas.Series
- property log_indexer
boolean indexer for log transform
- Returns:
boolean array indicating which parameters are log transformed
- Return type:
numpy.ndarray(bool)
- property fixed_indexer
boolean indexer for non-adjustable parameters
- Returns:
boolean array indicating which parameters have partrans equal to “log” or “fixed”
- Return type:
numpy.ndarray(bool)
- classmethod from_gaussian_draw(pst, cov=None, num_reals=100, by_groups=True, fill=True, factor='eigen')
generate a ParameterEnsemble from a (multivariate) (log) gaussian distribution
- Parameters:
pst (pyemu.Pst) – a control file instance.
cov (pyemu.Cov) – a covariance matrix describing the second moment of the gaussian distribution. If None, cov is generated from the bounds of the adjustable parameters in pst. the (log) width of the bounds is assumed to represent a multiple of the parameter standard deviation (this is the sigma_range argument that can be passed to pyemu.Cov.from_parameter_data).
num_reals (int) – number of stochastic realizations to generate. Default is 100
by_groups (bool) – flag to generate realizations be parameter group. This assumes no correlation (covariates) between parameter groups. For large numbers of parameters, this help prevent memories but is slower.
fill (bool) – flag to fill in fixed and/or tied parameters with control file values. Default is True.
factor (str) – how to factorize cov to form the projection matrix. Can be “eigen” or “svd”. The “eigen” option is default and is faster. But for (nearly) singular cov matrices (such as those generated empirically from ensembles), “svd” is the only way. Ignored for diagonal cov.
- Returns:
the parameter ensemble realized from the gaussian distribution
- Return type:
ParameterEnsemble
Note
Only parameters named in cov are sampled. Missing parameters are assigned values of pst.parameter_data.parval1 along the corresponding columns of ParameterEnsemble according to the value of fill.
The default cov is generated from pyemu.Cov.from_observation_data, which assumes parameter bounds in ParameterEnsemble.pst represent some multiple of parameter standard deviations. Additionally, the default Cov only includes adjustable parameters (partrans not “tied” or “fixed”).
“tied” parameters are not sampled.
Example:
pst = pyemu.Pst("my.pst") # the easiest way - just relying on weights in pst pe1 = pyemu.ParameterEnsemble.from_gaussian_draw(pst) # generate the cov explicitly with a sigma_range cov = pyemu.Cov.from_parameter_data(pst,sigma_range=6) [e2 = pyemu.ParameterEnsemble.from_gaussian_draw(pst,cov=cov)
- classmethod from_triangular_draw(pst, num_reals=100, fill=True)
generate a ParameterEnsemble from a (multivariate) (log) triangular distribution
- Parameters:
pst (pyemu.Pst) – a control file instance
num_reals (int, optional) – number of realizations to generate. Default is 100
fill (bool) – flag to fill in fixed and/or tied parameters with control file values. Default is True.
- Returns:
a parameter ensemble drawn from the multivariate (log) triangular distribution defined by the parameter upper and lower bounds and initial parameter values in pst
- Return type:
ParameterEnsemble
Note
respects transformation status in pst: fixed and tied parameters are not realized, log-transformed parameters are drawn in log space. The returned ParameterEnsemble is back transformed (not in log space)
uses numpy.random.triangular
Example:
pst = pyemu.Pst("my.pst") pe = pyemu.ParameterEnsemble.from_triangular_draw(pst) pe.to_csv("my_tri_pe.csv")
- classmethod from_uniform_draw(pst, num_reals, fill=True)
generate a ParameterEnsemble from a (multivariate) (log) uniform distribution
- Parameters:
pst (pyemu.Pst) – a control file instance
num_reals (int, optional) – number of realizations to generate. Default is 100
fill (bool) – flag to fill in fixed and/or tied parameters with control file values. Default is True.
- Returns:
a parameter ensemble drawn from the multivariate (log) uniform distribution defined by the parameter upper and lower bounds pst
- Return type:
ParameterEnsemble
Note
respects transformation status in pst: fixed and tied parameters are not realized, log-transformed parameters are drawn in log space. The returned ParameterEnsemble is back transformed (not in log space)
uses numpy.random.uniform
Example:
pst = pyemu.Pst("my.pst") pe = pyemu.ParameterEnsemble.from_uniform_draw(pst) pe.to_csv("my_uni_pe.csv")
- classmethod from_mixed_draws(pst, how_dict, default='gaussian', num_reals=100, cov=None, sigma_range=6, enforce_bounds=True, partial=False, fill=True)
generate a ParameterEnsemble using a mixture of distributions. Available distributions include (log) “uniform”, (log) “triangular”, and (log) “gaussian”. log transformation is respected.
- Parameters:
pst (pyemu.Pst) – a control file
how_dict (dict) – a dictionary of parameter name keys and “how” values, where “how” can be “uniform”,”triangular”, or “gaussian”.
default (str) – the default distribution to use for parameter not listed in how_dict. Default is “gaussian”.
num_reals (int) – number of realizations to draw. Default is 100.
cov (pyemu.Cov) – an optional Cov instance to use for drawing from gaussian distribution. If None, and “gaussian” is listed in how_dict (and/or default), then a diagonal covariance matrix is constructed from the parameter bounds in pst (with sigma_range). Default is None.
sigma_range (float) – the number of standard deviations implied by the parameter bounds in the pst. Only used if “gaussian” is in how_dict (and/or default) and cov is None. Default is 6.
enforce_bounds (bool) – flag to enforce parameter bounds in resulting ParameterEnsemble. Only matters if “gaussian” is in values of how_dict. Default is True.
partial (bool) – flag to allow a partial ensemble (not all pars included). If True, parameters not name in how_dict will be sampled using the distribution named as default. Default is False.
fill (bool) – flag to fill in fixed and/or tied parameters with control file values. Default is True.
Example:
pst = pyemu.Pst("pest.pst") # uniform for the fist 10 pars how_dict = {p:"uniform" for p in pst.adj_par_names[:10]} pe = pyemu.ParameterEnsemble(pst,how_dict=how_dict) pe.to_csv("my_mixed_pe.csv")
- classmethod from_parfiles(pst, parfile_names, real_names=None)
create a parameter ensemble from PEST-style parameter value files. Accepts parfiles with less than the parameters in the control (get NaNs in the ensemble) or extra parameters in the parfiles (get dropped)
- Parameters:
pst (pyemu.Pst) – control file instance
parfile_names ([str]) – par file names
real_names (str) – optional list of realization names. If None, a single integer counter is used
- Returns:
parameter ensemble loaded from par files
- Return type:
ParameterEnsemble
- back_transform()
back transform parameters with respect to partrans value.
Note
operates in place (None is returned).
Parameter transform is only related to log_{10} and does not include the effects of scale and/or offset
- transform()
transform parameters with respect to partrans value.
Note
operates in place (None is returned).
Parameter transform is only related to log_{10} and does not include the effects of scale and/or offset
- add_base()
add the control file obsval values as a realization
Note
replaces the last realization with the current ParameterEnsemble.pst.parameter_data.parval1 values as a new realization named “base”
The PEST++ ensemble tools will add this realization also if you dont wanna fool with it here…
- project(projection_matrix, center_on=None, log=None, enforce_bounds='reset')
project the ensemble using the null-space Monte Carlo method
- Parameters:
projection_matrix (pyemu.Matrix) – null-space projection operator.
center_on (str) – the name of the realization to use as the centering point for the null-space differening operation. If center_on is None, the ParameterEnsemble mean vector is used. Default is None
log (pyemu.Logger, optional) – for logging progress
enforce_bounds (str) – parameter bound enforcement option to pass to ParameterEnsemble.enforce(). Valid options are reset, drop, scale or None. Default is reset.
- Returns:
untransformed, null-space projected ensemble.
- Return type:
ParameterEnsemble
Example:
ev = pyemu.ErrVar(jco="my.jco") #assumes my.pst exists pe = pyemu.ParameterEnsemble.from_gaussian_draw(ev.pst) pe_proj = pe.project(ev.get_null_proj(maxsing=25)) pe_proj.to_csv("proj_par.csv")
- enforce(how='reset', bound_tol=0.0)
entry point for bounds enforcement.
- Parameters:
enforce_bounds (str) – can be ‘reset’ to reset offending values or ‘drop’ to drop offending realizations. Default is “reset”
Note
In very high dimensions, the “drop” and “scale” how types will result in either very few realizations or very short realizations.
Example:
pst = pyemu.Pst("my.pst") pe = pyemu.ParameterEnsemble.from_gaussian_draw() pe.enforce(how="scale") pe.to_csv("par.csv")
- _enforce_scale(bound_tol)
- _enforce_drop(bound_tol)
enforce parameter bounds on the ensemble by dropping violating realizations
Note
with a large (realistic) number of parameters, the probability that any one parameter is out of bounds is large, meaning most realization will be dropped.
- _enforce_reset(bound_tol)
enforce parameter bounds on the ensemble by resetting violating vals to bound
- class pyemu.ErrVar(jco, **kwargs)
Bases:
pyemu.la.LinearAnalysis
FOSM-based error variance analysis
- Parameters:
jco (varies, optional) – something that can be cast or loaded into a pyemu.Jco. Can be a str for a filename or pyemu.Matrix/pyemu.Jco object.
pst (varies, optional) – something that can be cast into a pyemu.Pst. Can be an str for a filename or an existing pyemu.Pst. If None, a pst filename is sought with the same base name as the jco argument (if passed)
parcov (varies, optional) – prior parameter covariance matrix. If str, a filename is assumed and the prior parameter covariance matrix is loaded from a file using the file extension (“.jcb”/”.jco” for binary, “.cov”/”.mat” for PEST-style ASCII matrix, or “.unc” for uncertainty files). If None, the prior parameter covariance matrix is constructed from the parameter bounds in LinearAnalysis.pst. Can also be a pyemu.Cov instance
obscov (varies, optional) – observation noise covariance matrix. If str, a filename is assumed and the noise covariance matrix is loaded from a file using the file extension (“.jcb”/”.jco” for binary, “.cov”/”.mat” for PEST-style ASCII matrix, or “.unc” for uncertainty files). If None, the noise covariance matrix is constructed from the obsevation weights in LinearAnalysis.pst. Can also be a pyemu.Cov instance
forecasts (varies, optional) – forecast sensitivity vectors. If str, first an observation name is assumed (a row in LinearAnalysis.jco). If that is not found, a filename is assumed and predictions are loaded from a file using the file extension. If [str], a list of observation names is assumed. Can also be a pyemu.Matrix instance, a numpy.ndarray or a collection. Note if the PEST++ option “++forecasts()” is set in the pest control file (under the pyemu.Pst.pestpp_options dictionary), then there is no need to pass this argument (unless you want to analyze different forecasts) of pyemu.Matrix or numpy.ndarray.
ref_var (float, optional) – reference variance. Default is 1.0
verbose (bool) – controls screen output. If str, a filename is assumed and and log file is written.
sigma_range (float, optional) – defines range of upper bound - lower bound in terms of standard deviation (sigma). For example, if sigma_range = 4, the bounds represent 4 * sigma. Default is 4.0, representing approximately 95% confidence of implied normal distribution. This arg is only used if constructing parcov from parameter bounds.
scale_offset (bool, optional) – flag to apply parameter scale and offset to parameter bounds when calculating prior parameter covariance matrix from bounds. This arg is onlyused if constructing parcov from parameter bounds.Default is True.
omitted_parameters ([str]) – list of parameters to treat as “omitted”. Passing this argument activates 3-term error variance analysis.
omitted_parcov (varies) – an argument that can be cast to a parcov for the omitted parameters. If None, omitted_parcov will be formed by extracting a sub-matrix from the LinearAnalsis.parcov attribute.
omitted_predictions (varies) – an argument that can be cast to a “predictions” (e.g. “forecasts”) attribute to form prediction sensitivity vectors with respec to the omitted parameters. If None, these vectors will be extracted from the pyemu.LinearAnalysis.predictions attribute
kl (bool, optional) – flag to perform Karhunen-Loeve scaling on the jacobian before error variance calculations. If True, the pyemu.ErrVar.jco and pyemu.ErrVar.parcov are altered in place. Default is False.
Example:
ev = pyemu.ErrVar(jco="my.jco",omitted_parameters=["wel1","wel2"]) df = ev.get_errvar_dataframe()
- property omitted_predictions
omitted prediction sensitivity vectors
- Returns:
a matrix of prediction sensitivity vectors (column wise) to omitted parameters
- Return type:
pyemu.Matrix
- property omitted_jco
the omitted-parameters jacobian matrix
- Returns:
the jacobian matrix instance of non-zero-weighted observations and omitted parameters
- Return type:
pyemu.Jco
- property omitted_parcov
the prior omitted-parameter covariance matrix
- Returns:
the prior parameter covariance matrix of the omitted parameters
- Return type:
pyemu.Cov
- __load_omitted_predictions()
private: set the omitted_predictions attribute
- __load_omitted_parcov()
private: set the omitted_parcov attribute
- __load_omitted_jco()
private: set the omitted jco attribute
- get_errvar_dataframe(singular_values=None)
primary entry point for error variance analysis.
- Parameters:
singular_values ([int], optional) – a list singular values to test. If None, defaults to range(0,min(nnz_obs,nadj_par) + 1).
- Returns:
a multi-indexed pandas dataframe summarizing each of the error variance terms for each nominated forecast. Rows are the singluar values tested, columns are a multi-index of forecast name and error variance term number (e.g. 1,2 or (optionally) 3).
- Return type:
pandas.DataFrame
Example:
ev = pyemu.ErrVar(jco="my.jco",omitted_parameters=["wel1","wel2"]) df = ev.get_errvar_dataframe()
- get_identifiability_dataframe(singular_value=None, precondition=False)
primary entry point for identifiability analysis
- Parameters:
singular_value (int) – the singular spectrum truncation point. Defaults to minimum of non-zero-weighted observations and adjustable parameters
precondition (bool) – flag to use the preconditioned hessian with the prior parameter covariance matrix (xtqt + sigma_theta^-1). This should be used KL scaling. Default is False.
- Returns:
A pandas dataframe of the right solution-space singular vectors and identifiability (identifiabiity is in the column labeled “ident”)
- Return type:
pandas.DataFrame
Examples:
ev = pyemu.ErrVar(jco="my.jco") df = ev.get_identifiability_dataframe(singular_value=20) df.ident.plot(kind="bar")
- variance_at(singular_value)
- get the error variance of all three error variance terms at a
given singluar value
- Parameters:
singular_value (int) – singular value to test
- Returns:
dictionary of (err var term,prediction_name), variance pairs
- Return type:
dict
- R(singular_value)
get resolution Matrix (V_1 * V_1^T) at a given singular value
Args: singular_value (int): singular value to calculate R at
- Returns:
resolution matrix at singular_value
- Return type:
pyemu.Matrix
- I_minus_R(singular_value)
get I - R at a given singular value
- Parameters:
singular_value (int) – singular value to calculate I - R at
- Returns:
identity matrix minus resolution matrix at singular_value
- Return type:
pyemu.Matrix
- G(singular_value)
get the parameter solution Matrix at a given singular value
- Parameters:
singular_value (int) – singular value to calc G at
- Returns:
parameter solution matrix (V_1 * S_1^(_1) * U_1^T) at singular_value
- Return type:
pyemu.Matrix
- first_forecast(singular_value)
- get the null space term (first term) contribution to forecast (e.g. prediction)
error variance at a given singular value.
- Parameters:
singular_value (int) – singular value to calc first term at
Note
This method is used to construct the error variance dataframe
Just a wrapper around ErrVar.first_forecast
- Returns:
dictionary of (“first”,prediction_names),error variance pairs at singular_value
- Return type:
dict
- first_prediction(singular_value)
- get the null space term (first term) contribution to prediction error variance
at a given singular value.
- Parameters:
singular_value (int) – singular value to calc first term at
Note
This method is used to construct the error variance dataframe
- Returns:
dictionary of (“first”,prediction_names),error variance pairs at singular_value
- Return type:
dict
- first_parameter(singular_value)
- get the null space term (first term) contribution to parameter error variance
at a given singular value
- Parameters:
singular_value (int) – singular value to calc first term at
- Returns:
first term contribution to parameter error variance
- Return type:
pyemu.Cov
- second_forecast(singular_value)
get the solution space contribution to forecast (e.g. “prediction”) error variance at a given singular value
- Parameters:
singular_value (int) – singular value to calc second term at
Note
This method is used to construct error variance dataframe
Just a thin wrapper around ErrVar.second_prediction
- Returns:
dictionary of (“second”,prediction_names), error variance arising from the solution space contribution (y^t * G * obscov * G^T * y)
- Return type:
dict
- second_prediction(singular_value)
get the solution space contribution to predictive error variance at a given singular value
- Parameters:
singular_value (int) – singular value to calc second term at
Note
This method is used to construct error variance dataframe
- Returns: dict: dictionary of (“second”,prediction_names), error variance
arising from the solution space contribution (y^t * G * obscov * G^T * y)
- second_parameter(singular_value)
- get the solution space contribution to parameter error variance
at a given singular value (G * obscov * G^T)
- Parameters:
singular_value (int) – singular value to calc second term at
- Returns:
the second term contribution to parameter error variance (G * obscov * G^T)
- Return type:
pyemu.Cov
- third_forecast(singular_value)
- get the omitted parameter contribution to forecast (prediction) error variance
at a given singular value.
- Parameters:
singular_value (int) – singular value to calc third term at
Note
used to construct error variance dataframe just a thin wrapper around ErrVar.third_prediction()
- Returns:
a dictionary of (“third”,prediction_names),error variance
- Return type:
dict
- third_prediction(singular_value)
- get the omitted parameter contribution to prediction error variance
at a given singular value.
- Parameters:
singular_value (int) – singular value to calc third term at
Note
used to construct error variance dataframe
- Returns:
a dictionary of (“third”,prediction_names),error variance
- Return type:
dict
- third_parameter(singular_value)
- get the omitted parameter contribution to parameter error variance
at a given singular value
- Parameters:
singular_value (int) – singular value to calc third term at
- Returns:
the third term contribution to parameter error variance calculated at singular_value (G * omitted_jco * Sigma_(omitted_pars) * omitted_jco^T * G^T). Returns 0.0 if third term calculations are not being used.
- Return type:
pyemu.Cov
- get_null_proj(maxsing=None, eigthresh=1e-06)
get a null-space projection matrix of XTQX
- Parameters:
maxsing (int, optional) – number of singular components to use (the truncation point). If None, pyemu.Matrx.get_maxsing() is used to determine the truncation point with `eigthresh. Default is None
eigthresh (float, optional) – the ratio of smallest to largest singular value to keep in the range (solution) space of XtQX. Not used if maxsing is not None. Default is 1.0e-6
Note
used for null-space monte carlo operations.
- Returns:
pyemu.Matrix the null-space projection matrix (V2V2^T)
- class pyemu.LinearAnalysis(jco=None, pst=None, parcov=None, obscov=None, predictions=None, ref_var=1.0, verbose=False, resfile=False, forecasts=None, sigma_range=4.0, scale_offset=True, **kwargs)
Bases:
object
The base/parent class for linear analysis.
- Parameters:
jco (varies, optional) – something that can be cast or loaded into a pyemu.Jco. Can be a str for a filename or pyemu.Matrix/pyemu.Jco object.
pst (varies, optional) – something that can be cast into a pyemu.Pst. Can be an str for a filename or an existing pyemu.Pst. If None, a pst filename is sought with the same base name as the jco argument (if passed)
parcov (varies, optional) – prior parameter covariance matrix. If str, a filename is assumed and the prior parameter covariance matrix is loaded from a file using the file extension (“.jcb”/”.jco” for binary, “.cov”/”.mat” for PEST-style ASCII matrix, or “.unc” for uncertainty files). If None, the prior parameter covariance matrix is constructed from the parameter bounds in LinearAnalysis.pst. Can also be a pyemu.Cov instance
obscov (varies, optional) – observation noise covariance matrix. If str, a filename is assumed and the noise covariance matrix is loaded from a file using the file extension (“.jcb”/”.jco” for binary, “.cov”/”.mat” for PEST-style ASCII matrix, or “.unc” for uncertainty files). If None, the noise covariance matrix is constructed from the obsevation weights in LinearAnalysis.pst. Can also be a pyemu.Cov instance
forecasts (varies, optional) – forecast sensitivity vectors. If str, first an observation name is assumed (a row in LinearAnalysis.jco). If that is not found, a filename is assumed and predictions are loaded from a file using the file extension. If [str], a list of observation names is assumed. Can also be a pyemu.Matrix instance, a numpy.ndarray or a collection of pyemu.Matrix or numpy.ndarray.
ref_var (float, optional) – reference variance. Default is 1.0
verbose (bool) – controls screen output. If str, a filename is assumed and and log file is written.
sigma_range (float, optional) – defines range of upper bound - lower bound in terms of standard deviation (sigma). For example, if sigma_range = 4, the bounds represent 4 * sigma. Default is 4.0, representing approximately 95% confidence of implied normal distribution. This arg is only used if constructing parcov from parameter bounds.
scale_offset (bool, optional) – flag to apply parameter scale and offset to parameter bounds when calculating prior parameter covariance matrix from bounds. This arg is onlyused if constructing parcov from parameter bounds.Default is True.
Note
Can be used directly, but for prior uncertainty analyses only.
The derived types (pyemu.Schur, pyemu.ErrVar) are for different forms of FOMS-based posterior uncertainty analyses.
This class tries hard to not load items until they are needed; all arguments are optional.
The class makes heavy use of property decorator to encapsulated private attributes
Example:
#assumes "my.pst" exists la = pyemu.LinearAnalysis(jco="my.jco",forecasts=["fore1","fore2"]) print(la.prior_forecast)
- property forecast_names
get the forecast (aka prediction) names
- Returns:
list of forecast names
- Return type:
([str])
- property parcov
get the prior parameter covariance matrix attribute
- Returns:
a reference to the LinearAnalysis.parcov attribute
- Return type:
pyemu.Cov
- property obscov
get the observation noise covariance matrix attribute
- Returns:
a reference to the LinearAnalysis.obscov attribute
- Return type:
pyemu.Cov
- property nnz_obs_names
non-zero-weighted observation names
- Returns:
list of non-zero-weighted observation names
- Return type:
[‘str`]
Note
if LinearAnalysis.pst is None, returns LinearAnalysis.jco.row_names
- property adj_par_names
adjustable parameter names
- Returns:
list of adjustable parameter names
- Return type:
[‘str`]
Note
if LinearAnalysis.pst is None, returns LinearAnalysis.jco.col_names
- property jco
the jacobian matrix attribute
- Returns:
the jacobian matrix attribute
- Return type:
pyemu.Jco
- property predictions
the prediction (aka forecast) sentivity vectors attribute
- Returns:
a matrix of prediction sensitivity vectors (column wise)
- Return type:
pyemu.Matrix
- property predictions_iter
prediction sensitivity vectors iterator
- Returns:
iterator on prediction sensitivity vectors (matrix)
- Return type:
iterator
Note
this is used for processing huge numbers of predictions
- property forecasts_iter
forecast (e.g. prediction) sensitivity vectors iterator
- Returns:
iterator on forecasts (e.g. predictions) sensitivity vectors (matrix)
- Return type:
iterator
Note
This is used for processing huge numbers of predictions
This is a synonym for LinearAnalysis.predictions_iter()
- property forecasts
the forecast sentivity vectors attribute
- Returns:
a matrix of forecast (prediction) sensitivity vectors (column wise)
- Return type:
pyemu.Matrix
- property pst
the pst attribute
- Returns:
the pst attribute
- Return type:
pyemu.Pst
- property fehalf
Karhunen-Loeve scaling matrix attribute.
- Returns:
the Karhunen-Loeve scaling matrix based on the prior parameter covariance matrix
- Return type:
pyemu.Matrix
- property qhalf
square root of the cofactor matrix attribute. Create the attribute if it has not yet been created
- Returns:
square root of the cofactor matrix
- Return type:
pyemu.Matrix
- property qhalfx
half normal matrix attribute.
- Returns:
half normal matrix attribute
- Return type:
pyemu.Matrix
- property xtqx
normal matrix attribute.
- Returns:
normal matrix attribute
- Return type:
pyemu.Matrix
- property mle_covariance
maximum likelihood parameter covariance matrix.
- Returns:
maximum likelihood parameter covariance matrix
- Return type:
pyemu.Matrix
- property prior_parameter
prior parameter covariance matrix
- Returns:
prior parameter covariance matrix
- Return type:
pyemu.Cov
- property prior_forecast
prior forecast (e.g. prediction) variances
- Returns:
a dictionary of forecast name, prior variance pairs
- Return type:
dict
- property mle_parameter_estimate
maximum likelihood parameter estimate.
- Returns:
the maximum likelihood parameter estimates
- Return type:
pandas.Series
- property prior_prediction
prior prediction (e.g. forecast) variances
- Returns:
a dictionary of prediction name, prior variance pairs
- Return type:
dict
- __fromfile(filename, astype=None)
a private method to deduce and load a filename into a matrix object. Uses extension: ‘jco’ or ‘jcb’: binary, ‘mat’,’vec’ or ‘cov’: ASCII, ‘unc’: pest uncertainty file.
- __load_pst()
private method set the pst attribute
- __load_jco()
private method to set the jco attribute from a file or a matrix object
- __load_parcov()
private method to set the parcov attribute from: a pest control file (parameter bounds) a pst object a matrix object an uncert file an ascii matrix file
- __load_obscov()
private method to set the obscov attribute from: a pest control file (observation weights) a pst object a matrix object an uncert file an ascii matrix file
- __load_predictions()
- private method set the predictions attribute from:
mixed list of row names, matrix files and ndarrays a single row name an ascii file
can be none if only interested in parameters.
- apply_karhunen_loeve_scaling()
apply karhuene-loeve scaling to the jacobian matrix.
Note
This scaling is not necessary for analyses using Schur’s complement, but can be very important for error variance analyses. This operation effectively transfers prior knowledge specified in the parcov to the jacobian and reset parcov to the identity matrix.
- clean()
drop regularization and prior information observation from the jco
- reset_pst(arg)
reset the LinearAnalysis.pst attribute
- Parameters:
arg (str or pyemu.Pst) – the value to assign to the pst attribute
- reset_parcov(arg=None)
reset the parcov attribute to None
- Parameters:
arg (str or pyemu.Matrix) – the value to assign to the parcov attribute. If None, the private __parcov attribute is cleared but not reset
- reset_obscov(arg=None)
reset the obscov attribute to None
- Parameters:
arg (str or pyemu.Matrix) – the value to assign to the obscov attribute. If None, the private __obscov attribute is cleared but not reset
- drop_prior_information()
drop the prior information from the jco and pst attributes
- get(par_names=None, obs_names=None, astype=None)
method to get a new LinearAnalysis class using a subset of parameters and/or observations
- Parameters:
par_names ([‘str]) – par names for new object
obs_names ([‘str]) – obs names for new object
astype (pyemu.Schur or pyemu.ErrVar) – type to cast the new object. If None, return type is same as self
- Returns:
new instance
- Return type:
LinearAnalysis
- adjust_obscov_resfile(resfile=None)
reset the elements of obscov by scaling the implied weights based on the phi components in res_file so that the total phi is equal to the number of non-zero weights.
- Parameters:
resfile (str) – residual file to use. If None, residual file with case name is sought. default is None
Note
calls pyemu.Pst.adjust_weights_resfile()
- get_par_css_dataframe()
get a dataframe of composite scaled sensitivities. Includes both PEST-style and Hill-style.
- Returns:
a dataframe of parameter names, PEST-style and Hill-style composite scaled sensitivity
- Return type:
pandas.DataFrame
- get_cso_dataframe()
get a dataframe of composite observation sensitivity, as returned by PEST in the seo file.
- Returns:
dataframe of observation names and composite observation sensitivity
- Return type:
pandas.DataFrame
Note
That this formulation deviates slightly from the PEST documentation in that the values are divided by (npar-1) rather than by (npar).
The equation is cso_j = ((Q^1/2*J*J^T*Q^1/2)^1/2)_jj/(NPAR-1)
- get_obs_competition_dataframe()
get the observation competition stat a la PEST utility
- Returns:
a dataframe of observation names by observation names with values equal to the PEST competition statistic
- Return type:
pandas.DataFrame
- class pyemu.Cov(x=None, names=[], row_names=[], col_names=[], isdiagonal=False, autoalign=True)
Bases:
Matrix
Diagonal and/or dense Covariance matrices
- Parameters:
x (numpy.ndarray) – numeric values
names ([str]) – list of row and column names
isdigonal (bool) – flag if the Matrix is diagonal
autoalign (bool) – flag to control the autoalignment of Matrix during linear algebra operations
Example:
data = np.random.random((10,10)) names = ["par_{0}".format(i) for i in range(10)] mat = pyemu.Cov(x=data,names=names) mat.to_binary("mat.jco")
Note
row_names and col_names args are supported in the contructor so support inheritance. However, users should only pass names
- property identity
get an identity Cov of the same shape
- Returns:
new Cov instance with identity matrix
- Return type:
Cov
Note
the returned identity matrix has the same row-col names as self
- property zero
get a diagonal instance of Cov with all zeros on the diagonal
- Returns:
new Cov instance with zeros
- Return type:
Cov
- property names
wrapper for getting row_names. row_names == col_names for Cov
- Returns:
list of names
- Return type:
[str]
- condition_on(conditioning_elements)
get a new Covariance object that is conditional on knowing some elements. uses Schur’s complement for conditional Covariance propagation
- Parameters:
conditioning_elements (['str']) – list of names of elements to condition on
- Returns:
new conditional Cov that assumes conditioning_elements have become known
- Return type:
Cov
Example:
prior_cov = pyemu.Cov.from_parameter_data(pst) now_known_pars = pst.adj_par_names[:5] post_cov = prior_cov.condition_on(now_known_pars)
- replace(other)
replace elements in the covariance matrix with elements from other. if other is not diagonal, then this Cov becomes non diagonal
- Parameters:
Cov – the Cov to replace elements in this Cov with
Note
operates in place. Other must have the same row-col names as self
- to_uncfile(unc_file, covmat_file='cov.mat', var_mult=1.0, include_path=False)
write a PEST-compatible uncertainty file
- Parameters:
unc_file (str) – filename of the uncertainty file
covmat_file (str) – covariance matrix filename. Default is “Cov.mat”. If None, and Cov.isdiaonal, then a standard deviation form of the uncertainty file is written. Exception raised if covmat_file is None and not Cov.isdiagonal
var_mult (float) – variance multiplier for the covmat_file entry
include_path (bool) – flag to include the path of unc_file in the name of covmat_file. Default is False - not sure why you would ever make this True…
Example:
cov = pyemu.Cov.from_parameter_data(pst) cov.to_uncfile("my.unc")
- classmethod from_obsweights(pst_file)
instantiates a Cov instance from observation weights in a PEST control file.
- Parameters:
pst_file (str) – pest control file name
- Returns:
a diagonal observation noise covariance matrix derived from the weights in the pest control file. Zero-weighted observations are included with a weight of 1.0e-30
- Return type:
Cov
Note
Calls Cov.from_observation_data()
Example:
obscov = pyemu.Cov.from_obsweights("my.pst")
- classmethod from_observation_data(pst)
instantiates a Cov from pyemu.Pst.observation_data
- Parameters:
pst (pyemu.Pst) – control file instance
- Returns:
a diagonal observation noise covariance matrix derived from the weights in the pest control file. Zero-weighted observations are included with a weight of 1.0e-30
- Return type:
Cov
Example:
obscov = pyemu.Cov.from_observation_data(pst)
- classmethod from_parbounds(pst_file, sigma_range=4.0, scale_offset=True)
Instantiates a Cov from a pest control file parameter data section using parameter bounds as a proxy for uncertainty.
- Parameters:
pst_file (str) – pest control file name
sigma_range (float) – defines range of upper bound - lower bound in terms of standard deviation (sigma). For example, if sigma_range = 4, the bounds represent 4 * sigma. Default is 4.0, representing approximately 95% confidence of implied normal distribution
scale_offset (bool) – flag to apply scale and offset to parameter upper and lower bounds before calculating varaince. In some cases, not applying scale and offset can result in undefined (log) variance. Default is True.
- Returns:
diagonal parameter Cov matrix created from parameter bounds
- Return type:
Cov
Note
Calls Cov.from_parameter_data()
- classmethod from_parameter_data(pst, sigma_range=4.0, scale_offset=True, subset=None)
Instantiates a Cov from a pest control file parameter data section using parameter bounds as a proxy for uncertainty.
- Parameters:
pst_file (str) – pest control file name
sigma_range (float) – defines range of upper bound - lower bound in terms of standard deviation (sigma). For example, if sigma_range = 4, the bounds represent 4 * sigma. Default is 4.0, representing approximately 95% confidence of implied normal distribution
scale_offset (bool) – flag to apply scale and offset to parameter upper and lower bounds before calculating varaince. In some cases, not applying scale and offset can result in undefined (log) variance. Default is True.
subset (list-like, optional) – Subset of parameters to draw
- Returns:
diagonal parameter Cov matrix created from parameter bounds
- Return type:
Cov
Note
Calls Cov.from_parameter_data()
- classmethod from_uncfile(filename, pst=None)
instaniates a Cov from a PEST-compatible uncertainty file
- Parameters:
filename (str) – uncertainty file name
pst ('pyemu.Pst`) – a control file instance. this is needed if “first_parameter” and “last_parameter” keywords. Default is None
- Returns:
Cov instance from uncertainty file
- Return type:
Cov
Example:
cov = pyemu.Cov.from_uncfile("my.unc")
- static _get_uncfile_dimensions(filename)
quickly read an uncertainty file to find the dimensions
- classmethod identity_like(other)
Get an identity matrix Cov instance like other Cov
- Parameters:
other (Matrix) – other matrix - must be square
- Returns:
new identity matrix Cov with shape of other
- Return type:
Cov
Note
the returned identity cov matrix is treated as non-diagonal
- to_pearson()
Convert Cov instance to Pearson correlation coefficient matrix
- Returns:
A Matrix of correlation coefs. Return type is Matrix on purpose so that it is clear the returned instance is not a Cov
- Return type:
Matrix
Example:
# plot the posterior parameter correlation matrix import matplotlib.pyplot as plt cov = pyemu.Cov.from_ascii("pest.post.cov") cc = cov.to_pearson() cc.x[cc.x==1.0] = np.NaN plt.imshow(cc)
- class pyemu.Jco(x=None, row_names=[], col_names=[], isdiagonal=False, autoalign=True)
Bases:
Matrix
a thin wrapper class to get more intuitive attribute names. Functions exactly like Matrix
- property par_names
thin wrapper around Matrix.col_names
- Returns:
a list of parameter names
- Return type:
[str]
- property obs_names
thin wrapper around Matrix.row_names
- Returns:
a list of observation names
- Return type:
[‘str’]
- property npar
number of parameters in the Jco
- Returns:
number of parameters (columns)
- Return type:
int
- property nobs
number of observations in the Jco
- Returns:
number of observations (rows)
- Return type:
int
- __init(**kwargs)
Jco constuctor takes the same arguments as Matrix.
- Parameters:
**kwargs (dict) – constructor arguments for Matrix
Example
jco = pyemu.Jco.from_binary(“my.jco”)
- classmethod from_pst(pst, random=False)
construct a new empty Jco from a control file optionally filled with trash
- Parameters:
pst (pyemu.Pst) – a pest control file instance. If type is ‘str’, pst is loaded from filename
random (bool) – flag for contents of the trash matrix. If True, fill with random numbers, if False, fill with zeros Default is False
- Returns:
the new Jco instance
- Return type:
Jco
- class pyemu.Matrix(x=None, row_names=[], col_names=[], isdiagonal=False, autoalign=True)
Bases:
object
Easy linear algebra in the PEST(++) realm
- Parameters:
x (numpy.ndarray) – numeric values
row_names ([str]) – list of row names
col_names (['str']) – list of column names
isdigonal (bool) – flag if the Matrix is diagonal
autoalign (bool) – flag to control the autoalignment of Matrix during linear algebra operations
Example:
data = np.random.random((10,10)) row_names = ["row_{0}".format(i) for i in range(10)] col_names = ["col_{0}".format(j) for j in range(10)] mat = pyemu.Matrix(x=data,row_names=row_names,col_names=col_names) mat.to_binary("mat.jco") # load an existing jacobian matrix jco = pyemu.Jco.from_binary("pest.jco") # form an observation noise covariance matrix from weights obscov = pyemu.Cov.from_observation_data(pst) # form the normal matrix, aligning rows and cols on-the-fly xtqx = jco * obscov.inv * jco.T
Note
this class makes heavy use of property decorators to encapsulate private attributes
- property newx
return a copy of Matrix.x attribute
- Returns:
a copy Matrix.x
- Return type:
numpy.ndarray
- property x
return a reference to Matrix.x
- Returns:
reference to Matrix.x
- Return type:
numpy.ndarray
- property as_2d
get a 2D numeric representation of Matrix.x. If not isdiagonal, simply return reference to Matrix.x, otherwise, constructs and returns a 2D, diagonal ndarray
- Returns:
numpy.ndarray
- Return type:
numpy.ndarray
Exmaple:
# A diagonal cov cov = pyemu.Cov.from_parameter_data x2d = cov.as_2d # a numpy 2d array print(cov.shape,cov.x.shape,x2d.shape)
- property shape
get the implied, 2D shape of Matrix
- Returns:
length of 2 tuple
- Return type:
int
Exmaple:
jco = pyemu.Jco.from_binary("pest.jcb") shape = jco.shape nrow,ncol = shape #unpack to ints
- property ncol
length of second dimension
- Returns:
number of columns
- Return type:
int
- property nrow
length of first dimension
- Returns:
number of rows
- Return type:
int
- property T
wrapper function for Matrix.transpose() method
- Returns:
transpose of Matrix
- Return type:
Matrix
Note
returns a copy of self
A syntatic-sugar overload of Matrix.transpose()
Example:
jcb = pyemu.Jco.from_binary("pest.jcb") jcbt = jcb.T
- property transpose
transpose operation of self
- Returns:
transpose of Matrix
- Return type:
Matrix
Example:
jcb = pyemu.Jco.from_binary("pest.jcb") jcbt = jcb.T
- property inv
inversion operation of Matrix
- Returns:
inverse of Matrix
- Return type:
Matrix
Note
uses numpy.linalg.inv for the inversion
Example:
mat = pyemu.Matrix.from_binary("my.jco") mat_inv = mat.inv mat_inv.to_binary("my_inv.jco")
- property sqrt
element-wise square root operation
- Returns:
element-wise square root of Matrix
- Return type:
Matrix
Note
uses numpy.sqrt
Example:
cov = pyemu.Cov.from_uncfile("my.unc") sqcov = cov.sqrt sqcov.to_uncfile("sqrt_cov.unc")
- property full_s
Get the full singular value matrix
- Returns:
full singular value matrix. Shape is (max(Matrix.shape),max(Matrix.shape)) with zeros along the diagonal from min(Matrix.shape) to max(Matrix.shape)
- Return type:
Matrix
Example:
jco = pyemu.Jco.from_binary("pest.jcb") jco.full_s.to_ascii("full_sing_val_matrix.mat")
- property s
the singular value (diagonal) Matrix
- Returns:
singular value matrix. shape is (min(Matrix.shape),min(Matrix.shape))
- Return type:
Matrix
Example:
# plot the singular spectrum of the jacobian import matplotlib.pyplot as plt jco = pyemu.Jco.from_binary("pest.jcb") plt.plot(jco.s.x) plt.show()
- property u
the left singular vector Matrix
- Returns:
left singular vectors. Shape is (Matrix.shape[0], Matrix.shape[0])
- Return type:
Matrix
Example:
jco = pyemu.Jco.from_binary("pest.jcb") jco.u.to_binary("u.jcb")
- property v
the right singular vector Matrix
- Returns:
right singular vectors. Shape is (Matrix.shape[1], Matrix.shape[1])
- Return type:
Matrix
Example:
jco = pyemu.Jco.from_binary("pest.jcb") jco.v.to_binary("v.jcb")
- property zero2d
get an 2D instance of self with all zeros
- Returns:
Matrix of zeros
- Return type:
Matrix
Example:
jco = pyemu.Jco.from_binary("pest.jcb") zero_jco = jco.zero2d
- integer
- double
- char
- binary_header_dt
- binary_rec_dt
- coo_rec_dt
- par_length = 12
- obs_length = 20
- new_par_length = 200
- new_obs_length = 200
- reset_x(x, copy=True)
reset self.__x private attribute
- Parameters:
x (numpy.ndarray) – the new numeric data
copy (bool) – flag to make a copy of ‘x’. Default is True
- Returns:
None
Note
operates in place
- __str__()
overload of object.__str__()
- Returns:
string representation
- Return type:
str
- __getitem__(item)
a very crude overload of object.__getitem__().
- Parameters:
item (object) – something that can be used as an index
- Returns:
an object that is a sub-matrix of Matrix
- Return type:
Matrix
- __pow__(power)
overload of numpy.ndarray.__pow__() operator
- Parameters:
power (float) – interpreted as follows: -1 = inverse of self, -0.5 = sqrt of inverse of self, 0.5 = sqrt of self. All other positive ints = elementwise self raised to power
- Returns:
a new Matrix object raised to the power power
- Return type:
Matrix
Example:
cov = pyemu.Cov.from_uncfile("my.unc") sqcov = cov**2
- __sub__(other)
- numpy.ndarray.__sub__() overload. Tries to speedup by
checking for scalars of diagonal matrices on either side of operator
- Parameters:
other – (int,`float`,`numpy.ndarray`,`Matrix`): the thing to subtract
- Returns:
the result of subtraction
- Return type:
Matrix
Note
if Matrix and other (if applicable) have autoalign set to True, both Matrix and other are aligned based on row and column names. If names are not common between the two, this may result in a smaller returned Matrix. If no names are shared, an exception is raised
Example:
jco1 = pyemu.Jco.from_binary("pest.1.jcb") jco2 = pyemu.Jco.from_binary("pest.2.jcb") diff = jco1 - jco2
- __add__(other)
- Overload of numpy.ndarray.__add__() - elementwise addition. Tries to speedup by checking for
scalars of diagonal matrices on either side of operator
- Parameters:
other – (int,`float`,`numpy.ndarray`,`Matrix`): the thing to add
- Returns:
the result of addition
- Return type:
Matrix
Note
if Matrix and other (if applicable) have autoalign set to True, both Matrix and other are aligned based on row and column names. If names are not common between the two, this may result in a smaller returned Matrix. If no names are shared, an exception is raised. The addition of a scalar does not expand non-zero elements.
Example:
m1 = pyemu.Cov.from_parameter_data(pst) m1_plus_on1 = m1 + 1.0 #add 1.0 to all non-zero elements m2 = m1.copy() m2_plus_m1 = m1 + m2
- hadamard_product(other)
Overload of numpy.ndarray.__mult__(): element-wise multiplication. Tries to speedup by checking for scalars of diagonal matrices on either side of operator
- Parameters:
other – (int,`float`,`numpy.ndarray`,`Matrix`): the thing to multiply
- Returns:
the result of multiplication
- Return type:
Matrix
Note
if Matrix and other (if applicable) have autoalign set to True, both Matrix and other are aligned based on row and column names. If names are not common between the two, this may result in a smaller returned Matrix. If not common elements are shared, an excetion is raised
Example:
cov = pyemu.Cov.from_parameter_data(pst) cov2 = cov * 10 cov3 = cov * cov2
- __mul__(other)
Dot product multiplication overload. Tries to speedup by checking for scalars or diagonal matrices on either side of operator
- Parameters:
other – (int,`float`,`numpy.ndarray`,`Matrix`): the thing to dot product
- Returns:
the result of dot product
- Return type:
Matrix
Note
if Matrix and other (if applicable) have autoalign set to True, both Matrix and other are aligned based on row and column names. If names are not common between the two, this may result in a smaller returned Matrix. If not common elements are found, an exception is raised
Example:
jco = pyemu.Jco.from_binary("pest.jcb") cov = pyemu.Cov.from_parmaeter_data(pst) # get the forecast prior covariance matrix forecast_cov = jco.get(pst.forecast_names).T * cov * jco.get(pst.forecast_names)
- __rmul__(other)
Reverse order Dot product multiplication overload.
- Parameters:
other – (int,`float`,`numpy.ndarray`,`Matrix`): the thing to dot product
- Returns:
the result of dot product
- Return type:
Matrix
Note
if Matrix and other (if applicable) have autoalign set to True, both Matrix and other are aligned based on row and column names. If names are not common between the two, this may result in a smaller returned Matrix. If not common elements are found, an exception is raised
Example:
# multiply by a scalar jco = pyemu.Jco.from_binary("pest.jcb") jco_times_10 = 10 * jco
- __set_svd()
private method to set SVD components.
Note: this should not be called directly
- mult_isaligned(other)
check if matrices are aligned for dot product multiplication
- Parameters:
other (Matrix) – the other matrix to check for alignment with
- Returns:
True if aligned, False if not aligned
- Return type:
bool
- element_isaligned(other)
check if matrices are aligned for element-wise operations
- Parameters:
other (Matrix) – the other matrix to check for alignment with
- Returns:
True if aligned, False if not aligned
- Return type:
bool
- to_2d()
- get a 2D Matrix representation of Matrix. If not Matrix.isdiagonal, simply
return a copy of Matrix, otherwise, constructs and returns a new Matrix instance that is stored as diagonal
- Returns:
non-diagonal form of Matrix
- Return type:
Martrix
Exmaple:
# A diagonal cov cov = pyemu.Cov.from_parameter_data cov2d = cov.as_2d # a numpy 2d array print(cov.shape,cov.x.shape,cov2d.shape,cov2d.x.shape)
- static get_maxsing_from_s(s, eigthresh=1e-05)
static method to work out the maxsing for a given singular spectrum
- Parameters:
s (numpy.ndarray) – 1-D array of singular values. This array should come from calling either numpy.linalg.svd or from the pyemu.Matrix.s.x attribute
eigthresh (float) – the ratio of smallest to largest singular value to retain. Since it is assumed that s is sorted from largest to smallest, once a singular value is reached that yields a ratio with the first (largest) singular value, the index of this singular is returned.
- Returns:
the index of the singular value whos ratio with the first singular value is less than or equal to eigthresh
- Return type:
int
Example:
jco = pyemu.Jco.from_binary("pest.jco") max_sing = pyemu.Matrix.get_maxsing_from_s(jco.s,eigthresh=pst.svd_data.eigthresh)
- get_maxsing(eigthresh=1e-05)
Get the number of singular components with a singular value ratio greater than or equal to eigthresh
- Args:
- eigthresh (float): the ratio of smallest to largest
singular value to retain. Since it is assumed that s is sorted from largest to smallest, once a singular value is reached that yields a ratio with the first (largest) singular value, the index of this singular is returned.
- Returns:
the index of the singular value whos ratio with the first singular value is less than or equal to eigthresh
- Return type:
int
Note
this method calls the static method Matrix.get_maxsing_from_s() with Matrix.s.x
Example:
jco = pyemu.Jco.from_binary("pest.jco") max_sing = jco.get_maxsing(eigthresh=pst.svd_data.eigthresh)
- pseudo_inv_components(maxsing=None, eigthresh=1e-05, truncate=True)
Get the (optionally) truncated SVD components
- Parameters:
maxsing (int, optional) – the number of singular components to use. If None, maxsing is calculated using Matrix.get_maxsing() and eigthresh
eigthresh – (float, optional): the ratio of largest to smallest singular components to use for truncation. Ignored if maxsing is not None. Default is 1.0e-5
truncate (bool) – flag to truncate components. If False, U, s, and V will be zeroed out at locations greater than maxsing instead of truncated. Default is True
- Returns:
tuple containing
Matrix: (optionally truncated) left singular vectors
Matrix: (optionally truncated) singular value matrix
Matrix: (optionally truncated) right singular vectors
Example:
mat = pyemu.Matrix.from_binary("my.jco") u1,s1,v1 = mat.pseudo_inv_components(maxsing=10) resolution_matrix = v1 * v1.T resolution_matrix.to_ascii("resol.mat")
- pseudo_inv(maxsing=None, eigthresh=1e-05)
The pseudo inverse of self. Formed using truncated singular value decomposition and Matrix.pseudo_inv_components
- Parameters:
maxsing (int, optional) – the number of singular components to use. If None, maxsing is calculated using Matrix.get_maxsing() and eigthresh
eigthresh – (float, optional): the ratio of largest to smallest singular components to use for truncation. Ignored if maxsing is not None. Default is 1.0e-5
- Returns:
the truncated-SVD pseudo inverse of Matrix (V_1 * s_1^-1 * U^T)
- Return type:
Matrix
Example:
jco = pyemu.Jco.from_binary("pest.jcb") jco_psinv = jco.pseudo_inv(pst.svd_data.maxsing,pst.svd_data.eigthresh) jco_psinv.to_binary("pseudo_inv.jcb")
- static find_rowcol_indices(names, row_names, col_names, axis=None)
fast(er) look of row and colum names indices
- Parameters:
names ([str]) – list of names to look for in row_names and/or col_names names
row_names ([str]) – list of row names
col_names ([str]) – list of column names
axis (int, optional) – axis to search along. If None, search both. Default is None
- Returns:
array of (integer) index locations. If axis is None, a 2 numpy.ndarrays of both row and column name indices is returned
- Return type:
numpy.ndarray
- indices(names, axis=None)
- get the row and col indices of names. If axis is None, two ndarrays
are returned, corresponding the indices of names for each axis
- Parameters:
names ([str]) – list of names to look for in row_names and/or col_names names
row_names ([str]) – list of row names
col_names ([str]) – list of column names
axis (int, optional) – axis to search along. If None, search both. Default is None
- Returns:
array of (integer) index locations. If axis is None, a 2 numpy.ndarrays of both row and column name indices is returned
- Return type:
numpy.ndarray
Note
thin wrapper around Matrix.find_rowcol_indices static method
- align(names, axis=None)
reorder Matrix by names in place. If axis is None, reorder both indices
- Parameters:
names (['str']) – names in Matrix.row_names and or Matrix.col_names
axis (int, optional) – the axis to reorder. if None, reorder both axes
Note
Works in place. Is called programmatically during linear algebra operations
Example:
# load a jco that has more obs (rows) than non-zero weighted obs # in the control file jco = pyemu.Jco.from_binary("pest.jcb") # get an obs noise cov matrix obscov = pyemu.Cov.from_observation_data(pst) jco.align(obscov.row_names,axis=0)
- get(row_names=None, col_names=None, drop=False)
get a new Matrix instance ordered on row_names or col_names
- Parameters:
row_names (['str'], optional) – row_names for new Matrix. If None, all row_names are used.
col_names (['str'], optional) – col_names for new Matrix. If None, all col_names are used.
drop (bool) – flag to remove row_names and/or col_names from this Matrix
- Returns:
a new Matrix
- Return type:
Matrix
- Example::
# load a jco that has more obs (rows) than non-zero weighted obs # in the control file jco = pyemu.Jco.from_binary(“pest.jcb”) # get an obs noise cov matrix obscov = pyemu.Cov.from_observation_data(pst) nnz_jco = jco.get(row_names = obscov.row_names)
- copy()
get a copy of Matrix
- Returns:
copy of this Matrix
- Return type:
Matrix
- drop(names, axis)
drop elements from Matrix in place
- Parameters:
names (['str']) – list of names to drop
axis (int) – the axis to drop from. must be in [0,1]
- extract(row_names=None, col_names=None)
wrapper method that Matrix.gets() then Matrix.drops() elements. one of row_names or col_names must be not None.
- Parameters:
row_names (['str'], optional) – row_names to extract. If None, all row_names are retained.
col_names (['str'], optional) – col_names to extract. If None, all col_names are retained.
- Returns:
the extract sub-matrix defined by row_names and/or col_names
- Return type:
Matrix
Example:
cov = pyemu.Cov.from_parameter_data(pst) hk_cov = cov.extract(row_names=["hk1","hk2","hk3"])
- get_diagonal_vector(col_name='diag')
Get a new Matrix instance that is the diagonal of self. The shape of the new matrix is (self.shape[0],1). Self must be square
- Parameters:
col_name (str) – the name of the single column in the new Matrix
- Returns:
vector-shaped Matrix instance of the diagonal of this Matrix
- Return type:
Matrix
Example:
cov = pyemu.Cov.from_unc_file("my.unc") cov_diag = cov.get_diagonal_vector() print(cov_diag.col_names)
- to_coo(filename, droptol=None, chunk=None)
write an extended PEST-format binary file. The data format is [int,int,float] for i,j,value. It is autodetected during the read with Matrix.from_binary().
- Parameters:
filename (str) – filename to save binary file
droptol (float) – absolute value tolerance to make values smaller droptol than zero. Default is None (no dropping)
chunk (int) – number of elements to write in a single pass. Default is None, which writes the entire numeric part of the Matrix at once. This is faster but requires more memory.
Note
This method is needed when the number of dimensions times 2 is larger than the max value for a 32-bit integer. happens! This method is used by pyemu.Ensemble.to_binary()
- to_dense(filename, close=True)
experimental new dense matrix storage format to support faster I/O with ensembles
- Parameters:
filename (str) – the filename to save to
close (bool) – flag to close the filehandle after saving
- Returns:
the file handle. Only returned if close is False
- Return type:
f (file)
Note
calls Matrix.write_dense()
- static write_dense(filename, row_names, col_names, data, close=False)
experimental new dense matrix storage format to support faster I/O with ensembles
- Parameters:
filename (str or file) – the file to write to
row_names ([str]) – row names of the matrix
col_names ([str]) – col names of the matrix
data (np.ndarray) – matrix elements
close (bool) – flag to close the file after writing
- Returns:
the file handle. Only returned if close is False
- Return type:
f (file)
- to_binary(filename, droptol=None, chunk=None)
write a PEST-compatible binary file. The format is the same as the format used to storage a PEST Jacobian matrix
- Parameters:
filename (str) – filename to save binary file
droptol (float) – absolute value tolerance to make values smaller droptol than zero. Default is None (no dropping)
chunk (int) – number of elements to write in a single pass. Default is None, which writes the entire numeric part of the Matrix at once. This is faster but requires more memory.
- static read_dense(filename, forgive=False, close=True)
read a dense-format binary file.
- Parameters:
filename (str) – the filename or the open filehandle
forgive (bool) – flag to forgive incomplete records. If True and an incomplete record is encountered, only the previously read records are returned. If False, an exception is raised for an incomplete record
close (bool) – flag to close the filehandle. Default is True
- Returns:
tuple containing
numpy.ndarray: the numeric values in the file
[‘str’]: list of row names
[`str`]: list of col_names
- classmethod from_binary(filename, forgive=False)
class method load from PEST-compatible binary file into a Matrix instance
- Parameters:
filename (str) – filename to read
forgive (bool) – flag to forgive incomplete data records. Only applicable to dense binary format. Default is False
- Returns:
Matrix loaded from binary file
- Return type:
Matrix
Example:
mat = pyemu.Matrix.from_binary("my.jco") cov = pyemu.Cov.from_binary("large_cov.jcb")
- static read_binary_header(filename)
read the first elements of a PEST(++)-style binary file to get format and dimensioning information.
- Parameters:
filename (str) – the filename of the binary file
- Returns:
tuple containing
int: the itemp1 value
int: the itemp2 value
int: the icount value
- static read_binary(filename, forgive=False)
static method to read PEST-format binary files
- Parameters:
filename (str) – filename to read
forgive (bool) – flag to forgive incomplete data records. Only applicable to dense binary format. Default is False
- Returns:
tuple containing
numpy.ndarray: the numeric values in the file
[‘str’]: list of row names
[`str`]: list of col_names
- static from_fortranfile(filename)
- a binary load method to accommodate one of the many
bizarre fortran binary writing formats
- Parameters:
filename (str) – name of the binary matrix file
- Returns:
tuple containing
numpy.ndarray: the numeric values in the file
[‘str’]: list of row names
[`str`]: list of col_names
- to_ascii(filename, icode=2)
write a PEST-compatible ASCII Matrix/vector file
- Parameters:
filename (str) – filename to write to
icode (int, optional) – PEST-style info code for matrix style. Default is 2.
Note
if icode == -1, a 1-d vector is written that represents a diagonal matrix. An exception is raised if icode == -1 and isdiagonal is False
- classmethod from_ascii(filename)
load a PEST-compatible ASCII matrix/vector file into a Matrix instance
- Parameters:
filename (str) – name of the file to read
- Returns:
Matrix loaded from ASCII file
- Return type:
Matrix
Example:
mat = pyemu.Matrix.from_ascii("my.mat") cov = pyemi.Cov.from_ascii("my.cov")
- static read_ascii(filename)
read a PEST-compatible ASCII matrix/vector file
- Parameters:
filename (str) – file to read from
- Returns:
tuple containing
numpy.ndarray: numeric values
[‘str’]: list of row names
[`str`]: list of column names
bool: diagonal flag
- df()
wrapper of Matrix.to_dataframe()
- classmethod from_dataframe(df)
- class method to create a new Matrix instance from a
pandas.DataFrame
- Parameters:
df (pandas.DataFrame) – dataframe
- Returns:
Matrix instance derived from df.
- Return type:
Matrix
Example:
df = pd.read_csv("my.csv") mat = pyemu.Matrix.from_dataframe(df)
- classmethod from_names(row_names, col_names, isdiagonal=False, autoalign=True, random=False)
class method to create a new Matrix instance from row names and column names, filled with trash
- Parameters:
row_names (['str']) – row names for the new Matrix
col_names (['str']) – col_names for the new matrix
isdiagonal (bool, optional) – flag for diagonal matrix. Default is False
autoalign (bool, optional) – flag for autoaligning new matrix during linear algebra calcs. Default is True
random (bool) – flag for contents of the trash matrix. If True, fill with random numbers, if False, fill with zeros Default is False
- Returns:
the new Matrix instance
- Return type:
Matrix
Example:
row_names = ["row_1","row_2"] col_names = ["col_1,"col_2"] m = pyemu.Matrix.from_names(row_names,col_names)
- to_dataframe()
return a pandas.DataFrame representation of Matrix
- Returns:
a dataframe derived from Matrix
- Return type:
pandas.DataFrame
Note
if self.isdiagonal is True, the full matrix is used to fill the dataframe - lots of zeros.
- extend(other)
extend Matrix with the elements of other, returning a new matrix.
Args: other (Matrix): the Matrix to extend self by
- Returns:
new, extended Matrix
- Return type:
Matrix
Note
No row or column names can be shared between self and other
Example:
jco1 = pyemu.Jco.from_binary("pest_history.jco") jco2 = pyemu.Jco.from_binary("pest_forecast.jco") jco_ext = jco1.extend(jco2)
- class pyemu.Pst(filename, load=True, resfile=None)
Bases:
object
All things PEST(++) control file
- Parameters:
filename (str) – the name of the control file
load (bool, optional) – flag to load the control file. Default is True
resfile (str, optional) – corresponding residual file. If None, a residual file with the control file base name is sought. Default is None
Note
This class is the primary mechanism for dealing with PEST control files. Support is provided for constructing new control files as well as manipulating existing control files.
Example:
pst = pyemu.Pst("my.pst") pst.control_data.noptmax = -1 pst.write("my_new.pst")
- property phi
get the weighted total objective function.
- Returns:
sum of squared residuals
- Return type:
float
Note
Requires Pst.res (the residuals file) to be available
- property phi_components
get the individual components of the total objective function
- Returns:
dictionary of observation group, contribution to total phi
- Return type:
dict
Note
Requires Pst.res (the residuals file) to be available
- property phi_components_normalized
- get the individual components of the total objective function
normalized to the total PHI being 1.0
- Returns:
dictionary of observation group, normalized contribution to total phi
- Return type:
dict
Note
Requires Pst.res (the residuals file) to be available
- property res
get the residuals dataframe attribute
- Returns:
a dataframe containing the residuals information.
- Return type:
pandas.DataFrame
Note
if the Pst.__res attribute has not been loaded, this call loads the res dataframe from a file
Example:
# print the observed and simulated values for non-zero weighted obs print(pst.res.loc[pst.nnz_obs_names,["modelled","measured"]])
- property nprior
number of prior information equations
- Returns:
the number of prior info equations
- Return type:
int
- property nnz_obs
get the number of non-zero weighted observations
- Returns:
the number of non-zeros weighted observations
- Return type:
int
- property nobs
get the number of observations
- Returns:
the number of observations
- Return type:
int
- property npar_adj
get the number of adjustable parameters (not fixed or tied)
- Returns:
the number of adjustable parameters
- Return type:
int
- property npar
get number of parameters
- Returns:
the number of parameters
- Return type:
int
- property forecast_names
get the forecast names from the pestpp options (if any). Returns None if no forecasts are named
- Returns:
a list of forecast names.
- Return type:
[str]
- property obs_groups
get the observation groups
- Returns:
a list of unique observation groups
- Return type:
[str]
- property nnz_obs_groups
- get the observation groups that contain at least one non-zero weighted
observation
- Returns:
a list of observation groups that contain at least one non-zero weighted observation
- Return type:
[str]
- property adj_par_groups
get the parameter groups with atleast one adjustable parameter
- Returns:
a list of parameter groups with at least one adjustable parameter
- Return type:
[str]
- property par_groups
get the parameter groups
- Returns:
a list of parameter groups
- Return type:
[str]
- property prior_groups
get the prior info groups
- Returns:
a list of prior information groups
- Return type:
[str]
- property prior_names
get the prior information names
- Returns:
a list of prior information names
- Return type:
[str]
- property par_names
get the parameter names
- Returns:
a list of parameter names
- Return type:
[str]
- property adj_par_names
get the adjustable (not fixed or tied) parameter names
- Returns:
list of adjustable (not fixed or tied) parameter names
- Return type:
[str]
- property obs_names
get the observation names
- Returns:
a list of observation names
- Return type:
[str]
- property nnz_obs_names
get the non-zero weight observation names
- Returns:
a list of non-zero weighted observation names
- Return type:
[str]
- property zero_weight_obs_names
get the zero-weighted observation names
- Returns:
a list of zero-weighted observation names
- Return type:
[str]
- property estimation
check if the control_data.pestmode is set to estimation
- Returns:
True if control_data.pestmode is estmation, False otherwise
- Return type:
bool
- property tied
list of tied parameter names
- Returns:
a dataframe of tied parameter information. Columns of tied are parnme and partied. Returns None if no tied parameters are found.
- Return type:
pandas.DataFrame
- property template_files
list of template file names
- Returns:
- a list of template file names, extracted from
Pst.model_input_data.pest_file. Returns None if this attribute is None
- Return type:
[str]
Note
Use Pst.model_input_data to access the template-input file information for writing/modification
- property input_files
list of model input file names
- Returns:
- a list of model input file names, extracted from
Pst.model_input_data.model_file. Returns None if this attribute is None
- Return type:
[str]
Note
Use Pst.model_input_data to access the template-input file information for writing/modification
- property instruction_files
list of instruction file names :returns:
- a list of instruction file names, extracted from
- Pst.model_output_data.pest_file. Returns None if this
attribute is None
- Return type:
[str]
Note
Use Pst.model_output_data to access the instruction-output file information for writing/modification
- property output_files
list of model output file names :returns:
- a list of model output file names, extracted from
Pst.model_output_data.model_file. Returns None if this attribute is None
- Return type:
[str]
Note
Use Pst.model_output_data to access the instruction-output file information for writing/modification
- property less_than_obs_constraints
get the names of the observations that are listed as active (non-zero weight) less than inequality constraints.
- Returns:
names of observations that are non-zero weighted less than constraints (obgnme starts with ‘l_’ or “less”)
- Return type:
pandas.Series
Note
Zero-weighted obs are skipped
- property less_than_pi_constraints
get the names of the prior information eqs that are listed as active (non-zero weight) less than inequality constraints.
- Returns:
names of prior information that are non-zero weighted less than constraints (obgnme starts with “l_” or “less”)
- Return type:
pandas.Series
Note
Zero-weighted pi are skipped
- property greater_than_obs_constraints
get the names of the observations that are listed as active (non-zero weight) greater than inequality constraints.
- Returns:
names obseravtions that are non-zero weighted greater than constraints (obgnme startsiwth “g_” or “greater”)
- Return type:
pandas.Series
Note
Zero-weighted obs are skipped
- property greater_than_pi_constraints
get the names of the prior information eqs that are listed as active (non-zero weight) greater than inequality constraints.
- Returns:
pandas.Series names of prior information that are non-zero weighted greater than constraints (obgnme startsiwth “g_” or “greater”)
Note
Zero-weighted pi are skipped
- parameter_data
‘* parameter data’ information. Columns are standard PEST variable names
Example:
pst.parameter_data.loc[:,"partrans"] = "log" pst.parameter_data.loc[:,"parubnd"] = 10.0
- Type:
pandas.DataFrame
- observation_data
‘* observation data’ information. Columns are standard PEST variable names
Example:
pst.observation_data.loc[:,"weight"] = 1.0 pst.observation_data.loc[:,"obgnme"] = "obs_group"
- Type:
pandas.DataFrame
- prior_information
‘* prior information’ data. Columns are standard PEST variable names
- Type:
pandas.DataFrame
- control_data
‘* control data’ information. Access with standard PEST variable names
Example:
pst.control_data.noptmax = 2 pst.control_data.pestmode = "estimation"
- svd_data
‘* singular value decomposition’ section information. Access with standard PEST variable names
Example:
pst.svd_data.maxsing = 100
- reg_data
‘* regularization’ section information. Access with standard PEST variable names.
Example:
pst.reg_data.phimlim = 1.00 #yeah right!
- __setattr__(key, value)
Implement setattr(self, name, value).
- classmethod from_par_obs_names(par_names=['par1'], obs_names=['obs1'])
construct a shell Pst instance from parameter and observation names
- Parameters:
par_names ([str]) – list of parameter names. Default is [par1]
obs_names ([str]) – list of observation names. Default is [obs1]
Note
While this method works, it does not make template or instruction files. Users are encouraged to use Pst.from_io_files() for more usefulness
Example:
par_names = ["par1","par2"] obs_names = ["obs1","obs2"] pst = pyemu.Pst.from_par_obs_names(par_names,obs_names)
- static get_phi_components(ogroups, res, obs, pi_ogroups=None, pi_df=None)
- set_res(res)
reset the private Pst.res attribute.
- Parameters:
res – (pandas.DataFrame or str): something to use as Pst.res attribute. If res is str, a dataframe is read from file res
- static _read_df(f, nrows, names, converters, defaults=None)
a private method to read part of an open file into a pandas.DataFrame.
- Parameters:
f (file) – open file handle
nrows (int) – number of rows to read
names ([str]) – names to set the columns of the dataframe with
converters (dict) – dictionary of lambda functions to convert strings to numerical format
defaults (dict) – dictionary of default values to assign columns. Default is None
- Returns:
dataframe of control file section info
- Return type:
pandas.DataFrame
- _read_line_comments(f, forgive)
private method to read comment lines from a control file
- _read_section_comments(f, forgive)
private method to read comments from a section of the control file
- static _parse_external_line(line, pst_path='.')
private method to parse a file for external file info
- static _parse_path_agnostic(filename)
private method to parse a file path for any os sep
- static _cast_df_from_lines(section, lines, fieldnames, converters, defaults, alias_map={}, pst_path='.')
private method to cast a pandas dataframe from raw control file lines
- _cast_prior_df_from_lines(section, lines, pst_path='.')
private method to cast prior information lines to a dataframe
- _load_version2(filename)
private method to load a version 2 control file
- load(filename)
entry point load the pest control file.
- Parameters:
filename (str) – pst filename
Note
This method is called from the Pst construtor unless the load arg is False.
- _reset_file_paths_os()
- _try_load_longnames()
- _parse_pestpp_line(line)
- _update_control_section()
private method to synchronize the control section counters with the various parts of the control file. This is usually called during the Pst.write() method.
- rectify_pgroups()
synchronize parameter groups section with the parameter data section
Note
This method is called during Pst.write() to make sure all parameter groups named in * parameter data are included. This is so users don’t have to manually keep this section up. This method can also be called during control file modifications to see what parameter groups are present and prepare for modifying the default values in the * parameter group section
Example:
pst = pyemu.Pst("my.pst") pst.parameter_data.loc["par1","pargp"] = "new_group" pst.rectify_groups() pst.parameter_groups.loc["new_group","derinc"] = 1.0
- _parse_pi_par_names()
private method to get the parameter names from prior information equations. Sets a ‘names’ column in Pst.prior_information that is a list of parameter names
- add_pi_equation(par_names, pilbl=None, rhs=0.0, weight=1.0, obs_group='pi_obgnme', coef_dict={})
a helper to construct a new prior information equation.
- Parameters:
par_names ([str]) – parameter names in the equation
pilbl (str) – name to assign the prior information equation. If None, a generic equation name is formed. Default is None
rhs (float) – the right-hand side of the pi equation
weight (float) – the weight of the equation
obs_group (str) – the observation group for the equation. Default is ‘pi_obgnme’
coef_dict (dict) – a dictionary of parameter name, coefficient pairs to assign leading coefficients for one or more parameters in the equation. If a parameter is not listed, 1.0 is used for its coefficients. Default is {}
Example:
pst = pyemu.Pst("pest.pst") # add a pi equation for the first adjustable parameter pst.add_pi_equation(pst.adj_par_names[0],pilbl="pi1",rhs=1.0) # add a pi equation for 1.5 times the 2nd and 3 times the 3rd adj pars to sum together to 2.0 names = pst.adj_par_names[[1,2]] pst.add_pi_equation(names,coef_dict={names[0]:1.5,names[1]:3})
- rectify_pi()
rectify the prior information equation with the current state of the parameter_data dataframe.
Note
Equations that list fixed, tied or missing parameters are removed completely even if adjustable parameters are also listed in the equation. This method is called during Pst.write()
- _write_df(name, f, df, formatters, columns)
private method to write a dataframe to a control file
- sanity_checks(forgive=False)
some basic check for strangeness
- Parameters:
forgive (bool) – flag to forgive (warn) for issues. Default is False
Note
checks for duplicate names, atleast 1 adjustable parameter and at least 1 non-zero-weighted observation
Not nearly as comprehensive as pestchek
Example:
pst = pyemu.Pst("pest.pst") pst.sanity_checks()
- _write_version2(new_filename, use_pst_path=True, pst_rel_path='.')
private method to write a version 2 control file
- write(new_filename, version=None, check_interface=False)
main entry point to write a pest control file.
- Parameters:
new_filename (str) – name of the new pest control file
version (int) – flag for which version of control file to write (must be 1 or 2). if None, uses the number of pars to decide: if number of pars iis greater than 10,000, version 2 is used.
check_interface (bool) – flag to check the control file par and obs names against the names found in the template and instruction files. Default is False
Example:
pst = pyemu.Pst("my.pst") pst.parrep("my.par") pst.write(my_new.pst") #write a version 2 control file pst.write("my_new_v2.pst",version=2)
- _rectify_parchglim()
private method to just fix the parchglim vs cross zero issue
- _write_version1(new_filename)
private method to write a version 1 pest control file
- bounds_report(iterations=None)
report how many parameters are at bounds. If ensemble, the base enbsemble member is evaluated
- Parameters:
iterations ([int]) – a list of iterations for which a bounds report is requested If None, all iterations for which par files are located are reported. Default is None
- Returns:
- a pandas DataFrame object with rows being parameter groups and columns
<iter>_num_at_ub, <iter>_num_at_lb, and <iter>_total_at_bounds row 0 is total at bounds, subsequent rows correspond with groups
- Return type:
df
Example
pst = pyemu.Pst(“my.pst”) df = pst.bound_report(iterations=[0,2,3])
- get(par_names=None, obs_names=None)
get a new pst object with subset of parameters and/or observations
- Parameters:
par_names ([str]) – a list of parameter names to have in the new Pst instance. If None, all parameters are in the new Pst instance. Default is None
obs_names ([str]) – a list of observation names to have in the new Pst instance. If None, all observations are in teh new Pst instance. Default is None
- Returns:
a new Pst instance
- Return type:
Pst
Note
passing par_names as None and obs_names as None effectively generates a copy of the current Pst
Does not modify model i/o files - this is just a method for performing pyemu operations
Example:
pst = pyemu.Pst("pest.pst") new_pst = pst.get(pst.adj_par_names[0],pst.obs_names[:10])
- parrep(parfile=None, enforce_bounds=True, real_name=None, noptmax=0, binary_ens_file=False)
- replicates the pest parrep util. replaces the parval1 field in the
parameter data section dataframe with values in a PEST parameter file or a single realization from an ensemble parameter csv file
- Parameters:
parfile (str, optional) – parameter file to use. If None, try to find and use a parameter file that corresponds to the case name. If parfile has extension ‘.par’ a single realization parameter file is used If parfile has extention ‘.csv’ an ensemble parameter file is used which invokes real_name If parfile has extention ‘.jcb’ a binary ensemble parameter file is used which invokes real_name Default is None
enforce_bounds (bool, optional) – flag to enforce parameter bounds after parameter values are updated. This is useful because PEST and PEST++ round the parameter values in the par file, which may cause slight bound violations. Default is True
real_name (str or int, optional) – name of the ensemble realization to use for updating the parval1 value in the parameter data section dataframe. If None, try using “base”. If “base” not present, use the real_name with smallest index number. Ignored if parfile is of the PEST parameter file format (e.g. not en ensemble)
noptmax (int, optional) – Value with which to update the pst.control_data.noptmax value Default is 0.
binary_ens_file (bool) – If True, use binary format to load ensemble file, else assume it’s a CSV file
Example:
pst = pyemu.Pst("pest.pst") pst.parrep("pest.1.base.par") pst.control_data.noptmax = 0 pst.write("pest_1.pst")
- adjust_weights_discrepancy(resfile=None, original_ceiling=True, bygroups=False)
adjusts the weights of each non-zero weight observation based on the residual in the pest residual file so each observations contribution to phi is 1.0 (e.g. Mozorov’s discrepancy principal)
- Parameters:
resfile (str) – residual file name. If None, try to use a residual file with the Pst case name. Default is None
original_ceiling (bool) – flag to keep weights from increasing - this is generally a good idea. Default is True
bygroups (bool) – flag to adjust weights by groups. If False, the weight of each non-zero weighted observation is adjusted individually. If True, intergroup weighting is preserved (the contribution to each group is used) but this may result in some strangeness if some observations in a group have a really low phi already.
Example:
pst = pyemu.Pst("my.pst") print(pst.phi) #assumes "my.res" is colocated with "my.pst" pst.adjust_weights_discrepancy() print(pst.phi) # phi should equal number of non-zero observations
- _adjust_weights_by_phi_components(components, original_ceiling)
private method that resets the weights of observations by group to account for residual phi components.
- Parameters:
components (dict) – a dictionary of obs group:phi contribution pairs
original_ceiling (bool) – flag to keep weights from increasing.
- __reset_weights(target_phis, res_idxs, obs_idxs)
private method to reset weights based on target phi values for each group. This method should not be called directly
- Parameters:
target_phis (dict) – target phi contribution for groups to reweight
res_idxs (dict) – the index positions of each group of interest in the res dataframe
obs_idxs (dict) – the index positions of each group of interest in the observation data dataframe
- _adjust_weights_by_list(obslist, weight)
a private method to reset the weight for a list of observation names. Supports the data worth analyses in pyemu.Schur class. This method only adjusts observation weights in the current weight is nonzero. User beware!
- Parameters:
obslist ([str]) – list of observation names
weight (float) – new weight to assign
- adjust_weights(obs_dict=None, obsgrp_dict=None)
reset the weights of observations or observation groups to contribute a specified amount to the composite objective function
- Parameters:
obs_dict (dict, optional) – dictionary of observation name,new contribution pairs
obsgrp_dict (dict, optional) – dictionary of obs group name,contribution pairs
Notes
If a group is assigned a contribution of 0, all observations in that group will be assigned zero weight.
If a group is assigned a nonzero contribution AND all observations in that group start with zero weight, the observations will be assigned weight of 1.0 to allow balancing.
If groups obsgrp_dict is not passed, all nonzero
Example:
pst = pyemu.Pst("my.pst") # adjust a single observation pst.adjust_weights(obs_dict={"obs1":10}) # adjust a single observation group pst.adjust_weights(obsgrp_dict={"group1":100.0}) # make all non-zero weighted groups have a contribution of 100.0 balanced_groups = {grp:100 for grp in pst.nnz_obs_groups} pst.adjust_weights(obsgrp_dict=balanced_groups)
- proportional_weights(fraction_stdev=1.0, wmax=100.0, leave_zero=True)
setup weights inversely proportional to the observation value
- Parameters:
fraction_stdev (float, optional) – the fraction portion of the observation val to treat as the standard deviation. set to 1.0 for inversely proportional. Default is 1.0
wmax (float, optional) – maximum weight to allow. Default is 100.0
leave_zero (bool, optional) – flag to leave existing zero weights. Default is True
Example:
pst = pyemu.Pst("pest.pst") # set the weights of the observations to 20% of the observed value pst.proportional_weights(fraction_stdev=0.2,wmax=10) pst.write("pest_propo.pst")
- calculate_pertubations()
experimental method to calculate finite difference parameter pertubations.
Note
The pertubation values are added to the Pst.parameter_data attribute - user beware!
- build_increments()
experimental method to calculate parameter increments for use in the finite difference pertubation calculations
Note
user beware!
- add_transform_columns()
add transformed values to the Pst.parameter_data attribute
Note
adds parval1_trans, parlbnd_trans and parubnd_trans to Pst.parameter_data
Example:
pst = pyemu.Pst("pest.pst") pst.add_transform_columns() print(pst.parameter_data.parval1_trans
- enforce_bounds()
enforce bounds violation
Note
cheap enforcement of simply bringing violators back in bounds
Example:
pst = pyemu.Pst("pest.pst") pst.parrep("random.par") pst.enforce_bounds() pst.write("pest_rando.pst")
- classmethod from_io_files(tpl_files, in_files, ins_files, out_files, pst_filename=None, pst_path=None)
create a Pst instance from model interface files.
- Parameters:
tpl_files ([str]) – list of template file names
in_files ([str]) – list of model input file names (pairs with template files)
ins_files ([str]) – list of instruction file names
out_files ([str]) – list of model output file names (pairs with instruction files)
pst_filename (str) – name of control file to write. If None, no file is written. Default is None
pst_path ('str') – the path from the control file to the IO files. For example, if the control will be in the same directory as the IO files, then pst_path should be ‘.’. Default is None, which doesnt do any path manipulation on the I/O file names
- Returns:
new control file instance with parameter and observation names found in tpl_files and ins_files, repsectively.
- Return type:
Pst
Note
calls pyemu.helpers.pst_from_io_files()
Assigns generic values for parameter info. Tries to use INSCHEK to set somewhat meaningful observation values
all file paths are relatively to where python is running.
Example:
tpl_files = ["my.tpl"] in_files = ["my.in"] ins_files = ["my.ins"] out_files = ["my.out"] pst = pyemu.Pst.from_io_files(tpl_files,in_files,ins_files,out_files) pst.control_data.noptmax = 0 pst.write("my.pst)
- add_parameters(template_file, in_file=None, pst_path=None)
add new parameters to an existing control file
- Parameters:
template_file (str) – template file with (possibly) some new parameters
in_file (str) – model input file. If None, template_file.replace(‘.tpl’,’’) is used. Default is None.
pst_path (str) – the path to append to the template_file and in_file in the control file. If not None, then any existing path in front of the template or in file is split off and pst_path is prepended. If python is being run in a directory other than where the control file will reside, it is useful to pass pst_path as .. Default is None
- Returns:
the data for the new parameters that were added. If no new parameters are in the new template file, returns None
- Return type:
pandas.DataFrame
Note
populates the new parameter information with default values
Example:
pst = pyemu.Pst(os.path.join("template","my.pst")) pst.add_parameters(os.path.join("template","new_pars.dat.tpl",pst_path=".") pst.write(os.path.join("template","my_new.pst")
- drop_observations(ins_file, pst_path=None)
remove observations in an instruction file from the control file
- Parameters:
ins_file (str) – instruction file to remove
pst_path (str) – the path to append to the instruction file in the control file. If not None, then any existing path in front of the instruction is split off and pst_path is prepended. If python is being run in a directory other than where the control file will reside, it is useful to pass pst_path as .. Default is None
- Returns:
the observation data for the observations that were removed.
- Return type:
pandas.DataFrame
Example:
pst = pyemu.Pst(os.path.join("template", "my.pst")) pst.remove_observations(os.path.join("template","some_obs.dat.ins"), pst_path=".") pst.write(os.path.join("template", "my_new_with_less_obs.pst")
- drop_parameters(tpl_file, pst_path=None)
remove parameters in a template file from the control file
- Parameters:
tpl_file (str) – template file to remove
pst_path (str) – the path to append to the template file in the control file. If not None, then any existing path in front of the template or in file is split off and pst_path is prepended. If python is being run in a directory other than where the control file will reside, it is useful to pass pst_path as .. Default is None
- Returns:
the parameter data for the parameters that were removed.
- Return type:
pandas.DataFrame
Note
This method does not check for multiple occurences of the same parameter name(s) in across template files so if you have the same parameter in multiple template files, this is not the method you are looking for
Example:
pst = pyemu.Pst(os.path.join("template", "my.pst")) pst.remove_parameters(os.path.join("template","boring_zone_pars.dat.tpl"), pst_path=".") pst.write(os.path.join("template", "my_new_with_less_pars.pst")
- add_observations(ins_file, out_file=None, pst_path=None, inschek=True)
add new observations to a control file
- Parameters:
ins_file (str) – instruction file with exclusively new observation names
out_file (str) – model output file. If None, then ins_file.replace(“.ins”,””) is used. Default is None
pst_path (str) – the path to append to the instruction file and out file in the control file. If not None, then any existing path in front of the template or in file is split off and pst_path is prepended. If python is being run in a directory other than where the control file will reside, it is useful to pass pst_path as .. Default is None
inschek (bool) – flag to try to process the existing output file using the pyemu.InstructionFile class. If successful, processed outputs are used as obsvals
- Returns:
the data for the new observations that were added
- Return type:
pandas.DataFrame
Note
populates the new observation information with default values
Example:
pst = pyemu.Pst(os.path.join("template", "my.pst")) pst.add_observations(os.path.join("template","new_obs.dat.ins"), pst_path=".") pst.write(os.path.join("template", "my_new.pst")
- write_input_files(pst_path='.')
writes model input files using template files and current parval1 values.
- Parameters:
pst_path (str) – the path to where control file and template files reside. Default is ‘.’
Note
adds “parval1_trans” column to Pst.parameter_data that includes the effect of scale and offset
Example:
pst = pyemu.Pst("my.pst") # load final parameter values pst.parrep("my.par") # write new model input files with final parameter values pst.write_input_files()
- process_output_files(pst_path='.')
processing the model output files using the instruction files and existing model output files.
- Parameters:
pst_path (str) – relative path from where python is running to where the control file, instruction files and model output files are located. Default is “.” (current python directory)
- Returns:
model output values
- Return type:
pandas.Series
Note
requires a complete set of model input files at relative path from where python is running to pst_path
Example:
pst = pyemu.Pst("pest.pst") obsvals = pst.process_output_files() print(obsvals)
- get_res_stats(nonzero=True)
get some common residual stats by observation group.
- Parameters:
nonzero (bool) – calculate stats using only nonzero-weighted observations. This may seem obsvious to most users, but you never know….
- Returns:
a dataframe with columns for groups names and indices of statistic name.
- Return type:
pd.DataFrame
Note
Stats are derived from the current obsvals, weights and grouping in Pst.observation_data and the modelled values in Pst.res. The key here is ‘current’ because if obsval, weights and/or groupings have changed in Pst.observation_data since the residuals file was generated then the current values for obsval, weight and group are used
the normalized RMSE is normalized against the obsval range (max - min)
Example:
pst = pyemu.Pst("pest.pst") stats_df = pst.get_res_stats() print(stats_df.loc["mae",:])
- static _stats_rss(df)
- static _stats_mean(df)
- static _stats_mae(df)
- static _stats_rmse(df)
- static _stats_nrmse(df)
- plot(kind=None, **kwargs)
method to plot various parts of the control. This is sweet as!
- Parameters:
kind (str) – options are ‘prior’ (prior parameter histograms, ‘1to1’ (line of equality and sim vs res), ‘obs_v_sim’ (time series using datetime suffix), ‘phi_pie’ (pie chart of phi components)
kwargs (dict) – optional args for plots that are passed to pyemu plot helpers and ultimately to matplotlib
Note
Depending on ‘kind’ argument, a multipage pdf is written
Example:
pst = pyemu.Pst("my.pst") pst.plot(kind="1to1") # requires Pst.res pst.plot(kind="prior") pst.plot(kind="phi_pie")
- write_par_summary_table(filename=None, group_names=None, sigma_range=4.0, report_in_linear_space=False)
write a stand alone parameter summary latex table or Excel sheet
- Parameters:
filename (str) – filename. If None, use <case>.par.tex to write as LaTeX. If filename extention is ‘.xls’ or ‘.xlsx’, tries to write as an Excel file. If filename is “none”, no table is written Default is None
group_names (dict) – par group names : table names. For example {“w0”:”well stress period 1”}. Default is None
sigma_range (float) – number of standard deviations represented by parameter bounds. Default is 4.0, implying 95% confidence bounds
report_in_linear_space (bool) – flag, if True, that reports all logtransformed values in linear space. This renders standard deviation meaningless, so that column is skipped
- Returns:
the summary parameter group dataframe
- Return type:
pandas.DataFrame
Example:
pst = pyemu.Pst("my.pst") pst.write_par_summary_table(filename="par.tex")
- write_obs_summary_table(filename=None, group_names=None)
- write a stand alone observation summary latex table or Excel shet
- filename (str): filename. If None, use <case>.par.tex to write as LaTeX. If filename extention is ‘.xls’ or ‘.xlsx’,
tries to write as an Excel file. If filename is “none”, no table is written Default is None
- Parameters:
filename (str) – filename. If filename is “none”, no table is written. If None, use <case>.obs.tex. If filename extention is ‘.xls’ or ‘.xlsx’, tries to write as an Excel file. Default is None
group_names (dict) – obs group names : table names. For example {“hds”:”simulated groundwater level”}. Default is None
- Returns:
the summary observation group dataframe
- Return type:
pandas.DataFrame
Example:
pst = pyemu.Pst("my.pst") pst.write_obs_summary_table(filename="obs.tex")
- get_par_change_limits()
calculate the various parameter change limits used in pest.
- Returns:
a copy of Pst.parameter_data with columns for relative and factor change limits
- Return type:
pandas.DataFrame
Note
does not yet support absolute parameter change limits!
Works in control file values space (not log transformed space). Also adds columns for effective upper and lower which account for par bounds and the value of parchglim
example:
pst = pyemu.Pst("pest.pst") df = pst.get_par_change_limits() print(df.chg_lower)
- get_adj_pars_at_bounds(frac_tol=0.01)
get list of adjustable parameter at/near bounds
- Parameters:
frac_tol ('float`) – fractional tolerance of distance to bound. For upper bound, the value parubnd * (1-frac_tol) is used, lower bound uses parlbnd * (1.0 + frac_tol)
- Returns:
[`str`]: list of parameters at/near lower bound
[`str`]: list of parameters at/near upper bound
- Return type:
tuple containing
Example:
pst = pyemu.Pst("pest.pst") at_lb,at_ub = pst.get_adj_pars_at_bounds() print("pars at lower bound",at_lb)
- try_parse_name_metadata()
try to add meta data columns to parameter and observation data based on item names. Used with the PstFrom process.
Note
metadata is identified in key-value pairs that are separated by a colon. each key-value pair is separated from others by underscore
This works with PstFrom style long names
This method is called programmtically during Pst.load()
- rename_parameters(name_dict, pst_path='.', tplmap=None)
rename parameters in the control and template files
- Parameters:
name_dict (dict) – mapping of current to new names.
pst_path (str) – the path to the control file from where python is running. Default is “.” (python is running in the same directory as the control file)
Note
no attempt is made to maintain the length of the marker strings in the template files, so if your model is sensitive to changes in spacing in the template file(s), this is not a method for you
This does a lot of string compare, so its gonna be slow as…
Example:
pst = pyemu.Pst(os.path.join("template","pest.pst")) name_dict = {"par1":"par1_better_name"} pst.rename_parameters(name_dict,pst_path="template")
- rename_observations(name_dict, pst_path='.', insmap=None)
rename observations in the control and instruction files
- Parameters:
name_dict (dict) – mapping of current to new names.
pst_path (str) – the path to the control file from where python is running. Default is “.” (python is running in the same directory as the control file)
Note
This does a lot of string compare, so its gonna be slow as…
Example:
pst = pyemu.Pst(os.path.join("template","pest.pst")) name_dict = {"obs1":"obs1_better_name"} pst.rename_observations(name_dict,pst_path="template")
- class pyemu.Schur(jco, **kwargs)
Bases:
pyemu.la.LinearAnalysis
FOSM-based uncertainty and data-worth analysis
- Parameters:
jco (varies, optional) – something that can be cast or loaded into a pyemu.Jco. Can be a str for a filename or pyemu.Matrix/pyemu.Jco object.
pst (varies, optional) – something that can be cast into a pyemu.Pst. Can be an str for a filename or an existing pyemu.Pst. If None, a pst filename is sought with the same base name as the jco argument (if passed)
parcov (varies, optional) – prior parameter covariance matrix. If str, a filename is assumed and the prior parameter covariance matrix is loaded from a file using the file extension (“.jcb”/”.jco” for binary, “.cov”/”.mat” for PEST-style ASCII matrix, or “.unc” for uncertainty files). If None, the prior parameter covariance matrix is constructed from the parameter bounds in LinearAnalysis.pst. Can also be a pyemu.Cov instance
obscov (varies, optional) – observation noise covariance matrix. If str, a filename is assumed and the noise covariance matrix is loaded from a file using the file extension (“.jcb”/”.jco” for binary, “.cov”/”.mat” for PEST-style ASCII matrix, or “.unc” for uncertainty files). If None, the noise covariance matrix is constructed from the obsevation weights in LinearAnalysis.pst. Can also be a pyemu.Cov instance
forecasts (varies, optional) – forecast sensitivity vectors. If str, first an observation name is assumed (a row in LinearAnalysis.jco). If that is not found, a filename is assumed and predictions are loaded from a file using the file extension. If [str], a list of observation names is assumed. Can also be a pyemu.Matrix instance, a numpy.ndarray or a collection. Note if the PEST++ option “++forecasts()” is set in the pest control file (under the pyemu.Pst.pestpp_options dictionary), then there is no need to pass this argument (unless you want to analyze different forecasts) of pyemu.Matrix or numpy.ndarray.
ref_var (float, optional) – reference variance. Default is 1.0
verbose (bool) – controls screen output. If str, a filename is assumed and and log file is written.
sigma_range (float, optional) – defines range of upper bound - lower bound in terms of standard deviation (sigma). For example, if sigma_range = 4, the bounds represent 4 * sigma. Default is 4.0, representing approximately 95% confidence of implied normal distribution. This arg is only used if constructing parcov from parameter bounds.
scale_offset (bool, optional) – flag to apply parameter scale and offset to parameter bounds when calculating prior parameter covariance matrix from bounds. This arg is onlyused if constructing parcov from parameter bounds.Default is True.
Note
This class is the primary entry point for FOSM-based uncertainty and dataworth analyses
This class replicates and extends the behavior of the PEST PREDUNC utilities.
Example:
#assumes "my.pst" exists sc = pyemu.Schur(jco="my.jco",forecasts=["fore1","fore2"]) print(sc.get_forecast_summary()) print(sc.get_parameter_contribution())
- property posterior_parameter
posterior parameter covariance matrix.
- Returns:
the posterior parameter covariance matrix
- Return type:
pyemu.Cov
Example:
sc = pyemu.Schur(jco="my.jcb") post_cov = sc.posterior_parameter post_cov.to_ascii("post.cov")
- property posterior_forecast
posterior forecast (e.g. prediction) variance(s)
- Returns:
dictionary of forecast names and FOSM-estimated posterior variances
- Return type:
dict
Note
Sames as LinearAnalysis.posterior_prediction
See Schur.get_forecast_summary() for a dataframe-based container of prior and posterior variances
- property posterior_prediction
posterior prediction (e.g. forecast) variance estimate(s)
- Returns:
dictionary of forecast names and FOSM-estimated posterior variances
- Return type:
dict
- Note:
sames as LinearAnalysis.posterior_forecast
See Schur.get_forecast_summary() for a dataframe-based container of prior and posterior variances
- get_parameter_summary()
summary of the FOSM-based parameter uncertainty (variance) estimate(s)
- Returns:
dataframe of prior,posterior variances and percent uncertainty reduction of each parameter
- Return type:
pandas.DataFrame
Note
This is the primary entry point for accessing parameter uncertainty estimates
The “Prior” column in dataframe is the diagonal of LinearAnalysis.parcov “precent_reduction” column in dataframe is calculated as 100.0 * (1.0 - (posterior variance / prior variance)
Example:
sc = pyemu.Schur(jco="my.jcb",forecasts=["fore1","fore2"]) df = sc.get_parameter_summary() df.loc[:,["prior","posterior"]].plot(kind="bar") plt.show() df.percent_reduction.plot(kind="bar") plt.show()
- get_forecast_summary()
summary of the FOSM-based forecast uncertainty (variance) estimate(s)
- Returns:
dataframe of prior,posterior variances and percent uncertainty reduction of each forecast (e.g. prediction)
- Return type:
pandas.DataFrame
Note
This is the primary entry point for accessing forecast uncertainty estimates “precent_reduction” column in dataframe is calculated as 100.0 * (1.0 - (posterior variance / prior variance)
Example:
sc = pyemu.Schur(jco="my.jcb",forecasts=["fore1","fore2"]) df = sc.get_parameter_summary() df.loc[:,["prior","posterior"]].plot(kind="bar") plt.show() df.percent_reduction.plot(kind="bar") plt.show()
- __contribution_from_parameters(parameter_names)
private method get the prior and posterior uncertainty reduction as a result of some parameter becoming perfectly known
- get_conditional_instance(parameter_names)
get a new pyemu.Schur instance that includes conditional update from some parameters becoming known perfectly
- Parameters:
parameter_names ([str]) – list of parameters that are to be treated as notionally perfectly known
- Returns:
a new Schur instance conditional on perfect knowledge of some parameters. The new instance has an updated parcov that is less the names listed in parameter_names.
- Return type:
pyemu.Schur
Note
This method is primarily for use by the LinearAnalysis.get_parameter_contribution() dataworth method.
- get_par_contribution(parlist_dict=None, include_prior_results=False)
A dataworth method to get a dataframe the prior and posterior uncertainty reduction as a result of some parameter becoming perfectly known
- Parameters:
parlist_dict – (dict, optional): a nested dictionary-list of groups of parameters that are to be treated as perfectly known. key values become row labels in returned dataframe. If None, each adjustable parameter is sequentially treated as known and the returned dataframe has row labels for each adjustable parameter
include_prior_results (bool, optional) – flag to return a multi-indexed dataframe with both conditional prior and posterior forecast uncertainty estimates. This is because the notional learning about parameters potentially effects both the prior and posterior forecast uncertainty estimates. If False, only posterior results are returned. Default is False
- Returns:
a dataframe that summarizes the parameter contribution dataworth analysis. The dataframe has index (row labels) of the keys in parlist_dict and a column labels of forecast names. The values in the dataframe are the posterior variance of the forecast conditional on perfect knowledge of the parameters in the values of parlist_dict. One row in the dataframe will be labeled base - this is the forecast uncertainty estimates that include the effects of all adjustable parameters. Percent decreases in forecast uncertainty can be calculated by differencing all rows against the “base” row. Varies depending on include_prior_results.
- Return type:
pandas.DataFrame
Note
This is the primary dataworth method for assessing the contribution of one or more parameters to forecast uncertainty.
Example:
sc = pyemu.Schur(jco="my.jco") parlist_dict = {"hk":["hk1","hk2"],"rech"["rech1","rech2"]} df = sc.get_par_contribution(parlist_dict=parlist_dict)
- get_par_group_contribution(include_prior_results=False)
A dataworth method to get the forecast uncertainty contribution from each parameter group
- Parameters:
include_prior_results (bool, optional) – flag to return a multi-indexed dataframe with both conditional prior and posterior forecast uncertainty estimates. This is because the notional learning about parameters potentially effects both the prior and posterior forecast uncertainty estimates. If False, only posterior results are returned. Default is False
- Returns:
a dataframe that summarizes the parameter contribution analysis. The dataframe has index (row labels) that are the parameter group names and a column labels of forecast names. The values in the dataframe are the posterior variance of the forecast conditional on perfect knowledge of the adjustable parameters in each parameter group. One row is labelled “base” - this is the variance of the forecasts that includes the effects of all adjustable parameters. Varies depending on include_prior_results.
- Return type:
pandas.DataFrame
Note
This method is just a thin wrapper around get_contribution_dataframe() - this method automatically constructs the parlist_dict argument where the keys are the group names and the values are the adjustable parameters in the groups
Example:
sc = pyemu.Schur(jco="my.jco") df = sc.get_par_group_contribution()
- get_added_obs_importance(obslist_dict=None, base_obslist=None, reset_zero_weight=1.0)
- A dataworth method to analyze the posterior uncertainty as a result of gathering
some additional observations
- Parameters:
obslist_dict (dict, optional) – a nested dictionary-list of groups of observations that are to be treated as gained/collected. key values become row labels in returned dataframe. If None, then every zero-weighted observation is tested sequentially. Default is None
base_obslist ([str], optional) – observation names to treat as the “existing” observations. The values of obslist_dict will be added to this list during each test. If None, then the values in each obslist_dict entry will be treated as the entire calibration dataset. That is, there are no existing observations. Default is None. Standard practice would be to pass this argument as pyemu.Schur.pst.nnz_obs_names so that existing, non-zero-weighted observations are accounted for in evaluating the worth of new yet-to-be-collected observations.
reset_zero_weight (float, optional) – a flag to reset observations with zero weight in obslist_dict If reset_zero_weights passed as 0.0, no weights adjustments are made. Default is 1.0.
- Returns:
a dataframe with row labels (index) of obslist_dict.keys() and columns of forecast names. The values in the dataframe are the posterior variance of the forecasts resulting from notional inversion using the observations in obslist_dict[key value] plus the observations in base_obslist (if any). One row in the dataframe is labeled “base” - this is posterior forecast variance resulting from the notional calibration with the observations in base_obslist (if base_obslist is None, then the “base” row is the prior forecast variance). Conceptually, the forecast variance should either not change or decrease as a result of gaining additional observations. The magnitude of the decrease represents the worth of the potential new observation(s) being tested.
- Return type:
pandas.DataFrame
Note
Observations listed in base_obslist is required to only include observations with weight not equal to zero. If zero-weighted observations are in base_obslist an exception will be thrown. In most cases, users will want to reset zero-weighted observations as part dataworth testing process. If reset_zero_weights == 0, no weights adjustments will be made - this is most appropriate if different weights are assigned to the added observation values in Schur.pst
Example:
sc = pyemu.Schur("my.jco") obslist_dict = {"hds":["head1","head2"],"flux":["flux1","flux2"]} df = sc.get_added_obs_importance(obslist_dict=obslist_dict, base_obslist=sc.pst.nnz_obs_names, reset_zero_weight=1.0)
- get_removed_obs_importance(obslist_dict=None, reset_zero_weight=None)
- A dataworth method to analyze the posterior uncertainty as a result of losing
some existing observations
- Parameters:
obslist_dict (dict, optional) – a nested dictionary-list of groups of observations that are to be treated as lost. key values become row labels in returned dataframe. If None, then every zero-weighted observation is tested sequentially. Default is None
DEPRECATED (reset_zero_weight) –
- Returns:
A dataframe with index of obslist_dict.keys() and columns of forecast names. The values in the dataframe are the posterior variances of the forecasts resulting from losing the information contained in obslist_dict[key value]. One row in the dataframe is labeled “base” - this is posterior forecast variance resulting from the notional calibration with the non-zero-weighed observations in Schur.pst. Conceptually, the forecast variance should either not change or increase as a result of losing existing observations. The magnitude of the increase represents the worth of the existing observation(s) being tested.
Note: All observations that may be evaluated as removed must have non-zero weight
- Return type:
pandas.DataFrame
Example:
sc = pyemu.Schur("my.jco") df = sc.get_removed_obs_importance()
- get_obs_group_dict()
get a dictionary of observations grouped by observation group name
- Returns:
a dictionary of observations grouped by observation group name. Useful for dataworth processing in pyemu.Schur
- Return type:
dict
Note
only includes observations that are listed in Schur.jco.row_names
Example:
sc = pyemu.Schur("my.jco") obsgrp_dict = sc.get_obs_group_dict() df = sc.get_removed_obs_importance(obsgrp_dict=obsgrp_dict)
- get_removed_obs_group_importance(reset_zero_weight=None)
- A dataworth method to analyze the posterior uncertainty as a result of losing
existing observations, tested by observation groups
- Parameters:
DEPRECATED (reset_zero_weight) –
- Returns:
A dataframe with index of observation group names and columns of forecast names. The values in the dataframe are the posterior variances of the forecasts resulting from losing the information contained in each observation group. One row in the dataframe is labeled “base” - this is posterior forecast variance resulting from the notional calibration with the non-zero-weighed observations in Schur.pst. Conceptually, the forecast variance should either not change or increase as a result of losing existing observations. The magnitude of the increase represents the worth of the existing observation(s) in each group being tested.
- Return type:
pandas.DataFrame
Example:
sc = pyemu.Schur("my.jco") df = sc.get_removed_obs_group_importance()
- get_added_obs_group_importance(reset_zero_weight=1.0)
- A dataworth method to analyze the posterior uncertainty as a result of gaining
existing observations, tested by observation groups
- Parameters:
reset_zero_weight (float, optional) – a flag to reset observations with zero weight in obslist_dict If reset_zero_weights passed as 0.0, no weights adjustments are made. Default is 1.0.
- Returns:
A dataframe with index of observation group names and columns of forecast names. The values in the dataframe are the posterior variances of the forecasts resulting from gaining the information contained in each observation group. One row in the dataframe is labeled “base” - this is posterior forecast variance resulting from the notional calibration with the non-zero-weighed observations in Schur.pst. Conceptually, the forecast variance should either not change or decrease as a result of gaining new observations. The magnitude of the decrease represents the worth of the potential new observation(s) in each group being tested.
- Return type:
pandas.DataFrame
Note
Observations in Schur.pst with zero weights are not included in the analysis unless reset_zero_weight is a float greater than zero. In most cases, users will want to reset zero-weighted observations as part dataworth testing process.
Example:
sc = pyemu.Schur("my.jco") df = sc.get_added_obs_group_importance(reset_zero_weight=1.0)
- next_most_important_added_obs(forecast=None, niter=3, obslist_dict=None, base_obslist=None, reset_zero_weight=1.0)
find the most important observation(s) by sequentially evaluating the importance of the observations in obslist_dict.
- Parameters:
forecast (str, optional) – name of the forecast to use in the ranking process. If more than one forecast has been listed, this argument is required. This is because the data worth must be ranked with respect to the variance reduction for a single forecast
niter (int, optional) – number of sequential dataworth testing iterations. Default is 3
obslist_dict (dict, optional) – a nested dictionary-list of groups of observations that are to be treated as gained/collected. key values become row labels in returned dataframe. If None, then every zero-weighted observation is tested sequentially. Default is None
base_obslist ([str], optional) – observation names to treat as the “existing” observations. The values of obslist_dict will be added to this list during each test. If None, then the values in each obslist_dict entry will be treated as the entire calibration dataset. That is, there are no existing observations. Default is None. Standard practice would be to pass this argument as pyemu.Schur.pst.nnz_obs_names so that existing, non-zero-weighted observations are accounted for in evaluating the worth of new yet-to-be-collected observations.
reset_zero_weight (float, optional) – a flag to reset observations with zero weight in obslist_dict If reset_zero_weights passed as 0.0, no weights adjustments are made. Default is 1.0.
- Returns:
a dataFrame with columns of obslist_dict key for each iteration the yields the largest variance reduction for the named forecast. Columns are forecast variance percent reduction for each iteration (percent reduction compared to initial “base” case with all non-zero weighted observations included in the notional calibration)
- Return type:
pandas.DataFrame
Note
The most important observations from each iteration is added to base_obslist and removed obslist_dict for the next iteration. In this way, the added observation importance values include the conditional information from the last iteration.
Example:
sc = pyemu.Schur(jco="my.jco") df = sc.next_most_important_added_obs(forecast="fore1",base_obslist=sc.pst.nnz_obs_names)
- next_most_par_contribution(niter=3, forecast=None, parlist_dict=None)
find the parameter(s) contributing most to posterior forecast by sequentially evaluating the contribution of parameters in parlist_dict.
- Parameters:
forecast (str, optional) – name of the forecast to use in the ranking process. If more than one forecast has been listed, this argument is required. This is because the data worth must be ranked with respect to the variance reduction for a single forecast
niter (int, optional) – number of sequential dataworth testing iterations. Default is 3
parlist_dict (dict, optional) – dict a nested dictionary-list of groups of parameters that are to be treated as perfectly known. key values become row labels in dataframe
parlist_dict – a nested dictionary-list of groups of parameters that are to be treated as perfectly known (zero uncertainty). key values become row labels in returned dataframe. If None, then every adustable parameter is tested sequentially. Default is None. Conceptually, the forecast variance should either not change or decrease as a result of knowing parameter perfectly. The magnitude of the decrease represents the worth of gathering information about the parameter(s) being tested.
Note
The largest contributing parameters from each iteration are treated as known perfectly for the remaining iterations. In this way, the next iteration seeks the next most influential group of parameters.
- Returns:
a dataframe with index of iteration number and columns of parlist_dict.keys(). The values are the results of the knowing each parlist_dict entry expressed as posterior variance reduction
- Return type:
pandas.DataFrame