Skip to content

sc

module for FOSM-based uncertainty analysis using a linearized form of Bayes equation known as the Schur compliment

Schur

Bases: LinearAnalysis

FOSM-based uncertainty and data-worth analysis

Parameters:

Name Type Description Default
jco varies

something that can be cast or loaded into a pyemu.Jco. Can be a str for a filename or pyemu.Matrix/pyemu.Jco object.

required
pst varies

something that can be cast into a pyemu.Pst. Can be an str for a filename or an existing pyemu.Pst. If None, a pst filename is sought with the same base name as the jco argument (if passed)

required
parcov varies

prior parameter covariance matrix. If str, a filename is assumed and the prior parameter covariance matrix is loaded from a file using the file extension (".jcb"/".jco" for binary, ".cov"/".mat" for PEST-style ASCII matrix, or ".unc" for uncertainty files). If None, the prior parameter covariance matrix is constructed from the parameter bounds in LinearAnalysis.pst. Can also be a pyemu.Cov instance

required
obscov varies

observation noise covariance matrix. If str, a filename is assumed and the noise covariance matrix is loaded from a file using the file extension (".jcb"/".jco" for binary, ".cov"/".mat" for PEST-style ASCII matrix, or ".unc" for uncertainty files). If None, the noise covariance matrix is constructed from the observation weights in LinearAnalysis.pst. Can also be a pyemu.Cov instance

required
forecasts varies

forecast sensitivity vectors. If str, first an observation name is assumed (a row in LinearAnalysis.jco). If that is not found, a filename is assumed and predictions are loaded from a file using the file extension. If [str], a list of observation names is assumed. Can also be a pyemu.Matrix instance, a numpy.ndarray or a collection. Note if the PEST++ option "++forecasts()" is set in the pest control file (under the pyemu.Pst.pestpp_options dictionary), then there is no need to pass this argument (unless you want to analyze different forecasts) of pyemu.Matrix or numpy.ndarray.

required
ref_var float

reference variance. Default is 1.0

required
verbose `bool`

controls screen output. If str, a filename is assumed and and log file is written.

required
sigma_range `float`

defines range of upper bound - lower bound in terms of standard deviation (sigma). For example, if sigma_range = 4, the bounds represent 4 * sigma. Default is 4.0, representing approximately 95% confidence of implied normal distribution. This arg is only used if constructing parcov from parameter bounds.

required
scale_offset `bool`

flag to apply parameter scale and offset to parameter bounds when calculating prior parameter covariance matrix from bounds. This arg is onlyused if constructing parcov from parameter bounds.Default is True.

required
Note

This class is the primary entry point for FOSM-based uncertainty and dataworth analyses

This class replicates and extends the behavior of the PEST PREDUNC utilities.

Example::

#assumes "my.pst" exists
sc = pyemu.Schur(jco="my.jco",forecasts=["fore1","fore2"])
print(sc.get_forecast_summary())
print(sc.get_parameter_contribution())

adj_par_names property

adjustable parameter names

Returns:

Type Description

['str`]: list of adjustable parameter names

Note

if LinearAnalysis.pst is None, returns LinearAnalysis.jco.col_names

fehalf property

Karhunen-Loeve scaling matrix attribute.

Returns:

Type Description

pyemu.Matrix: the Karhunen-Loeve scaling matrix based on the prior

parameter covariance matrix

forecast_names property

get the forecast (aka prediction) names

Returns:

Type Description
[`str`]

list of forecast names

forecasts property

the forecast sentivity vectors attribute

Returns:

Type Description

pyemu.Matrix: a matrix of forecast (prediction) sensitivity vectors (column wise)

forecasts_iter property

forecast (e.g. prediction) sensitivity vectors iterator

Returns:

Type Description

iterator: iterator on forecasts (e.g. predictions) sensitivity vectors (matrix)

Note

This is used for processing huge numbers of predictions

This is a synonym for LinearAnalysis.predictions_iter()

jco property

the jacobian matrix attribute

Returns:

Type Description

pyemu.Jco: the jacobian matrix attribute

mle_covariance property

maximum likelihood parameter covariance matrix.

Returns:

Type Description

pyemu.Matrix: maximum likelihood parameter covariance matrix

mle_parameter_estimate property

maximum likelihood parameter estimate.

Returns:

Type Description

pandas.Series: the maximum likelihood parameter estimates

nnz_obs_names property

non-zero-weighted observation names

Returns:

Type Description

['str`]: list of non-zero-weighted observation names

Note

if LinearAnalysis.pst is None, returns LinearAnalysis.jco.row_names

obscov property

get the observation noise covariance matrix attribute

Returns:

Type Description

pyemu.Cov: a reference to the LinearAnalysis.obscov attribute

parcov property

get the prior parameter covariance matrix attribute

Returns:

Type Description

pyemu.Cov: a reference to the LinearAnalysis.parcov attribute

posterior_forecast property

posterior forecast (e.g. prediction) variance(s)

Returns:

Type Description

dict: dictionary of forecast names and FOSM-estimated posterior

variances

Note

Sames as LinearAnalysis.posterior_prediction

See Schur.get_forecast_summary() for a dataframe-based container of prior and posterior variances

posterior_parameter property

posterior parameter covariance matrix.

Returns:

Type Description

pyemu.Cov: the posterior parameter covariance matrix

Example::

sc = pyemu.Schur(jco="my.jcb")
post_cov = sc.posterior_parameter
post_cov.to_ascii("post.cov")

posterior_prediction property

posterior prediction (e.g. forecast) variance estimate(s)

Returns:

Type Description

dict: dictionary of forecast names and FOSM-estimated posterior

variances

Note: sames as LinearAnalysis.posterior_forecast

 See `Schur.get_forecast_summary()` for a dataframe-based container of prior and posterior
 variances

predictions property

the prediction (aka forecast) sentivity vectors attribute

Returns:

Type Description

pyemu.Matrix: a matrix of prediction sensitivity vectors (column wise)

predictions_iter property

prediction sensitivity vectors iterator

Returns:

Type Description

iterator: iterator on prediction sensitivity vectors (matrix)

Note

this is used for processing huge numbers of predictions

prior_forecast property

prior forecast (e.g. prediction) variances

Returns:

Type Description

dict: a dictionary of forecast name, prior variance pairs

prior_parameter property

prior parameter covariance matrix

Returns:

Type Description

pyemu.Cov: prior parameter covariance matrix

prior_prediction property

prior prediction (e.g. forecast) variances

Returns:

Type Description

dict: a dictionary of prediction name, prior variance pairs

pst property

the pst attribute

Returns:

Type Description

pyemu.Pst: the pst attribute

qhalf property

square root of the cofactor matrix attribute. Create the attribute if it has not yet been created

Returns:

Type Description

pyemu.Matrix: square root of the cofactor matrix

qhalfx property

half normal matrix attribute.

Returns:

Type Description

pyemu.Matrix: half normal matrix attribute

xtqx property

normal matrix attribute.

Returns:

Type Description

pyemu.Matrix: normal matrix attribute

__contribution_from_parameters(parameter_names)

private method get the prior and posterior uncertainty reduction as a result of some parameter becoming perfectly known

__fromfile(filename, astype=None)

a private method to deduce and load a filename into a matrix object. Uses extension: 'jco' or 'jcb': binary, 'mat','vec' or 'cov': ASCII, 'unc': pest uncertainty file.

__load_jco()

private method to set the jco attribute from a file or a matrix object

__load_obscov()

private method to set the obscov attribute from: a pest control file (observation weights) a pst object a matrix object an uncert file an ascii matrix file

__load_parcov()

private method to set the parcov attribute from: a pest control file (parameter bounds) a pst object a matrix object an uncert file an ascii matrix file

__load_predictions()

private method set the predictions attribute from

mixed list of row names, matrix files and ndarrays a single row name an ascii file

can be none if only interested in parameters.

__load_pst()

private method set the pst attribute

adjust_obscov_resfile(resfile=None)

reset the elements of obscov by scaling the implied weights based on the phi components in res_file so that the total phi is equal to the number of non-zero weights.

Parameters:

Name Type Description Default
resfile `str`

residual file to use. If None, residual file with case name is sought. default is None

None
Note

calls pyemu.Pst.adjust_weights_resfile()

apply_karhunen_loeve_scaling()

apply karhuene-loeve scaling to the jacobian matrix.

Note:

This scaling is not necessary for analyses using Schur's
complement, but can be very important for error variance
analyses.  This operation effectively transfers prior knowledge
specified in the parcov to the jacobian and reset parcov to the
identity matrix.

clean()

drop regularization and prior information observation from the jco

drop_prior_information()

drop the prior information from the jco and pst attributes

get(par_names=None, obs_names=None, astype=None)

method to get a new LinearAnalysis class using a subset of parameters and/or observations

Parameters:

Name Type Description Default
par_names [`'str`]

par names for new object

None
obs_names [`'str`]

obs names for new object

None
astype `pyemu.Schur` or `pyemu.ErrVar`

type to cast the new object. If None, return type is same as self

None

Returns:

Type Description

LinearAnalysis: new instance

get_added_obs_group_importance(reset_zero_weight=1.0)

A dataworth method to analyze the posterior uncertainty as a result of gaining existing observations, tested by observation groups

Returns:

Type Description

pandas.DataFrame: A dataframe with index of observation group names and columns

of forecast names. The values in the dataframe are the posterior

variances of the forecasts resulting from gaining the information

contained in each observation group. One row in the dataframe is labeled "base" - this is

posterior forecast variance resulting from the notional calibration with the

non-zero-weighed observations in Schur.pst. Conceptually, the forecast variance should

either not change or decrease as a result of gaining new observations. The magnitude

of the decrease represents the worth of the potential new observation(s) in each group

being tested.

Note

Observations in Schur.pst with zero weights are not included in the analysis unless reset_zero_weight is a float greater than zero. In most cases, users will want to reset zero-weighted observations as part dataworth testing process.

Example::

sc = pyemu.Schur("my.jco")
df = sc.get_added_obs_group_importance(reset_zero_weight=1.0)

get_added_obs_importance(obslist_dict=None, base_obslist=None, reset_zero_weight=1.0)

A dataworth method to analyze the posterior uncertainty as a result of gathering some additional observations

Parameters:

Name Type Description Default
obslist_dict `dict`

a nested dictionary-list of groups of observations that are to be treated as gained/collected. key values become row labels in returned dataframe. If None, then every zero-weighted observation is tested sequentially. Default is None

None
base_obslist [`str`]

observation names to treat as the "existing" observations. The values of obslist_dict will be added to this list during each test. If None, then the values in each obslist_dict entry will be treated as the entire calibration dataset. That is, there are no existing observations. Default is None. Standard practice would be to pass this argument as pyemu.Schur.pst.nnz_obs_names so that existing, non-zero-weighted observations are accounted for in evaluating the worth of new yet-to-be-collected observations.

None

Returns:

Type Description

pandas.DataFrame: a dataframe with row labels (index) of obslist_dict.keys() and

columns of forecast names. The values in the dataframe are the

posterior variance of the forecasts resulting from notional inversion

using the observations in obslist_dict[key value] plus the observations

in base_obslist (if any). One row in the dataframe is labeled "base" - this is

posterior forecast variance resulting from the notional calibration with the

observations in base_obslist (if base_obslist is None, then the "base" row is the

prior forecast variance). Conceptually, the forecast variance should either not change or

decrease as a result of gaining additional observations. The magnitude of the decrease

represents the worth of the potential new observation(s) being tested.

Note

Observations listed in base_obslist is required to only include observations with weight not equal to zero. If zero-weighted observations are in base_obslist an exception will be thrown. In most cases, users will want to reset zero-weighted observations as part dataworth testing process. If reset_zero_weights == 0, no weights adjustments will be made - this is most appropriate if different weights are assigned to the added observation values in Schur.pst

Example::

sc = pyemu.Schur("my.jco")
obslist_dict = {"hds":["head1","head2"],"flux":["flux1","flux2"]}
df = sc.get_added_obs_importance(obslist_dict=obslist_dict,
                                 base_obslist=sc.pst.nnz_obs_names,
                                 reset_zero_weight=1.0)

get_conditional_instance(parameter_names)

get a new pyemu.Schur instance that includes conditional update from some parameters becoming known perfectly

Parameters:

Name Type Description Default
parameter_names [`str`]

list of parameters that are to be treated as notionally perfectly known

required

Returns:

Type Description

pyemu.Schur: a new Schur instance conditional on perfect knowledge

of some parameters. The new instance has an updated parcov that is less

the names listed in parameter_names.

Note

This method is primarily for use by the LinearAnalysis.get_parameter_contribution() dataworth method.

get_cso_dataframe()

get a dataframe of composite observation sensitivity, as returned by PEST in the seo file.

Returns:

Type Description

pandas.DataFrame: dataframe of observation names and composite observation

sensitivity

Note

That this formulation deviates slightly from the PEST documentation in that the values are divided by (npar-1) rather than by (npar).

The equation is cso_j = ((Q^1/2JJ^T*Q^1/2)^1/2)_jj/(NPAR-1)

get_forecast_summary()

summary of the FOSM-based forecast uncertainty (variance) estimate(s)

Returns:

Type Description

pandas.DataFrame: dataframe of prior,posterior variances and percent

uncertainty reduction of each forecast (e.g. prediction)

Note

This is the primary entry point for accessing forecast uncertainty estimates "precent_reduction" column in dataframe is calculated as 100.0 * (1.0 - (posterior variance / prior variance)

Example::

sc = pyemu.Schur(jco="my.jcb",forecasts=["fore1","fore2"])
df = sc.get_parameter_summary()
df.loc[:,["prior","posterior"]].plot(kind="bar")
plt.show()
df.percent_reduction.plot(kind="bar")
plt.show()

get_obs_competition_dataframe()

get the observation competition stat a la PEST utility

Returns:

Type Description

pandas.DataFrame: a dataframe of observation names by

observation names with values equal to the PEST

competition statistic

get_obs_group_dict()

get a dictionary of observations grouped by observation group name

Returns:

Type Description

dict: a dictionary of observations grouped by observation group name.

Useful for dataworth processing in pyemu.Schur

Note

only includes observations that are listed in Schur.jco.row_names

Example::

sc = pyemu.Schur("my.jco")
obsgrp_dict = sc.get_obs_group_dict()
df = sc.get_removed_obs_importance(obsgrp_dict=obsgrp_dict)

get_par_contribution(parlist_dict=None, include_prior_results=False)

A dataworth method to get a dataframe the prior and posterior uncertainty reduction as a result of some parameter becoming perfectly known

Parameters:

Name Type Description Default
parlist_dict

(dict, optional): a nested dictionary-list of groups of parameters that are to be treated as perfectly known. key values become row labels in returned dataframe. If None, each adjustable parameter is sequentially treated as known and the returned dataframe has row labels for each adjustable parameter

required
include_prior_results `bool`

flag to return a multi-indexed dataframe with both conditional prior and posterior forecast uncertainty estimates. This is because the notional learning about parameters potentially effects both the prior and posterior forecast uncertainty estimates. If False, only posterior results are returned. Default is False

False

Returns:

Type Description

pandas.DataFrame: a dataframe that summarizes the parameter contribution

dataworth analysis. The dataframe has index (row labels) of the keys in parlist_dict

and a column labels of forecast names. The values in the dataframe

are the posterior variance of the forecast conditional on perfect

knowledge of the parameters in the values of parlist_dict. One row in the

dataframe will be labeled base - this is the forecast uncertainty estimates

that include the effects of all adjustable parameters. Percent decreases in

forecast uncertainty can be calculated by differencing all rows against the

"base" row. Varies depending on include_prior_results.

Note

This is the primary dataworth method for assessing the contribution of one or more parameters to forecast uncertainty.

Example::

sc = pyemu.Schur(jco="my.jco")
parlist_dict = {"hk":["hk1","hk2"],"rech"["rech1","rech2"]}
df = sc.get_par_contribution(parlist_dict=parlist_dict)

get_par_css_dataframe()

get a dataframe of composite scaled sensitivities. Includes both PEST-style and Hill-style.

Returns:

Type Description

pandas.DataFrame: a dataframe of parameter names, PEST-style and

Hill-style composite scaled sensitivity

get_par_group_contribution(include_prior_results=False)

A dataworth method to get the forecast uncertainty contribution from each parameter group

Parameters:

Name Type Description Default
include_prior_results `bool`

flag to return a multi-indexed dataframe with both conditional prior and posterior forecast uncertainty estimates. This is because the notional learning about parameters potentially effects both the prior and posterior forecast uncertainty estimates. If False, only posterior results are returned. Default is False

False

Returns:

`pandas.DataFrame`: a dataframe that summarizes the parameter contribution analysis.
The dataframe has index (row labels) that are the parameter group names
and a column labels of forecast names.  The values in the dataframe
are the posterior variance of the forecast conditional on perfect
knowledge of the adjustable parameters in each parameter group.  One
row is labelled "base" - this is the variance of the forecasts that includes
the effects of all adjustable parameters. Varies depending on `include_prior_results`.
Note

This method is just a thin wrapper around get_contribution_dataframe() - this method automatically constructs the parlist_dict argument where the keys are the group names and the values are the adjustable parameters in the groups

Example::

sc = pyemu.Schur(jco="my.jco")
df = sc.get_par_group_contribution()

get_parameter_summary()

summary of the FOSM-based parameter uncertainty (variance) estimate(s)

Returns:

Type Description

pandas.DataFrame: dataframe of prior,posterior variances and percent

uncertainty reduction of each parameter

Note

This is the primary entry point for accessing parameter uncertainty estimates

The "Prior" column in dataframe is the diagonal of LinearAnalysis.parcov "precent_reduction" column in dataframe is calculated as 100.0 * (1.0 - (posterior variance / prior variance)

Example::

sc = pyemu.Schur(jco="my.jcb",forecasts=["fore1","fore2"])
df = sc.get_parameter_summary()
df.loc[:,["prior","posterior"]].plot(kind="bar")
plt.show()
df.percent_reduction.plot(kind="bar")
plt.show()

get_removed_obs_group_importance(reset_zero_weight=None)

A dataworth method to analyze the posterior uncertainty as a result of losing existing observations, tested by observation groups

Returns:

Type Description

pandas.DataFrame: A dataframe with index of observation group names and columns

of forecast names. The values in the dataframe are the posterior

variances of the forecasts resulting from losing the information

contained in each observation group. One row in the dataframe is labeled "base" - this is

posterior forecast variance resulting from the notional calibration with the

non-zero-weighed observations in Schur.pst. Conceptually, the forecast variance should

either not change or increase as a result of losing existing observations. The magnitude

of the increase represents the worth of the existing observation(s) in each group being tested.

Example::

sc = pyemu.Schur("my.jco")
df = sc.get_removed_obs_group_importance()

get_removed_obs_importance(obslist_dict=None, reset_zero_weight=None)

A dataworth method to analyze the posterior uncertainty as a result of losing some existing observations

Parameters:

Name Type Description Default
obslist_dict `dict`

a nested dictionary-list of groups of observations that are to be treated as lost. key values become row labels in returned dataframe. If None, then every zero-weighted observation is tested sequentially. Default is None

None

Returns:

Name Type Description

pandas.DataFrame: A dataframe with index of obslist_dict.keys() and columns

of forecast names. The values in the dataframe are the posterior

variances of the forecasts resulting from losing the information

contained in obslist_dict[key value]. One row in the dataframe is labeled "base" - this is

posterior forecast variance resulting from the notional calibration with the

non-zero-weighed observations in Schur.pst. Conceptually, the forecast variance should

either not change or increase as a result of losing existing observations. The magnitude

of the increase represents the worth of the existing observation(s) being tested.

Note

All observations that may be evaluated as removed must have non-zero weight

Example::

sc = pyemu.Schur("my.jco")
df = sc.get_removed_obs_importance()

next_most_important_added_obs(forecast=None, niter=3, obslist_dict=None, base_obslist=None, reset_zero_weight=1.0)

find the most important observation(s) by sequentially evaluating the importance of the observations in obslist_dict.

Parameters:

Name Type Description Default
forecast `str`

name of the forecast to use in the ranking process. If more than one forecast has been listed, this argument is required. This is because the data worth must be ranked with respect to the variance reduction for a single forecast

None
niter `int`

number of sequential dataworth testing iterations. Default is 3

3
obslist_dict `dict`

a nested dictionary-list of groups of observations that are to be treated as gained/collected. key values become row labels in returned dataframe. If None, then every zero-weighted observation is tested sequentially. Default is None

None
base_obslist [`str`]

observation names to treat as the "existing" observations. The values of obslist_dict will be added to this list during each test. If None, then the values in each obslist_dict entry will be treated as the entire calibration dataset. That is, there are no existing observations. Default is None. Standard practice would be to pass this argument as pyemu.Schur.pst.nnz_obs_names so that existing, non-zero-weighted observations are accounted for in evaluating the worth of new yet-to-be-collected observations.

None

Returns:

Type Description

pandas.DataFrame: a dataFrame with columns of obslist_dict key for each iteration

the yields the largest variance reduction for the named forecast. Columns are forecast

variance percent reduction for each iteration (percent reduction compared to initial "base"

case with all non-zero weighted observations included in the notional calibration)

Note

The most important observations from each iteration is added to base_obslist and removed obslist_dict for the next iteration. In this way, the added observation importance values include the conditional information from the last iteration.

Example::

sc = pyemu.Schur(jco="my.jco")
df = sc.next_most_important_added_obs(forecast="fore1",base_obslist=sc.pst.nnz_obs_names)

next_most_par_contribution(niter=3, forecast=None, parlist_dict=None)

find the parameter(s) contributing most to posterior forecast by sequentially evaluating the contribution of parameters in parlist_dict.

Parameters:

Name Type Description Default
forecast `str`

name of the forecast to use in the ranking process. If more than one forecast has been listed, this argument is required. This is because the data worth must be ranked with respect to the variance reduction for a single forecast

None
niter `int`

number of sequential dataworth testing iterations. Default is 3

3
parlist_dict

dict a nested dictionary-list of groups of parameters that are to be treated as perfectly known. key values become row labels in dataframe

required
parlist_dict `dict`

a nested dictionary-list of groups of parameters that are to be treated as perfectly known (zero uncertainty). key values become row labels in returned dataframe. If None, then every adustable parameter is tested sequentially. Default is None. Conceptually, the forecast variance should either not change or decrease as a result of knowing parameter perfectly. The magnitude of the decrease represents the worth of gathering information about the parameter(s) being tested.

None
Note

The largest contributing parameters from each iteration are treated as known perfectly for the remaining iterations. In this way, the next iteration seeks the next most influential group of parameters.

Returns:

Type Description

pandas.DataFrame: a dataframe with index of iteration number and columns

of parlist_dict.keys(). The values are the results of the knowing

each parlist_dict entry expressed as posterior variance reduction

reset_obscov(arg=None)

reset the obscov attribute to None

Parameters:

Name Type Description Default
arg `str` or `pyemu.Matrix`

the value to assign to the obscov attribute. If None, the private __obscov attribute is cleared but not reset

None

reset_parcov(arg=None)

reset the parcov attribute to None

Parameters:

Name Type Description Default
arg `str` or `pyemu.Matrix`

the value to assign to the parcov attribute. If None, the private __parcov attribute is cleared but not reset

None

reset_pst(arg)

reset the LinearAnalysis.pst attribute

Parameters:

Name Type Description Default
arg `str` or `pyemu.Pst`

the value to assign to the pst attribute

required