Skip to content

pyemu

pyEMU: python modules for Environmental Model Uncertainty analyses. These modules are designed to work directly and seamlessly with PEST and PEST++ model independent interface. pyEMU can also be used to setup this interface.

Several forms of uncertainty analyses are support including FOSM-based analyses (pyemu.Schur and pyemu.ErrVar), data worth analyses and high-dimensional ensemble generation.

Cov

Bases: Matrix

Diagonal and/or dense Covariance matrices

Parameters:

Name Type Description Default
x `numpy.ndarray`

numeric values

None
names [`str`]

list of row and column names

[]
isdigonal `bool`

flag if the Matrix is diagonal

required
autoalign `bool`

flag to control the autoalignment of Matrix during linear algebra operations

True

Example::

data = pyemu.en.rng.random((10,10))
names = ["par_{0}".format(i) for i in range(10)]
mat = pyemu.Cov(x=data,names=names)
mat.to_binary("mat.jco")
Note

row_names and col_names args are supported in the constructor so support inheritance. However, users should only pass names

T property

wrapper function for Matrix.transpose() method

Returns:

Type Description

Matrix: transpose of Matrix

Note

returns a copy of self

A syntatic-sugar overload of Matrix.transpose()

Example::

jcb = pyemu.Jco.from_binary("pest.jcb")
jcbt = jcb.T

as_2d property

get a 2D numeric representation of Matrix.x. If not isdiagonal, simply return reference to Matrix.x, otherwise, constructs and returns a 2D, diagonal ndarray

Returns:

Type Description

numpy.ndarray : numpy.ndarray

Example::

# A diagonal cov
cov = pyemu.Cov.from_parameter_data
x2d = cov.as_2d # a numpy 2d array
print(cov.shape,cov.x.shape,x2d.shape)

full_s property

Get the full singular value matrix

Returns:

Type Description

Matrix: full singular value matrix. Shape is (max(Matrix.shape),max(Matrix.shape))

with zeros along the diagonal from min(Matrix.shape) to max(Matrix.shape)

Example::

jco = pyemu.Jco.from_binary("pest.jcb")
jco.full_s.to_ascii("full_sing_val_matrix.mat")

identity property

get an identity Cov of the same shape

Returns:

Type Description

Cov: new Cov instance with identity matrix

Note

the returned identity matrix has the same row-col names as self

inv property

inversion operation of Matrix

Returns:

Type Description

Matrix: inverse of Matrix

Note

uses numpy.linalg.inv for the inversion

Example::

mat = pyemu.Matrix.from_binary("my.jco")
mat_inv = mat.inv
mat_inv.to_binary("my_inv.jco")

names property

wrapper for getting row_names. row_names == col_names for Cov

Returns:

Type Description

[str]: list of names

ncol property

length of second dimension

Returns:

Type Description

int: number of columns

newx property

return a copy of Matrix.x attribute

Returns:

Type Description

numpy.ndarray: a copy Matrix.x

nrow property

length of first dimension

Returns:

Type Description

int: number of rows

s property

the singular value (diagonal) Matrix

Returns:

Type Description

Matrix: singular value matrix. shape is (min(Matrix.shape),min(Matrix.shape))

Example::

# plot the singular spectrum of the jacobian
import matplotlib.pyplot as plt
jco = pyemu.Jco.from_binary("pest.jcb")
plt.plot(jco.s.x)
plt.show()

shape property

get the implied, 2D shape of Matrix

Returns:

Type Description

int: length of 2 tuple

Example::

jco = pyemu.Jco.from_binary("pest.jcb")
shape = jco.shape
nrow,ncol = shape #unpack to ints

sqrt property

element-wise square root operation

Returns:

Type Description

Matrix: element-wise square root of Matrix

Note

uses numpy.sqrt

Example::

cov = pyemu.Cov.from_uncfile("my.unc")
sqcov = cov.sqrt
sqcov.to_uncfile("sqrt_cov.unc")

transpose property

transpose operation of self

Returns:

Type Description

Matrix: transpose of Matrix

Example::

jcb = pyemu.Jco.from_binary("pest.jcb")
jcbt = jcb.T

u property

the left singular vector Matrix

Returns:

Type Description

Matrix: left singular vectors. Shape is (Matrix.shape[0], Matrix.shape[0])

Example::

jco = pyemu.Jco.from_binary("pest.jcb")
jco.u.to_binary("u.jcb")

v property

the right singular vector Matrix

Returns:

Type Description

Matrix: right singular vectors. Shape is (Matrix.shape[1], Matrix.shape[1])

Example::

jco = pyemu.Jco.from_binary("pest.jcb")
jco.v.to_binary("v.jcb")

x property

return a reference to Matrix.x

Returns:

Type Description

numpy.ndarray: reference to Matrix.x

zero property

get a diagonal instance of Cov with all zeros on the diagonal

Returns:

Type Description

Cov: new Cov instance with zeros

zero2d property

get an 2D instance of self with all zeros

Returns:

Type Description

Matrix: Matrix of zeros

Example::

jco = pyemu.Jco.from_binary("pest.jcb")
zero_jco = jco.zero2d

__add__(other)

Overload of numpy.ndarray.add() - elementwise addition. Tries to speedup by checking for scalars of diagonal matrices on either side of operator

Parameters:

Name Type Description Default
other

(int,float,numpy.ndarray,Matrix): the thing to add

required

Returns:

Type Description

Matrix: the result of addition

Note

if Matrix and other (if applicable) have autoalign set to True, both Matrix and other are aligned based on row and column names. If names are not common between the two, this may result in a smaller returned Matrix. If no names are shared, an exception is raised. The addition of a scalar does not expand non-zero elements.

Example::

m1 = pyemu.Cov.from_parameter_data(pst)
m1_plus_on1 = m1 + 1.0 #add 1.0 to all non-zero elements
m2 = m1.copy()
m2_plus_m1 = m1 + m2

__getitem__(item)

a very crude overload of object.getitem().

Parameters:

Name Type Description Default
item `object`

something that can be used as an index

required

Returns:

Type Description

Matrix: an object that is a sub-matrix of Matrix

__mul__(other)

Dot product multiplication overload. Tries to speedup by checking for scalars or diagonal matrices on either side of operator

Parameters:

Name Type Description Default
other

(int,float,numpy.ndarray,Matrix): the thing to dot product

required

Returns:

Type Description

Matrix: the result of dot product

Note

if Matrix and other (if applicable) have autoalign set to True, both Matrix and other are aligned based on row and column names. If names are not common between the two, this may result in a smaller returned Matrix. If not common elements are found, an exception is raised

Example::

jco = pyemu.Jco.from_binary("pest.jcb")
cov = pyemu.Cov.from_parmaeter_data(pst)
# get the forecast prior covariance matrix
forecast_cov = jco.get(pst.forecast_names).T * cov * jco.get(pst.forecast_names)

__pow__(power)

overload of numpy.ndarray.pow() operator

Parameters:

Name Type Description Default
power `float`

interpreted as follows: -1 = inverse of self, -0.5 = sqrt of inverse of self, 0.5 = sqrt of self. All other positive ints = elementwise self raised to power

required

Returns:

Type Description

Matrix: a new Matrix object raised to the power power

Example::

cov = pyemu.Cov.from_uncfile("my.unc")
sqcov = cov**2

__rmul__(other)

Reverse order Dot product multiplication overload.

Parameters:

Name Type Description Default
other

(int,float,numpy.ndarray,Matrix): the thing to dot product

required

Returns:

Type Description

Matrix: the result of dot product

Note

if Matrix and other (if applicable) have autoalign set to True, both Matrix and other are aligned based on row and column names. If names are not common between the two, this may result in a smaller returned Matrix. If not common elements are found, an exception is raised

Example::

# multiply by a scalar
jco = pyemu.Jco.from_binary("pest.jcb")
jco_times_10 = 10 * jco

__set_svd()

private method to set SVD components.

Note: this should not be called directly

__str__()

overload of object.str()

Returns:

Type Description

str: string representation

__sub__(other)

numpy.ndarray.sub() overload. Tries to speedup by checking for scalars of diagonal matrices on either side of operator

Parameters:

Name Type Description Default
other

(int,float,numpy.ndarray,Matrix): the thing to subtract

required

Returns:

Type Description

Matrix: the result of subtraction

Note

if Matrix and other (if applicable) have autoalign set to True, both Matrix and other are aligned based on row and column names. If names are not common between the two, this may result in a smaller returned Matrix. If no names are shared, an exception is raised

Example::

jco1 = pyemu.Jco.from_binary("pest.1.jcb")
jco2 = pyemu.Jco.from_binary("pest.2.jcb")
diff = jco1 - jco2

align(names, axis=None)

reorder Matrix by names in place. If axis is None, reorder both indices

Parameters:

Name Type Description Default
names [str]

names in Matrix.row_names and or Matrix.col_names

required
axis `int`

the axis to reorder. if None, reorder both axes

None
Note

Works in place. Is called programmatically during linear algebra operations

Example::

# load a jco that has more obs (rows) than non-zero weighted obs
# in the control file
jco = pyemu.Jco.from_binary("pest.jcb")
# get an obs noise cov matrix
obscov = pyemu.Cov.from_observation_data(pst)
jco.align(obscov.row_names,axis=0)

condition_on(conditioning_elements)

get a new Covariance object that is conditional on knowing some elements. uses Schur's complement for conditional Covariance propagation

Parameters:

Name Type Description Default
conditioning_elements [str]

list of names of elements to condition on

required

Returns:

Type Description

Cov: new conditional Cov that assumes conditioning_elements have become known

Example::

prior_cov = pyemu.Cov.from_parameter_data(pst)
now_known_pars = pst.adj_par_names[:5]
post_cov = prior_cov.condition_on(now_known_pars)

copy()

get a copy of Matrix

Returns:

Type Description

Matrix: copy of this Matrix

df()

wrapper of Matrix.to_dataframe()

drop(names, axis)

drop elements from Matrix in place

Parameters:

Name Type Description Default
names [str]

list of names to drop

required
axis `int`

the axis to drop from. must be in [0,1]

required

element_isaligned(other)

check if matrices are aligned for element-wise operations

Parameters:

Name Type Description Default
other `Matrix`

the other matrix to check for alignment with

required

Returns:

Type Description

bool: True if aligned, False if not aligned

extend(other)

extend Matrix with the elements of other, returning a new matrix.

Args: other (Matrix): the Matrix to extend self by

Returns:

Type Description

Matrix: new, extended Matrix

Note

No row or column names can be shared between self and other

Example::

jco1 = pyemu.Jco.from_binary("pest_history.jco")
jco2 = pyemu.Jco.from_binary("pest_forecast.jco")

jco_ext = jco1.extend(jco2)

extract(row_names=None, col_names=None)

wrapper method that Matrix.gets() then Matrix.drops() elements. one of row_names or col_names must be not None.

Parameters:

Name Type Description Default
row_names [str]

row_names to extract. If None, all row_names are retained.

None
col_names [str]

col_names to extract. If None, all col_names are retained.

None

Returns:

Type Description

Matrix: the extract sub-matrix defined by row_names and/or col_names

Example::

cov = pyemu.Cov.from_parameter_data(pst)
hk_cov = cov.extract(row_names=["hk1","hk2","hk3"])

find_rowcol_indices(names, row_names, col_names, axis=None) staticmethod

fast(er) look of row and column names indices

Parameters:

Name Type Description Default
names [`str`]

list of names to look for in row_names and/or col_names names

required
row_names [`str`]

list of row names

required
col_names [`str`]

list of column names

required
axis `int`

axis to search along. If None, search both. Default is None

None

Returns:

Type Description

numpy.ndarray: array of (integer) index locations. If axis is

None, a 2 numpy.ndarrays of both row and column name indices is returned

from_ascii(filename) classmethod

load a PEST-compatible ASCII matrix/vector file into a Matrix instance

Parameters:

Name Type Description Default
filename `str`

name of the file to read

required

Returns:

Type Description

Matrix: Matrix loaded from ASCII file

Example::

mat = pyemu.Matrix.from_ascii("my.mat")
cov = pyemi.Cov.from_ascii("my.cov")

from_binary(filename, forgive=False) classmethod

class method load from PEST-compatible binary file into a Matrix instance

Parameters:

Name Type Description Default
filename `str`

filename to read

required
forgive `bool`

flag to forgive incomplete data records. Only applicable to dense binary format. Default is False

False

Returns:

Type Description

Matrix: Matrix loaded from binary file

Example::

mat = pyemu.Matrix.from_binary("my.jco")
cov = pyemu.Cov.from_binary("large_cov.jcb")

from_dataframe(df) classmethod

class method to create a new Matrix instance from a pandas.DataFrame

Parameters:

Name Type Description Default
df `pandas.DataFrame`

dataframe

required

Returns:

Type Description

Matrix: Matrix instance derived from df.

Example::

df = pd.read_csv("my.csv")
mat = pyemu.Matrix.from_dataframe(df)

from_fortranfile(filename) staticmethod

a binary load method to accommodate one of the many bizarre fortran binary writing formats

Parameters:

Name Type Description Default
filename `str`

name of the binary matrix file

required

Returns:

Type Description

tuple containing

  • numpy.ndarray: the numeric values in the file
  • ['str']: list of row names
  • [str]: list of col_names

from_names(row_names, col_names, isdiagonal=False, autoalign=True, random=False) classmethod

class method to create a new Matrix instance from row names and column names, filled with trash

Parameters:

Name Type Description Default
row_names [str]

row names for the new Matrix

required
col_names [str]

col_names for the new matrix

required
isdiagonal `bool`

flag for diagonal matrix. Default is False

False
autoalign `bool`

flag for autoaligning new matrix during linear algebra calcs. Default is True

True
random `bool`

flag for contents of the trash matrix. If True, fill with random numbers, if False, fill with zeros Default is False

False

Returns:

Type Description

Matrix: the new Matrix instance

Example::

row_names = ["row_1","row_2"]
col_names = ["col_1,"col_2"]
m = pyemu.Matrix.from_names(row_names,col_names)

from_observation_data(pst) classmethod

instantiates a Cov from pyemu.Pst.observation_data

Parameters:

Name Type Description Default
pst `pyemu.Pst`

control file instance

required

Returns:

Type Description

Cov: a diagonal observation noise covariance matrix derived from the

weights in the pest control file. Zero-weighted observations

are included with a weight of 1.0e-30

Example::

obscov = pyemu.Cov.from_observation_data(pst)

from_obsweights(pst_file) classmethod

instantiates a Cov instance from observation weights in a PEST control file.

Parameters:

Name Type Description Default
pst_file `str`

pest control file name

required

Returns:

Type Description

Cov: a diagonal observation noise covariance matrix derived from the

weights in the pest control file. Zero-weighted observations

are included with a weight of 1.0e-30

Note

Calls Cov.from_observation_data()

Example::

obscov = pyemu.Cov.from_obsweights("my.pst")

from_parameter_data(pst, sigma_range=4.0, scale_offset=True, subset=None) classmethod

Instantiates a Cov from a pest control file parameter data section using parameter bounds as a proxy for uncertainty.

Parameters:

Name Type Description Default
pst_file `str`

pest control file name

required
sigma_range `float`

defines range of upper bound - lower bound in terms of standard deviation (sigma). For example, if sigma_range = 4, the bounds represent 4 * sigma. Default is 4.0, representing approximately 95% confidence of implied normal distribution

4.0
scale_offset `bool`

flag to apply scale and offset to parameter upper and lower bounds before calculating variance. In some cases, not applying scale and offset can result in undefined (log) variance. Default is True.

True
subset `list`-like

Subset of parameters to draw

None

Returns:

Type Description

Cov: diagonal parameter Cov matrix created from parameter bounds

Note

Calls Cov.from_parameter_data()

from_parbounds(pst_file, sigma_range=4.0, scale_offset=True) classmethod

Instantiates a Cov from a pest control file parameter data section using parameter bounds as a proxy for uncertainty.

Parameters:

Name Type Description Default
pst_file `str`

pest control file name

required
sigma_range `float`

defines range of upper bound - lower bound in terms of standard deviation (sigma). For example, if sigma_range = 4, the bounds represent 4 * sigma. Default is 4.0, representing approximately 95% confidence of implied normal distribution

4.0
scale_offset `bool`

flag to apply scale and offset to parameter upper and lower bounds before calculating variance. In some cases, not applying scale and offset can result in undefined (log) variance. Default is True.

True

Returns:

Type Description

Cov: diagonal parameter Cov matrix created from parameter bounds

Note

Calls Cov.from_parameter_data()

from_uncfile(filename, pst=None) classmethod

instaniates a Cov from a PEST-compatible uncertainty file

Parameters:

Name Type Description Default
filename `str`

uncertainty file name

required
pst 'pyemu.Pst`

a control file instance. this is needed if "first_parameter" and "last_parameter" keywords. Default is None

None

Returns:

Type Description

Cov: Cov instance from uncertainty file

Example::

cov = pyemu.Cov.from_uncfile("my.unc")

get(row_names=None, col_names=None, drop=False)

get a new Matrix instance ordered on row_names or col_names

Parameters:

Name Type Description Default
row_names [str]

row_names for new Matrix. If None, all row_names are used.

None
col_names [str]

col_names for new Matrix. If None, all col_names are used.

None
drop `bool`

flag to remove row_names and/or col_names from this Matrix

False

Returns:

Type Description

Matrix: a new Matrix

Example:: # load a jco that has more obs (rows) than non-zero weighted obs # in the control file jco = pyemu.Jco.from_binary("pest.jcb") # get an obs noise cov matrix obscov = pyemu.Cov.from_observation_data(pst) nnz_jco = jco.get(row_names = obscov.row_names)

get_dense_binary_info(filename) staticmethod

read the header and row and offsets for a dense binary file.

Parameters
filename (`str`): dense binary filename

Returns:

Type Description

tuple containing

  • ['str']: list of row names
  • ['int']: list of row offsets
  • [str]: list of col names
  • bool: flag indicating successful reading of all records found

get_diagonal_vector(col_name='diag')

Get a new Matrix instance that is the diagonal of self. The shape of the new matrix is (self.shape[0],1). Self must be square

Parameters:

Name Type Description Default
col_name `str`

the name of the single column in the new Matrix

'diag'

Returns:

Type Description

Matrix: vector-shaped Matrix instance of the diagonal of this Matrix

Example::

cov = pyemu.Cov.from_unc_file("my.unc")
cov_diag = cov.get_diagonal_vector()
print(cov_diag.col_names)

get_maxsing(eigthresh=1e-05)

Get the number of singular components with a singular value ratio greater than or equal to eigthresh

Args: eigthresh (float): the ratio of smallest to largest singular value to retain. Since it is assumed that s is sorted from largest to smallest, once a singular value is reached that yields a ratio with the first (largest) singular value, the index of this singular is returned.

Returns:

Type Description

int: the index of the singular value who's ratio with the

first singular value is less than or equal to eigthresh

Note

this method calls the static method Matrix.get_maxsing_from_s() with Matrix.s.x

Example::

jco = pyemu.Jco.from_binary("pest.jco")
max_sing = jco.get_maxsing(eigthresh=pst.svd_data.eigthresh)

get_maxsing_from_s(s, eigthresh=1e-05) staticmethod

static method to work out the maxsing for a given singular spectrum

Parameters:

Name Type Description Default
s `numpy.ndarray`

1-D array of singular values. This array should come from calling either numpy.linalg.svd or from the pyemu.Matrix.s.x attribute

required
eigthresh `float`

the ratio of smallest to largest singular value to retain. Since it is assumed that s is sorted from largest to smallest, once a singular value is reached that yields a ratio with the first (largest) singular value, the index of this singular is returned.

1e-05

Returns:

Type Description

int: the index of the singular value who's ratio with the

first singular value is less than or equal to eigthresh

Example::

jco = pyemu.Jco.from_binary("pest.jco")
max_sing = pyemu.Matrix.get_maxsing_from_s(jco.s,eigthresh=pst.svd_data.eigthresh)

hadamard_product(other)

Overload of numpy.ndarray.mult(): element-wise multiplication. Tries to speedup by checking for scalars of diagonal matrices on either side of operator

Parameters:

Name Type Description Default
other

(int,float,numpy.ndarray,Matrix): the thing to multiply

required

Returns:

Type Description

Matrix: the result of multiplication

Note

if Matrix and other (if applicable) have autoalign set to True, both Matrix and other are aligned based on row and column names. If names are not common between the two, this may result in a smaller returned Matrix. If not common elements are shared, an exception is raised

Example::

cov = pyemu.Cov.from_parameter_data(pst)
cov2 = cov * 10
cov3 = cov * cov2

identity_like(other) classmethod

Get an identity matrix Cov instance like other Cov

Parameters:

Name Type Description Default
other `Matrix`

other matrix - must be square

required

Returns:

Type Description

Cov: new identity matrix Cov with shape of other

Note

the returned identity cov matrix is treated as non-diagonal

indices(names, axis=None)

get the row and col indices of names. If axis is None, two ndarrays are returned, corresponding the indices of names for each axis

Parameters:

Name Type Description Default
names [`str`]

list of names to look for in row_names and/or col_names names

required
row_names [`str`]

list of row names

required
col_names [`str`]

list of column names

required
axis `int`

axis to search along. If None, search both. Default is None

None

Returns:

Type Description

numpy.ndarray: array of (integer) index locations. If axis is

None, a 2 numpy.ndarrays of both row and column name indices is returned

Note

thin wrapper around Matrix.find_rowcol_indices static method

mult_isaligned(other)

check if matrices are aligned for dot product multiplication

Parameters:

Name Type Description Default
other `Matrix`

the other matrix to check for alignment with

required

Returns:

Type Description

bool: True if aligned, False if not aligned

pseudo_inv(maxsing=None, eigthresh=1e-05)

The pseudo inverse of self. Formed using truncated singular value decomposition and Matrix.pseudo_inv_components

Parameters:

Name Type Description Default
maxsing `int`

the number of singular components to use. If None, maxsing is calculated using Matrix.get_maxsing() and eigthresh

None
`eigthresh`

(float, optional): the ratio of largest to smallest singular components to use for truncation. Ignored if maxsing is not None. Default is 1.0e-5

required

Returns:

Type Description

Matrix: the truncated-SVD pseudo inverse of Matrix (V_1 * s_1^-1 * U^T)

Example::

jco = pyemu.Jco.from_binary("pest.jcb")
jco_psinv = jco.pseudo_inv(pst.svd_data.maxsing,pst.svd_data.eigthresh)
jco_psinv.to_binary("pseudo_inv.jcb")

pseudo_inv_components(maxsing=None, eigthresh=1e-05, truncate=True)

Get the (optionally) truncated SVD components

Parameters:

Name Type Description Default
maxsing `int`

the number of singular components to use. If None, maxsing is calculated using Matrix.get_maxsing() and eigthresh

None
`eigthresh`

(float, optional): the ratio of largest to smallest singular components to use for truncation. Ignored if maxsing is not None. Default is 1.0e-5

required
truncate `bool`

flag to truncate components. If False, U, s, and V will be zeroed out at locations greater than maxsing instead of truncated. Default is True

True

Returns:

Type Description

tuple containing

  • Matrix: (optionally truncated) left singular vectors
  • Matrix: (optionally truncated) singular value matrix
  • Matrix: (optionally truncated) right singular vectors

Example::

mat = pyemu.Matrix.from_binary("my.jco")
u1,s1,v1 = mat.pseudo_inv_components(maxsing=10)
resolution_matrix = v1 * v1.T
resolution_matrix.to_ascii("resol.mat")

read_ascii(filename) staticmethod

read a PEST-compatible ASCII matrix/vector file

Parameters:

Name Type Description Default
filename `str`

file to read from

required

Returns:

Type Description

tuple containing

  • numpy.ndarray: numeric values
  • ['str']: list of row names
  • [str]: list of column names
  • bool: diagonal flag

read_binary(filename, forgive=False) staticmethod

static method to read PEST-format binary files

Parameters:

Name Type Description Default
filename `str`

filename to read

required
forgive `bool`

flag to forgive incomplete data records. Only applicable to dense binary format. Default is False

False

Returns:

Type Description

tuple containing

  • numpy.ndarray: the numeric values in the file
  • ['str']: list of row names
  • [str]: list of col_names

read_binary_header(filename) staticmethod

read the first elements of a PEST(++)-style binary file to get format and dimensioning information.

Parameters:

Name Type Description Default
filename `str`

the filename of the binary file

required

Returns:

Type Description

tuple containing

  • int: the itemp1 value
  • int: the itemp2 value
  • int: the icount value

read_dense(filename, forgive=False, close=True, only_rows=None) staticmethod

read a dense-format binary file.

Parameters:

Name Type Description Default
filename `str`

the filename or the open filehandle

required
forgive `bool`

flag to forgive incomplete records. If True and an incomplete record is encountered, only the previously read records are returned. If False, an exception is raised for an incomplete record

False
close `bool`

flag to close the filehandle. Default is True

True
only_rows `iterable`

rows to read. If None, all rows are read

None

Returns:

Type Description

tuple containing

  • numpy.ndarray: the numeric values in the file
  • ['str']: list of row names
  • [str]: list of col_names

replace(other)

replace elements in the covariance matrix with elements from other. if other is not diagonal, then this Cov becomes non diagonal

Parameters:

Name Type Description Default
`Cov`

the Cov to replace elements in this Cov with

required
Note

operates in place. Other must have the same row-col names as self

reset_x(x, copy=True)

reset self.__x private attribute

Parameters:

Name Type Description Default
x `numpy.ndarray`

the new numeric data

required
copy `bool`

flag to make a copy of 'x'. Default is True

True

Returns:

Type Description

None

Note

operates in place

to_2d()

get a 2D Matrix representation of Matrix. If not Matrix.isdiagonal, simply return a copy of Matrix, otherwise, constructs and returns a new Matrix instance that is stored as diagonal

Returns:

Type Description

Martrix: non-diagonal form of Matrix

Example::

# A diagonal cov
cov = pyemu.Cov.from_parameter_data
cov2d = cov.as_2d # a numpy 2d array
print(cov.shape,cov.x.shape,cov2d.shape,cov2d.x.shape)

to_ascii(filename, icode=2)

write a PEST-compatible ASCII Matrix/vector file

Parameters:

Name Type Description Default
filename `str`

filename to write to

required
icode `int`

PEST-style info code for matrix style. Default is 2.

2

Note: if icode == -1, a 1-d vector is written that represents a diagonal matrix. An exception is raised if icode == -1 and isdiagonal is False

to_binary(filename, droptol=None, chunk=None)

write a PEST-compatible binary file. The format is the same as the format used to storage a PEST Jacobian matrix

Parameters:

Name Type Description Default
filename `str`

filename to save binary file

required
droptol `float`

absolute value tolerance to make values smaller droptol than zero. Default is None (no dropping)

None
chunk `int`

number of elements to write in a single pass. Default is None, which writes the entire numeric part of the Matrix at once. This is faster but requires more memory.

None

to_coo(filename, droptol=None, chunk=None)

write an extended PEST-format binary file. The data format is [int,int,float] for i,j,value. It is autodetected during the read with Matrix.from_binary().

Parameters:

Name Type Description Default
filename `str` or `Path`

filename to save binary file

required
droptol `float`

absolute value tolerance to make values smaller droptol than zero. Default is None (no dropping)

None
chunk `int`

number of elements to write in a single pass. Default is None, which writes the entire numeric part of the Matrix at once. This is faster but requires more memory.

None
Note

This method is needed when the number of dimensions times 2 is larger than the max value for a 32-bit integer. happens! This method is used by pyemu.Ensemble.to_binary()

to_dataframe()

return a pandas.DataFrame representation of Matrix

Returns:

Type Description

pandas.DataFrame: a dataframe derived from Matrix

Note

if self.isdiagonal is True, the full matrix is used to fill the dataframe - lots of zeros.

to_dense(filename, close=True)

experimental new dense matrix storage format to support faster I/O with ensembles

Parameters:

Name Type Description Default
filename `str`

the filename to save to

required
close `bool`

flag to close the filehandle after saving

True

Returns:

Name Type Description
f `file`

the file handle. Only returned if close is False

Note

calls Matrix.write_dense()

to_pearson()

Convert Cov instance to Pearson correlation coefficient matrix

Returns:

Type Description

Matrix: A Matrix of correlation coefs. Return type is Matrix

on purpose so that it is clear the returned instance is not a Cov

Example::

# plot the posterior parameter correlation matrix
import matplotlib.pyplot as plt
cov = pyemu.Cov.from_ascii("pest.post.cov")
cc = cov.to_pearson()
cc.x[cc.x==1.0] = np.nan
plt.imshow(cc)

to_uncfile(unc_file, covmat_file='cov.mat', var_mult=1.0, include_path=False)

write a PEST-compatible uncertainty file

Parameters:

Name Type Description Default
unc_file `str`

filename of the uncertainty file

required
covmat_file `str`

covariance matrix filename. Default is "Cov.mat". If None, and Cov.isdiaonal, then a standard deviation form of the uncertainty file is written. Exception raised if covmat_file is None and not Cov.isdiagonal

'cov.mat'
var_mult `float`

variance multiplier for the covmat_file entry

1.0
include_path `bool`

flag to include the path of unc_file in the name of covmat_file. Default is False - not sure why you would ever make this True...

False

Example::

cov = pyemu.Cov.from_parameter_data(pst)
cov.to_uncfile("my.unc")

write_dense(filename, row_names, col_names, data, close=False) staticmethod

experimental new dense matrix storage format to support faster I/O with ensembles

Parameters:

Name Type Description Default
filename `str` or `file`

the file to write to

required
row_names [`str`]

row names of the matrix

required
col_names [`str`]

col names of the matrix

required
data `np.ndarray`

matrix elements

required
close `bool`

flag to close the file after writing

False

Returns:

Name Type Description
f `file`

the file handle. Only returned if close is False

Ensemble

Bases: object

based class for handling ensembles of numeric values

Parameters:

Name Type Description Default
pst `pyemu.Pst`

a control file instance

required
df `pandas.DataFrame`

a pandas dataframe. Columns should be parameter/observation names. Index is treated as realization names

required
istransformed `bool`

flag to indicate parameter values are in log space. Not used for ObservationEnsemble

False

Example::

pst = pyemu.Pst("my.pst")
pe = pyemu.ParameterEnsemble.from_gaussian_draw(pst)

istransformed property

the parameter transformation status

Returns:

Type Description

bool: flag to indicate whether or not the ParameterEnsemble is

transformed with respect to log_{10}. Not used for (and has no effect

on) ObservationEnsemble.

Note

parameter transformation status is only related to log_{10} and does not include the effects of scale and/or offset

pst = pst instance-attribute

pyemu.Pst: control file instance

as_pyemu_matrix(typ=None)

get a pyemu.Matrix instance of Ensemble

Parameters:

Name Type Description Default
typ `pyemu.Matrix` or `pyemu.Cov`

the type of matrix to return. Default is pyemu.Matrix

None

Returns:

Type Description

pyemu.Matrix: a matrix instance

Example::

oe = pyemu.ObservationEnsemble.from_gaussian_draw(pst=pst,num_reals=100)
dev_mat = oe.get_deviations().as_pyemu_matrix(typ=pyemu.Cov)
obscov = dev_mat.T * dev_mat

back_transform()

back transform parameters with respect to partrans value.

Note

operates in place (None is returned).

Parameter transform is only related to log_{10} and does not include the effects of scale and/or offset

Ensemble.back_transform() is only provided for inheritance purposes. It only changes theEnsemble._transformed` flag

copy()

get a copy of Ensemble

Returns:

Type Description

Ensemble: copy of this Ensemble

Note

copies both Ensemble.pst and Ensemble._df: can be expensive

covariance_matrix(localizer=None, center_on=None)

get a empirical covariance matrix implied by the correlations between realizations

Parameters:

Name Type Description Default
localizer `pyemu.Matrix`

a matrix to localize covariates in the resulting covariance matrix. Default is None

None
center_on `str`

a realization name to use as the centering point in ensemble space. If None, the mean vector is treated as the centering point. Default is None

None

Returns:

Type Description

pyemu.Cov: the empirical (and optionally localized) covariance matrix

dropna(*args, **kwargs)

override of pandas.DataFrame.dropna()

Parameters:

Name Type Description Default
*args ([`object`]

positional arguments to pass to pandas.DataFrame.dropna().

required
**kwargs ({`str`

object}): keyword arguments to pass to pandas.DataFrame.dropna().

required

from_binary(pst, filename) classmethod

create an Ensemble from a PEST-style binary file

Parameters:

Name Type Description Default
pst `pyemu.Pst`

a control file instance

required
filename `str`

filename containing binary ensemble

required

Returns:

Type Description

Ensemble: the ensemble loaded from the binary file

Example::

pst = pyemu.Pst("my.pst")
oe = pyemu.ObservationEnsemble.from_binary("obs.jcb")

from_csv(pst, filename, *args, **kwargs) classmethod

create an Ensemble from a CSV file

Parameters:

Name Type Description Default
pst `pyemu.Pst`

a control file instance

required
filename `str`

filename containing CSV ensemble

required
*args ([`object`]

positional arguments to pass to pandas.read_csv().

required
**kwargs ({`str`

object}): keyword arguments to pass to pandas.read_csv().

required

Returns: Ensemble Note: uses pandas.read_csv() to load numeric values from CSV file

Example::

pst = pyemu.Pst("my.pst")
oe = pyemu.ObservationEnsemble.from_csv("obs.csv")

get_deviations(center_on=None)

get the deviations of the realizations around a certain point in ensemble space

Parameters:

Name Type Description Default
center_on `str`

a realization name to use as the centering point in ensemble space. If None, the mean vector is treated as the centering point. Default is None

None

Returns:

Type Description

Ensemble: an ensemble of deviations around the centering point

Note

deviations are the Euclidean distances from the center_on value to realized values for each column

center_on=None yields the classic ensemble smoother/ensemble Kalman filter deviations from the mean vector

Deviations respect log-transformation status.

Example::

pst = pyemu.Pst("my.pst")
oe = pyemu.ObservationEnsemble.from_gaussian_draw(pst)
oe.add_base()
oe_dev = oe.get_deviations(center_on="base")
oe.to_csv("obs_base_devs.csv")

plot(bins=10, facecolor='0.5', plot_cols=None, filename='ensemble.pdf', func_dict=None, **kwargs)

plot ensemble histograms to multipage pdf

Parameters:

Name Type Description Default
bins `int`

number of bins for the histograms

10
facecolor `str`

matplotlib color (e.g. r,g, etc)

'0.5'
plot_cols [`str`]

list of subset of ensemble columns to plot. If None, all are plotted. Default is None

None
filename `str`

multipage pdf filename. Default is "ensemble.pdf"

'ensemble.pdf'
func_dict `dict`

a dict of functions to apply to specific columns. For example: {"par1": np.log10}

None
**kwargs `dict`

addkeyword args to pass to pyemu.plot_utils.ensemble_helper()

{}

Example::

pst = pyemu.Pst("my.pst")
pe = pyemu.ParameterEnsemble.from_gaussian_draw(pst)
pe.transform() # plot log space (if needed)
pe.plot(bins=30)

reseed() staticmethod

reset the pyemu.en.rng local random generator

Note

reseeds using the pyemu.en.SEED global variable

The pyemu.en.SEED value is used to initialize the rng on import, so make sure you know what you are doing if you call this method...

to_binary(filename)

write Ensemble to a PEST-style binary file

Parameters:

Name Type Description Default
filename `str` or `Path`

file to write

required

Returns:

Type Description

str: the filename written (may be modified from input)

Example::

pst = pyemu.Pst("my.pst")
oe = pyemu.ObservationEnsemble.from_gaussian_draw(pst)
oe.to_binary("obs.jcb")
Note

back transforms ParameterEnsemble before writing so that values are in arithmetic space

to_csv(filename, *args, **kwargs)

write Ensemble to a CSV file

Parameters:

Name Type Description Default
filename `str`

file to write

required
*args ([`object`]

positional arguments to pass to pandas.DataFrame.to_csv().

required
**kwargs ({`str`

object}): keyword arguments to pass to pandas.DataFrame.to_csv().

required

Example::

pst = pyemu.Pst("my.pst")
oe = pyemu.ObservationEnsemble.from_gaussian_draw(pst)
oe.to_csv("obs.csv")
Note

back transforms ParameterEnsemble before writing so that values are in arithmetic space

to_dense(filename)

write Ensemble to a dense-format binary file

Parameters:

Name Type Description Default
filename `str` or `Path`

file to write

required

Example::

pst = pyemu.Pst("my.pst")
oe = pyemu.ObservationEnsemble.from_gaussian_draw(pst)
oe.to_dense("obs.bin")
Note

back transforms ParameterEnsemble before writing so that values are in arithmetic space

transform()

transform parameters with respect to partrans value.

Note

operates in place (None is returned).

Parameter transform is only related to log_{10} and does not include the effects of scale and/or offset

Ensemble.transform() is only provided for inheritance purposes. It only changes theEnsemble._transformed` flag

ErrVar

Bases: LinearAnalysis

FOSM-based error variance analysis

Parameters:

Name Type Description Default
jco varies

something that can be cast or loaded into a pyemu.Jco. Can be a str for a filename or pyemu.Matrix/pyemu.Jco object.

required
pst varies

something that can be cast into a pyemu.Pst. Can be an str for a filename or an existing pyemu.Pst. If None, a pst filename is sought with the same base name as the jco argument (if passed)

required
parcov varies

prior parameter covariance matrix. If str, a filename is assumed and the prior parameter covariance matrix is loaded from a file using the file extension (".jcb"/".jco" for binary, ".cov"/".mat" for PEST-style ASCII matrix, or ".unc" for uncertainty files). If None, the prior parameter covariance matrix is constructed from the parameter bounds in LinearAnalysis.pst. Can also be a pyemu.Cov instance

required
obscov varies

observation noise covariance matrix. If str, a filename is assumed and the noise covariance matrix is loaded from a file using the file extension (".jcb"/".jco" for binary, ".cov"/".mat" for PEST-style ASCII matrix, or ".unc" for uncertainty files). If None, the noise covariance matrix is constructed from the observation weights in LinearAnalysis.pst. Can also be a pyemu.Cov instance

required
forecasts varies

forecast sensitivity vectors. If str, first an observation name is assumed (a row in LinearAnalysis.jco). If that is not found, a filename is assumed and predictions are loaded from a file using the file extension. If [str], a list of observation names is assumed. Can also be a pyemu.Matrix instance, a numpy.ndarray or a collection. Note if the PEST++ option "++forecasts()" is set in the pest control file (under the pyemu.Pst.pestpp_options dictionary), then there is no need to pass this argument (unless you want to analyze different forecasts) of pyemu.Matrix or numpy.ndarray.

required
ref_var float

reference variance. Default is 1.0

required
verbose `bool`

controls screen output. If str, a filename is assumed and and log file is written.

required
sigma_range `float`

defines range of upper bound - lower bound in terms of standard deviation (sigma). For example, if sigma_range = 4, the bounds represent 4 * sigma. Default is 4.0, representing approximately 95% confidence of implied normal distribution. This arg is only used if constructing parcov from parameter bounds.

required
scale_offset `bool`

flag to apply parameter scale and offset to parameter bounds when calculating prior parameter covariance matrix from bounds. This arg is onlyused if constructing parcov from parameter bounds.Default is True.

required
omitted_parameters [`str`]

list of parameters to treat as "omitted". Passing this argument activates 3-term error variance analysis.

required
omitted_parcov varies

an argument that can be cast to a parcov for the omitted parameters. If None, omitted_parcov will be formed by extracting a sub-matrix from the LinearAnalsis.parcov attribute.

required
omitted_predictions varies

an argument that can be cast to a "predictions" (e.g. "forecasts") attribute to form prediction sensitivity vectors with respec to the omitted parameters. If None, these vectors will be extracted from the pyemu.LinearAnalysis.predictions attribute

required
kl `bool`

flag to perform Karhunen-Loeve scaling on the jacobian before error variance calculations. If True, the pyemu.ErrVar.jco and pyemu.ErrVar.parcov are altered in place. Default is False.

required

Example::

ev = pyemu.ErrVar(jco="my.jco",omitted_parameters=["wel1","wel2"])
df = ev.get_errvar_dataframe()

adj_par_names property

adjustable parameter names

Returns:

Type Description

['str`]: list of adjustable parameter names

Note

if LinearAnalysis.pst is None, returns LinearAnalysis.jco.col_names

fehalf property

Karhunen-Loeve scaling matrix attribute.

Returns:

Type Description

pyemu.Matrix: the Karhunen-Loeve scaling matrix based on the prior

parameter covariance matrix

forecast_names property

get the forecast (aka prediction) names

Returns:

Type Description
[`str`]

list of forecast names

forecasts property

the forecast sentivity vectors attribute

Returns:

Type Description

pyemu.Matrix: a matrix of forecast (prediction) sensitivity vectors (column wise)

forecasts_iter property

forecast (e.g. prediction) sensitivity vectors iterator

Returns:

Type Description

iterator: iterator on forecasts (e.g. predictions) sensitivity vectors (matrix)

Note

This is used for processing huge numbers of predictions

This is a synonym for LinearAnalysis.predictions_iter()

jco property

the jacobian matrix attribute

Returns:

Type Description

pyemu.Jco: the jacobian matrix attribute

mle_covariance property

maximum likelihood parameter covariance matrix.

Returns:

Type Description

pyemu.Matrix: maximum likelihood parameter covariance matrix

mle_parameter_estimate property

maximum likelihood parameter estimate.

Returns:

Type Description

pandas.Series: the maximum likelihood parameter estimates

nnz_obs_names property

non-zero-weighted observation names

Returns:

Type Description

['str`]: list of non-zero-weighted observation names

Note

if LinearAnalysis.pst is None, returns LinearAnalysis.jco.row_names

obscov property

get the observation noise covariance matrix attribute

Returns:

Type Description

pyemu.Cov: a reference to the LinearAnalysis.obscov attribute

omitted_jco property

the omitted-parameters jacobian matrix

Returns:

Type Description

pyemu.Jco: the jacobian matrix instance of non-zero-weighted observations and

omitted parameters

omitted_parcov property

the prior omitted-parameter covariance matrix

Returns:

Type Description

pyemu.Cov: the prior parameter covariance matrix of the

omitted parameters

omitted_predictions property

omitted prediction sensitivity vectors

Returns:

Type Description

pyemu.Matrix: a matrix of prediction sensitivity vectors (column wise) to

omitted parameters

parcov property

get the prior parameter covariance matrix attribute

Returns:

Type Description

pyemu.Cov: a reference to the LinearAnalysis.parcov attribute

predictions property

the prediction (aka forecast) sentivity vectors attribute

Returns:

Type Description

pyemu.Matrix: a matrix of prediction sensitivity vectors (column wise)

predictions_iter property

prediction sensitivity vectors iterator

Returns:

Type Description

iterator: iterator on prediction sensitivity vectors (matrix)

Note

this is used for processing huge numbers of predictions

prior_forecast property

prior forecast (e.g. prediction) variances

Returns:

Type Description

dict: a dictionary of forecast name, prior variance pairs

prior_parameter property

prior parameter covariance matrix

Returns:

Type Description

pyemu.Cov: prior parameter covariance matrix

prior_prediction property

prior prediction (e.g. forecast) variances

Returns:

Type Description

dict: a dictionary of prediction name, prior variance pairs

pst property

the pst attribute

Returns:

Type Description

pyemu.Pst: the pst attribute

qhalf property

square root of the cofactor matrix attribute. Create the attribute if it has not yet been created

Returns:

Type Description

pyemu.Matrix: square root of the cofactor matrix

qhalfx property

half normal matrix attribute.

Returns:

Type Description

pyemu.Matrix: half normal matrix attribute

xtqx property

normal matrix attribute.

Returns:

Type Description

pyemu.Matrix: normal matrix attribute

G(singular_value)

get the parameter solution Matrix at a given singular value

Parameters:

Name Type Description Default
singular_value `int`

singular value to calc G at

required

Returns:

Type Description

pyemu.Matrix: parameter solution matrix (V_1 * S_1^(_1) * U_1^T) at singular_value

I_minus_R(singular_value)

get I - R at a given singular value

Parameters:

Name Type Description Default
singular_value `int`

singular value to calculate I - R at

required

Returns:

Type Description

pyemu.Matrix: identity matrix minus resolution matrix at singular_value

R(singular_value)

get resolution Matrix (V_1 * V_1^T) at a given singular value

Args: singular_value (int): singular value to calculate R at

Returns:

Type Description

pyemu.Matrix: resolution matrix at singular_value

__fromfile(filename, astype=None)

a private method to deduce and load a filename into a matrix object. Uses extension: 'jco' or 'jcb': binary, 'mat','vec' or 'cov': ASCII, 'unc': pest uncertainty file.

__load_jco()

private method to set the jco attribute from a file or a matrix object

__load_obscov()

private method to set the obscov attribute from: a pest control file (observation weights) a pst object a matrix object an uncert file an ascii matrix file

__load_omitted_jco()

private: set the omitted jco attribute

__load_omitted_parcov()

private: set the omitted_parcov attribute

__load_omitted_predictions()

private: set the omitted_predictions attribute

__load_parcov()

private method to set the parcov attribute from: a pest control file (parameter bounds) a pst object a matrix object an uncert file an ascii matrix file

__load_predictions()

private method set the predictions attribute from

mixed list of row names, matrix files and ndarrays a single row name an ascii file

can be none if only interested in parameters.

__load_pst()

private method set the pst attribute

adjust_obscov_resfile(resfile=None)

reset the elements of obscov by scaling the implied weights based on the phi components in res_file so that the total phi is equal to the number of non-zero weights.

Parameters:

Name Type Description Default
resfile `str`

residual file to use. If None, residual file with case name is sought. default is None

None
Note

calls pyemu.Pst.adjust_weights_resfile()

apply_karhunen_loeve_scaling()

apply karhuene-loeve scaling to the jacobian matrix.

Note:

This scaling is not necessary for analyses using Schur's
complement, but can be very important for error variance
analyses.  This operation effectively transfers prior knowledge
specified in the parcov to the jacobian and reset parcov to the
identity matrix.

clean()

drop regularization and prior information observation from the jco

drop_prior_information()

drop the prior information from the jco and pst attributes

first_forecast(singular_value)

get the null space term (first term) contribution to forecast (e.g. prediction) error variance at a given singular value.

Parameters:

Name Type Description Default
singular_value `int`

singular value to calc first term at

required
Note

This method is used to construct the error variance dataframe

Just a wrapper around ErrVar.first_forecast

Returns:

Type Description

dict: dictionary of ("first",prediction_names),error variance pairs at singular_value

first_parameter(singular_value)

get the null space term (first term) contribution to parameter error variance at a given singular value

Parameters:

Name Type Description Default
singular_value `int`

singular value to calc first term at

required

Returns:

Type Description

pyemu.Cov: first term contribution to parameter error variance

first_prediction(singular_value)

get the null space term (first term) contribution to prediction error variance at a given singular value.

Parameters:

Name Type Description Default
singular_value `int`

singular value to calc first term at

required
Note

This method is used to construct the error variance dataframe

Returns:

Type Description

dict: dictionary of ("first",prediction_names),error variance pairs at singular_value

get(par_names=None, obs_names=None, astype=None)

method to get a new LinearAnalysis class using a subset of parameters and/or observations

Parameters:

Name Type Description Default
par_names [`'str`]

par names for new object

None
obs_names [`'str`]

obs names for new object

None
astype `pyemu.Schur` or `pyemu.ErrVar`

type to cast the new object. If None, return type is same as self

None

Returns:

Type Description

LinearAnalysis: new instance

get_cso_dataframe()

get a dataframe of composite observation sensitivity, as returned by PEST in the seo file.

Returns:

Type Description

pandas.DataFrame: dataframe of observation names and composite observation

sensitivity

Note

That this formulation deviates slightly from the PEST documentation in that the values are divided by (npar-1) rather than by (npar).

The equation is cso_j = ((Q^1/2JJ^T*Q^1/2)^1/2)_jj/(NPAR-1)

get_errvar_dataframe(singular_values=None)

primary entry point for error variance analysis.

Parameters:

Name Type Description Default
singular_values [`int`]

a list singular values to test. If None, defaults to range(0,min(nnz_obs,nadj_par) + 1).

None

Returns:

Type Description

pandas.DataFrame: a multi-indexed pandas dataframe summarizing each of the

error variance terms for each nominated forecast. Rows are the singular values

tested, columns are a multi-index of forecast name and error variance term number

(e.g. 1,2 or (optionally) 3).

Example::

ev = pyemu.ErrVar(jco="my.jco",omitted_parameters=["wel1","wel2"])
df = ev.get_errvar_dataframe()

get_identifiability_dataframe(singular_value=None, precondition=False)

primary entry point for identifiability analysis

Parameters:

Name Type Description Default
singular_value `int`

the singular spectrum truncation point. Defaults to minimum of non-zero-weighted observations and adjustable parameters

None
precondition `bool`

flag to use the preconditioned hessian with the prior parameter covariance matrix (xtqt + sigma_theta^-1). This should be used KL scaling. Default is False.

False

Returns:

Type Description

pandas.DataFrame: A pandas dataframe of the right solution-space singular

vectors and identifiability (identifiabiity is in the column labeled "ident")

Examples::

ev = pyemu.ErrVar(jco="my.jco")
df = ev.get_identifiability_dataframe(singular_value=20)
df.ident.plot(kind="bar")

get_null_proj(maxsing=None, eigthresh=1e-06)

get a null-space projection matrix of XTQX

Parameters:

Name Type Description Default
maxsing `int`

number of singular components to use (the truncation point). If None, pyemu.Matrx.get_maxsing() is used to determine the truncation point witheigthresh`. Default is None

None
eigthresh `float`

the ratio of smallest to largest singular value to keep in the range (solution) space of XtQX. Not used if maxsing is not None. Default is 1.0e-6

1e-06
Note

used for null-space monte carlo operations.

Returns:

Type Description

pyemu.Matrix the null-space projection matrix (V2V2^T)

get_obs_competition_dataframe()

get the observation competition stat a la PEST utility

Returns:

Type Description

pandas.DataFrame: a dataframe of observation names by

observation names with values equal to the PEST

competition statistic

get_par_css_dataframe()

get a dataframe of composite scaled sensitivities. Includes both PEST-style and Hill-style.

Returns:

Type Description

pandas.DataFrame: a dataframe of parameter names, PEST-style and

Hill-style composite scaled sensitivity

reset_obscov(arg=None)

reset the obscov attribute to None

Parameters:

Name Type Description Default
arg `str` or `pyemu.Matrix`

the value to assign to the obscov attribute. If None, the private __obscov attribute is cleared but not reset

None

reset_parcov(arg=None)

reset the parcov attribute to None

Parameters:

Name Type Description Default
arg `str` or `pyemu.Matrix`

the value to assign to the parcov attribute. If None, the private __parcov attribute is cleared but not reset

None

reset_pst(arg)

reset the LinearAnalysis.pst attribute

Parameters:

Name Type Description Default
arg `str` or `pyemu.Pst`

the value to assign to the pst attribute

required

second_forecast(singular_value)

get the solution space contribution to forecast (e.g. "prediction") error variance at a given singular value

Parameters:

Name Type Description Default
singular_value `int`

singular value to calc second term at

required
Note

This method is used to construct error variance dataframe

Just a thin wrapper around ErrVar.second_prediction

Returns:

Type Description

dict: dictionary of ("second",prediction_names), error variance

arising from the solution space contribution (y^t * G * obscov * G^T * y)

second_parameter(singular_value)

get the solution space contribution to parameter error variance at a given singular value (G * obscov * G^T)

Parameters:

Name Type Description Default
singular_value `int`

singular value to calc second term at

required

Returns:

Type Description

pyemu.Cov: the second term contribution to parameter error variance

(G * obscov * G^T)

second_prediction(singular_value)

get the solution space contribution to predictive error variance at a given singular value

Parameters:

Name Type Description Default
singular_value `int`

singular value to calc second term at

required
Note

This method is used to construct error variance dataframe

`dict`: dictionary of ("second",prediction_names), error variance

Type Description

arising from the solution space contribution (y^t * G * obscov * G^T * y)

third_forecast(singular_value)

get the omitted parameter contribution to forecast (prediction) error variance at a given singular value.

Parameters:

Name Type Description Default
singular_value `int`

singular value to calc third term at

required
Note

used to construct error variance dataframe just a thin wrapper around ErrVar.third_prediction()

Returns:

Type Description

dict: a dictionary of ("third",prediction_names),error variance

third_parameter(singular_value)

get the omitted parameter contribution to parameter error variance at a given singular value

Parameters:

Name Type Description Default
singular_value `int`

singular value to calc third term at

required

Returns:

Type Description

pyemu.Cov: the third term contribution to parameter error variance

calculated at singular_value (G * omitted_jco * Sigma_(omitted_pars) *

omitted_jco^T * G^T). Returns 0.0 if third term calculations are not

being used.

third_prediction(singular_value)

get the omitted parameter contribution to prediction error variance at a given singular value.

Parameters:

Name Type Description Default
singular_value `int`

singular value to calc third term at

required
Note

used to construct error variance dataframe

Returns:

Type Description

dict: a dictionary of ("third",prediction_names),error variance

variance_at(singular_value)

get the error variance of all three error variance terms at a given singular value

Parameters:

Name Type Description Default
singular_value `int`

singular value to test

required

Returns:

Type Description

dict: dictionary of (err var term,prediction_name), variance pairs

Jco

Bases: Matrix

a thin wrapper class to get more intuitive attribute names. Functions exactly like Matrix

T property

wrapper function for Matrix.transpose() method

Returns:

Type Description

Matrix: transpose of Matrix

Note

returns a copy of self

A syntatic-sugar overload of Matrix.transpose()

Example::

jcb = pyemu.Jco.from_binary("pest.jcb")
jcbt = jcb.T

as_2d property

get a 2D numeric representation of Matrix.x. If not isdiagonal, simply return reference to Matrix.x, otherwise, constructs and returns a 2D, diagonal ndarray

Returns:

Type Description

numpy.ndarray : numpy.ndarray

Example::

# A diagonal cov
cov = pyemu.Cov.from_parameter_data
x2d = cov.as_2d # a numpy 2d array
print(cov.shape,cov.x.shape,x2d.shape)

full_s property

Get the full singular value matrix

Returns:

Type Description

Matrix: full singular value matrix. Shape is (max(Matrix.shape),max(Matrix.shape))

with zeros along the diagonal from min(Matrix.shape) to max(Matrix.shape)

Example::

jco = pyemu.Jco.from_binary("pest.jcb")
jco.full_s.to_ascii("full_sing_val_matrix.mat")

inv property

inversion operation of Matrix

Returns:

Type Description

Matrix: inverse of Matrix

Note

uses numpy.linalg.inv for the inversion

Example::

mat = pyemu.Matrix.from_binary("my.jco")
mat_inv = mat.inv
mat_inv.to_binary("my_inv.jco")

ncol property

length of second dimension

Returns:

Type Description

int: number of columns

newx property

return a copy of Matrix.x attribute

Returns:

Type Description

numpy.ndarray: a copy Matrix.x

nobs property

number of observations in the Jco

Returns:

Type Description

int: number of observations (rows)

npar property

number of parameters in the Jco

Returns:

Type Description

int: number of parameters (columns)

nrow property

length of first dimension

Returns:

Type Description

int: number of rows

obs_names property

thin wrapper around Matrix.row_names

Returns:

Type Description

['str']: a list of observation names

par_names property

thin wrapper around Matrix.col_names

Returns:

Type Description

[str]: a list of parameter names

s property

the singular value (diagonal) Matrix

Returns:

Type Description

Matrix: singular value matrix. shape is (min(Matrix.shape),min(Matrix.shape))

Example::

# plot the singular spectrum of the jacobian
import matplotlib.pyplot as plt
jco = pyemu.Jco.from_binary("pest.jcb")
plt.plot(jco.s.x)
plt.show()

shape property

get the implied, 2D shape of Matrix

Returns:

Type Description

int: length of 2 tuple

Example::

jco = pyemu.Jco.from_binary("pest.jcb")
shape = jco.shape
nrow,ncol = shape #unpack to ints

sqrt property

element-wise square root operation

Returns:

Type Description

Matrix: element-wise square root of Matrix

Note

uses numpy.sqrt

Example::

cov = pyemu.Cov.from_uncfile("my.unc")
sqcov = cov.sqrt
sqcov.to_uncfile("sqrt_cov.unc")

transpose property

transpose operation of self

Returns:

Type Description

Matrix: transpose of Matrix

Example::

jcb = pyemu.Jco.from_binary("pest.jcb")
jcbt = jcb.T

u property

the left singular vector Matrix

Returns:

Type Description

Matrix: left singular vectors. Shape is (Matrix.shape[0], Matrix.shape[0])

Example::

jco = pyemu.Jco.from_binary("pest.jcb")
jco.u.to_binary("u.jcb")

v property

the right singular vector Matrix

Returns:

Type Description

Matrix: right singular vectors. Shape is (Matrix.shape[1], Matrix.shape[1])

Example::

jco = pyemu.Jco.from_binary("pest.jcb")
jco.v.to_binary("v.jcb")

x property

return a reference to Matrix.x

Returns:

Type Description

numpy.ndarray: reference to Matrix.x

zero2d property

get an 2D instance of self with all zeros

Returns:

Type Description

Matrix: Matrix of zeros

Example::

jco = pyemu.Jco.from_binary("pest.jcb")
zero_jco = jco.zero2d

__add__(other)

Overload of numpy.ndarray.add() - elementwise addition. Tries to speedup by checking for scalars of diagonal matrices on either side of operator

Parameters:

Name Type Description Default
other

(int,float,numpy.ndarray,Matrix): the thing to add

required

Returns:

Type Description

Matrix: the result of addition

Note

if Matrix and other (if applicable) have autoalign set to True, both Matrix and other are aligned based on row and column names. If names are not common between the two, this may result in a smaller returned Matrix. If no names are shared, an exception is raised. The addition of a scalar does not expand non-zero elements.

Example::

m1 = pyemu.Cov.from_parameter_data(pst)
m1_plus_on1 = m1 + 1.0 #add 1.0 to all non-zero elements
m2 = m1.copy()
m2_plus_m1 = m1 + m2

__getitem__(item)

a very crude overload of object.getitem().

Parameters:

Name Type Description Default
item `object`

something that can be used as an index

required

Returns:

Type Description

Matrix: an object that is a sub-matrix of Matrix

__init(**kwargs)

Jco constructor takes the same arguments as Matrix.

Parameters:

Name Type Description Default
**kwargs `dict`

constructor arguments for Matrix

{}

Example:

jco = pyemu.Jco.from_binary("my.jco")

__mul__(other)

Dot product multiplication overload. Tries to speedup by checking for scalars or diagonal matrices on either side of operator

Parameters:

Name Type Description Default
other

(int,float,numpy.ndarray,Matrix): the thing to dot product

required

Returns:

Type Description

Matrix: the result of dot product

Note

if Matrix and other (if applicable) have autoalign set to True, both Matrix and other are aligned based on row and column names. If names are not common between the two, this may result in a smaller returned Matrix. If not common elements are found, an exception is raised

Example::

jco = pyemu.Jco.from_binary("pest.jcb")
cov = pyemu.Cov.from_parmaeter_data(pst)
# get the forecast prior covariance matrix
forecast_cov = jco.get(pst.forecast_names).T * cov * jco.get(pst.forecast_names)

__pow__(power)

overload of numpy.ndarray.pow() operator

Parameters:

Name Type Description Default
power `float`

interpreted as follows: -1 = inverse of self, -0.5 = sqrt of inverse of self, 0.5 = sqrt of self. All other positive ints = elementwise self raised to power

required

Returns:

Type Description

Matrix: a new Matrix object raised to the power power

Example::

cov = pyemu.Cov.from_uncfile("my.unc")
sqcov = cov**2

__rmul__(other)

Reverse order Dot product multiplication overload.

Parameters:

Name Type Description Default
other

(int,float,numpy.ndarray,Matrix): the thing to dot product

required

Returns:

Type Description

Matrix: the result of dot product

Note

if Matrix and other (if applicable) have autoalign set to True, both Matrix and other are aligned based on row and column names. If names are not common between the two, this may result in a smaller returned Matrix. If not common elements are found, an exception is raised

Example::

# multiply by a scalar
jco = pyemu.Jco.from_binary("pest.jcb")
jco_times_10 = 10 * jco

__set_svd()

private method to set SVD components.

Note: this should not be called directly

__str__()

overload of object.str()

Returns:

Type Description

str: string representation

__sub__(other)

numpy.ndarray.sub() overload. Tries to speedup by checking for scalars of diagonal matrices on either side of operator

Parameters:

Name Type Description Default
other

(int,float,numpy.ndarray,Matrix): the thing to subtract

required

Returns:

Type Description

Matrix: the result of subtraction

Note

if Matrix and other (if applicable) have autoalign set to True, both Matrix and other are aligned based on row and column names. If names are not common between the two, this may result in a smaller returned Matrix. If no names are shared, an exception is raised

Example::

jco1 = pyemu.Jco.from_binary("pest.1.jcb")
jco2 = pyemu.Jco.from_binary("pest.2.jcb")
diff = jco1 - jco2

align(names, axis=None)

reorder Matrix by names in place. If axis is None, reorder both indices

Parameters:

Name Type Description Default
names [str]

names in Matrix.row_names and or Matrix.col_names

required
axis `int`

the axis to reorder. if None, reorder both axes

None
Note

Works in place. Is called programmatically during linear algebra operations

Example::

# load a jco that has more obs (rows) than non-zero weighted obs
# in the control file
jco = pyemu.Jco.from_binary("pest.jcb")
# get an obs noise cov matrix
obscov = pyemu.Cov.from_observation_data(pst)
jco.align(obscov.row_names,axis=0)

copy()

get a copy of Matrix

Returns:

Type Description

Matrix: copy of this Matrix

df()

wrapper of Matrix.to_dataframe()

drop(names, axis)

drop elements from Matrix in place

Parameters:

Name Type Description Default
names [str]

list of names to drop

required
axis `int`

the axis to drop from. must be in [0,1]

required

element_isaligned(other)

check if matrices are aligned for element-wise operations

Parameters:

Name Type Description Default
other `Matrix`

the other matrix to check for alignment with

required

Returns:

Type Description

bool: True if aligned, False if not aligned

extend(other)

extend Matrix with the elements of other, returning a new matrix.

Args: other (Matrix): the Matrix to extend self by

Returns:

Type Description

Matrix: new, extended Matrix

Note

No row or column names can be shared between self and other

Example::

jco1 = pyemu.Jco.from_binary("pest_history.jco")
jco2 = pyemu.Jco.from_binary("pest_forecast.jco")

jco_ext = jco1.extend(jco2)

extract(row_names=None, col_names=None)

wrapper method that Matrix.gets() then Matrix.drops() elements. one of row_names or col_names must be not None.

Parameters:

Name Type Description Default
row_names [str]

row_names to extract. If None, all row_names are retained.

None
col_names [str]

col_names to extract. If None, all col_names are retained.

None

Returns:

Type Description

Matrix: the extract sub-matrix defined by row_names and/or col_names

Example::

cov = pyemu.Cov.from_parameter_data(pst)
hk_cov = cov.extract(row_names=["hk1","hk2","hk3"])

find_rowcol_indices(names, row_names, col_names, axis=None) staticmethod

fast(er) look of row and column names indices

Parameters:

Name Type Description Default
names [`str`]

list of names to look for in row_names and/or col_names names

required
row_names [`str`]

list of row names

required
col_names [`str`]

list of column names

required
axis `int`

axis to search along. If None, search both. Default is None

None

Returns:

Type Description

numpy.ndarray: array of (integer) index locations. If axis is

None, a 2 numpy.ndarrays of both row and column name indices is returned

from_ascii(filename) classmethod

load a PEST-compatible ASCII matrix/vector file into a Matrix instance

Parameters:

Name Type Description Default
filename `str`

name of the file to read

required

Returns:

Type Description

Matrix: Matrix loaded from ASCII file

Example::

mat = pyemu.Matrix.from_ascii("my.mat")
cov = pyemi.Cov.from_ascii("my.cov")

from_binary(filename, forgive=False) classmethod

class method load from PEST-compatible binary file into a Matrix instance

Parameters:

Name Type Description Default
filename `str`

filename to read

required
forgive `bool`

flag to forgive incomplete data records. Only applicable to dense binary format. Default is False

False

Returns:

Type Description

Matrix: Matrix loaded from binary file

Example::

mat = pyemu.Matrix.from_binary("my.jco")
cov = pyemu.Cov.from_binary("large_cov.jcb")

from_dataframe(df) classmethod

class method to create a new Matrix instance from a pandas.DataFrame

Parameters:

Name Type Description Default
df `pandas.DataFrame`

dataframe

required

Returns:

Type Description

Matrix: Matrix instance derived from df.

Example::

df = pd.read_csv("my.csv")
mat = pyemu.Matrix.from_dataframe(df)

from_fortranfile(filename) staticmethod

a binary load method to accommodate one of the many bizarre fortran binary writing formats

Parameters:

Name Type Description Default
filename `str`

name of the binary matrix file

required

Returns:

Type Description

tuple containing

  • numpy.ndarray: the numeric values in the file
  • ['str']: list of row names
  • [str]: list of col_names

from_names(row_names, col_names, isdiagonal=False, autoalign=True, random=False) classmethod

class method to create a new Matrix instance from row names and column names, filled with trash

Parameters:

Name Type Description Default
row_names [str]

row names for the new Matrix

required
col_names [str]

col_names for the new matrix

required
isdiagonal `bool`

flag for diagonal matrix. Default is False

False
autoalign `bool`

flag for autoaligning new matrix during linear algebra calcs. Default is True

True
random `bool`

flag for contents of the trash matrix. If True, fill with random numbers, if False, fill with zeros Default is False

False

Returns:

Type Description

Matrix: the new Matrix instance

Example::

row_names = ["row_1","row_2"]
col_names = ["col_1,"col_2"]
m = pyemu.Matrix.from_names(row_names,col_names)

from_pst(pst, random=False) classmethod

construct a new empty Jco from a control file optionally filled with trash

Parameters:

Name Type Description Default
pst `pyemu.Pst`

a pest control file instance. If type is 'str', pst is loaded from filename

required
random `bool`

flag for contents of the trash matrix. If True, fill with random numbers, if False, fill with zeros Default is False

False

Returns:

Type Description

Jco: the new Jco instance

get(row_names=None, col_names=None, drop=False)

get a new Matrix instance ordered on row_names or col_names

Parameters:

Name Type Description Default
row_names [str]

row_names for new Matrix. If None, all row_names are used.

None
col_names [str]

col_names for new Matrix. If None, all col_names are used.

None
drop `bool`

flag to remove row_names and/or col_names from this Matrix

False

Returns:

Type Description

Matrix: a new Matrix

Example:: # load a jco that has more obs (rows) than non-zero weighted obs # in the control file jco = pyemu.Jco.from_binary("pest.jcb") # get an obs noise cov matrix obscov = pyemu.Cov.from_observation_data(pst) nnz_jco = jco.get(row_names = obscov.row_names)

get_dense_binary_info(filename) staticmethod

read the header and row and offsets for a dense binary file.

Parameters
filename (`str`): dense binary filename

Returns:

Type Description

tuple containing

  • ['str']: list of row names
  • ['int']: list of row offsets
  • [str]: list of col names
  • bool: flag indicating successful reading of all records found

get_diagonal_vector(col_name='diag')

Get a new Matrix instance that is the diagonal of self. The shape of the new matrix is (self.shape[0],1). Self must be square

Parameters:

Name Type Description Default
col_name `str`

the name of the single column in the new Matrix

'diag'

Returns:

Type Description

Matrix: vector-shaped Matrix instance of the diagonal of this Matrix

Example::

cov = pyemu.Cov.from_unc_file("my.unc")
cov_diag = cov.get_diagonal_vector()
print(cov_diag.col_names)

get_maxsing(eigthresh=1e-05)

Get the number of singular components with a singular value ratio greater than or equal to eigthresh

Args: eigthresh (float): the ratio of smallest to largest singular value to retain. Since it is assumed that s is sorted from largest to smallest, once a singular value is reached that yields a ratio with the first (largest) singular value, the index of this singular is returned.

Returns:

Type Description

int: the index of the singular value who's ratio with the

first singular value is less than or equal to eigthresh

Note

this method calls the static method Matrix.get_maxsing_from_s() with Matrix.s.x

Example::

jco = pyemu.Jco.from_binary("pest.jco")
max_sing = jco.get_maxsing(eigthresh=pst.svd_data.eigthresh)

get_maxsing_from_s(s, eigthresh=1e-05) staticmethod

static method to work out the maxsing for a given singular spectrum

Parameters:

Name Type Description Default
s `numpy.ndarray`

1-D array of singular values. This array should come from calling either numpy.linalg.svd or from the pyemu.Matrix.s.x attribute

required
eigthresh `float`

the ratio of smallest to largest singular value to retain. Since it is assumed that s is sorted from largest to smallest, once a singular value is reached that yields a ratio with the first (largest) singular value, the index of this singular is returned.

1e-05

Returns:

Type Description

int: the index of the singular value who's ratio with the

first singular value is less than or equal to eigthresh

Example::

jco = pyemu.Jco.from_binary("pest.jco")
max_sing = pyemu.Matrix.get_maxsing_from_s(jco.s,eigthresh=pst.svd_data.eigthresh)

hadamard_product(other)

Overload of numpy.ndarray.mult(): element-wise multiplication. Tries to speedup by checking for scalars of diagonal matrices on either side of operator

Parameters:

Name Type Description Default
other

(int,float,numpy.ndarray,Matrix): the thing to multiply

required

Returns:

Type Description

Matrix: the result of multiplication

Note

if Matrix and other (if applicable) have autoalign set to True, both Matrix and other are aligned based on row and column names. If names are not common between the two, this may result in a smaller returned Matrix. If not common elements are shared, an exception is raised

Example::

cov = pyemu.Cov.from_parameter_data(pst)
cov2 = cov * 10
cov3 = cov * cov2

indices(names, axis=None)

get the row and col indices of names. If axis is None, two ndarrays are returned, corresponding the indices of names for each axis

Parameters:

Name Type Description Default
names [`str`]

list of names to look for in row_names and/or col_names names

required
row_names [`str`]

list of row names

required
col_names [`str`]

list of column names

required
axis `int`

axis to search along. If None, search both. Default is None

None

Returns:

Type Description

numpy.ndarray: array of (integer) index locations. If axis is

None, a 2 numpy.ndarrays of both row and column name indices is returned

Note

thin wrapper around Matrix.find_rowcol_indices static method

mult_isaligned(other)

check if matrices are aligned for dot product multiplication

Parameters:

Name Type Description Default
other `Matrix`

the other matrix to check for alignment with

required

Returns:

Type Description

bool: True if aligned, False if not aligned

pseudo_inv(maxsing=None, eigthresh=1e-05)

The pseudo inverse of self. Formed using truncated singular value decomposition and Matrix.pseudo_inv_components

Parameters:

Name Type Description Default
maxsing `int`

the number of singular components to use. If None, maxsing is calculated using Matrix.get_maxsing() and eigthresh

None
`eigthresh`

(float, optional): the ratio of largest to smallest singular components to use for truncation. Ignored if maxsing is not None. Default is 1.0e-5

required

Returns:

Type Description

Matrix: the truncated-SVD pseudo inverse of Matrix (V_1 * s_1^-1 * U^T)

Example::

jco = pyemu.Jco.from_binary("pest.jcb")
jco_psinv = jco.pseudo_inv(pst.svd_data.maxsing,pst.svd_data.eigthresh)
jco_psinv.to_binary("pseudo_inv.jcb")

pseudo_inv_components(maxsing=None, eigthresh=1e-05, truncate=True)

Get the (optionally) truncated SVD components

Parameters:

Name Type Description Default
maxsing `int`

the number of singular components to use. If None, maxsing is calculated using Matrix.get_maxsing() and eigthresh

None
`eigthresh`

(float, optional): the ratio of largest to smallest singular components to use for truncation. Ignored if maxsing is not None. Default is 1.0e-5

required
truncate `bool`

flag to truncate components. If False, U, s, and V will be zeroed out at locations greater than maxsing instead of truncated. Default is True

True

Returns:

Type Description

tuple containing

  • Matrix: (optionally truncated) left singular vectors
  • Matrix: (optionally truncated) singular value matrix
  • Matrix: (optionally truncated) right singular vectors

Example::

mat = pyemu.Matrix.from_binary("my.jco")
u1,s1,v1 = mat.pseudo_inv_components(maxsing=10)
resolution_matrix = v1 * v1.T
resolution_matrix.to_ascii("resol.mat")

read_ascii(filename) staticmethod

read a PEST-compatible ASCII matrix/vector file

Parameters:

Name Type Description Default
filename `str`

file to read from

required

Returns:

Type Description

tuple containing

  • numpy.ndarray: numeric values
  • ['str']: list of row names
  • [str]: list of column names
  • bool: diagonal flag

read_binary(filename, forgive=False) staticmethod

static method to read PEST-format binary files

Parameters:

Name Type Description Default
filename `str`

filename to read

required
forgive `bool`

flag to forgive incomplete data records. Only applicable to dense binary format. Default is False

False

Returns:

Type Description

tuple containing

  • numpy.ndarray: the numeric values in the file
  • ['str']: list of row names
  • [str]: list of col_names

read_binary_header(filename) staticmethod

read the first elements of a PEST(++)-style binary file to get format and dimensioning information.

Parameters:

Name Type Description Default
filename `str`

the filename of the binary file

required

Returns:

Type Description

tuple containing

  • int: the itemp1 value
  • int: the itemp2 value
  • int: the icount value

read_dense(filename, forgive=False, close=True, only_rows=None) staticmethod

read a dense-format binary file.

Parameters:

Name Type Description Default
filename `str`

the filename or the open filehandle

required
forgive `bool`

flag to forgive incomplete records. If True and an incomplete record is encountered, only the previously read records are returned. If False, an exception is raised for an incomplete record

False
close `bool`

flag to close the filehandle. Default is True

True
only_rows `iterable`

rows to read. If None, all rows are read

None

Returns:

Type Description

tuple containing

  • numpy.ndarray: the numeric values in the file
  • ['str']: list of row names
  • [str]: list of col_names

reset_x(x, copy=True)

reset self.__x private attribute

Parameters:

Name Type Description Default
x `numpy.ndarray`

the new numeric data

required
copy `bool`

flag to make a copy of 'x'. Default is True

True

Returns:

Type Description

None

Note

operates in place

to_2d()

get a 2D Matrix representation of Matrix. If not Matrix.isdiagonal, simply return a copy of Matrix, otherwise, constructs and returns a new Matrix instance that is stored as diagonal

Returns:

Type Description

Martrix: non-diagonal form of Matrix

Example::

# A diagonal cov
cov = pyemu.Cov.from_parameter_data
cov2d = cov.as_2d # a numpy 2d array
print(cov.shape,cov.x.shape,cov2d.shape,cov2d.x.shape)

to_ascii(filename, icode=2)

write a PEST-compatible ASCII Matrix/vector file

Parameters:

Name Type Description Default
filename `str`

filename to write to

required
icode `int`

PEST-style info code for matrix style. Default is 2.

2

Note: if icode == -1, a 1-d vector is written that represents a diagonal matrix. An exception is raised if icode == -1 and isdiagonal is False

to_binary(filename, droptol=None, chunk=None)

write a PEST-compatible binary file. The format is the same as the format used to storage a PEST Jacobian matrix

Parameters:

Name Type Description Default
filename `str`

filename to save binary file

required
droptol `float`

absolute value tolerance to make values smaller droptol than zero. Default is None (no dropping)

None
chunk `int`

number of elements to write in a single pass. Default is None, which writes the entire numeric part of the Matrix at once. This is faster but requires more memory.

None

to_coo(filename, droptol=None, chunk=None)

write an extended PEST-format binary file. The data format is [int,int,float] for i,j,value. It is autodetected during the read with Matrix.from_binary().

Parameters:

Name Type Description Default
filename `str` or `Path`

filename to save binary file

required
droptol `float`

absolute value tolerance to make values smaller droptol than zero. Default is None (no dropping)

None
chunk `int`

number of elements to write in a single pass. Default is None, which writes the entire numeric part of the Matrix at once. This is faster but requires more memory.

None
Note

This method is needed when the number of dimensions times 2 is larger than the max value for a 32-bit integer. happens! This method is used by pyemu.Ensemble.to_binary()

to_dataframe()

return a pandas.DataFrame representation of Matrix

Returns:

Type Description

pandas.DataFrame: a dataframe derived from Matrix

Note

if self.isdiagonal is True, the full matrix is used to fill the dataframe - lots of zeros.

to_dense(filename, close=True)

experimental new dense matrix storage format to support faster I/O with ensembles

Parameters:

Name Type Description Default
filename `str`

the filename to save to

required
close `bool`

flag to close the filehandle after saving

True

Returns:

Name Type Description
f `file`

the file handle. Only returned if close is False

Note

calls Matrix.write_dense()

write_dense(filename, row_names, col_names, data, close=False) staticmethod

experimental new dense matrix storage format to support faster I/O with ensembles

Parameters:

Name Type Description Default
filename `str` or `file`

the file to write to

required
row_names [`str`]

row names of the matrix

required
col_names [`str`]

col names of the matrix

required
data `np.ndarray`

matrix elements

required
close `bool`

flag to close the file after writing

False

Returns:

Name Type Description
f `file`

the file handle. Only returned if close is False

LinearAnalysis

Bases: object

The base/parent class for linear analysis.

Parameters:

Name Type Description Default
jco varies

something that can be cast or loaded into a pyemu.Jco. Can be a str for a filename or pyemu.Matrix/pyemu.Jco object.

None
pst varies

something that can be cast into a pyemu.Pst. Can be an str for a filename or an existing pyemu.Pst. If None, a pst filename is sought with the same base name as the jco argument (if passed)

None
parcov varies

prior parameter covariance matrix. If str, a filename is assumed and the prior parameter covariance matrix is loaded from a file using the file extension (".jcb"/".jco" for binary, ".cov"/".mat" for PEST-style ASCII matrix, or ".unc" for uncertainty files). If None, the prior parameter covariance matrix is constructed from the parameter bounds in LinearAnalysis.pst. Can also be a pyemu.Cov instance

None
obscov varies

observation noise covariance matrix. If str, a filename is assumed and the noise covariance matrix is loaded from a file using the file extension (".jcb"/".jco" for binary, ".cov"/".mat" for PEST-style ASCII matrix, or ".unc" for uncertainty files). If None, the noise covariance matrix is constructed from the observation weights in LinearAnalysis.pst. Can also be a pyemu.Cov instance

None
forecasts varies

forecast sensitivity vectors. If str, first an observation name is assumed (a row in LinearAnalysis.jco). If that is not found, a filename is assumed and predictions are loaded from a file using the file extension. If [str], a list of observation names is assumed. Can also be a pyemu.Matrix instance, a numpy.ndarray or a collection of pyemu.Matrix or numpy.ndarray.

None
ref_var float

reference variance. Default is 1.0

1.0
verbose `bool`

controls screen output. If str, a filename is assumed and and log file is written.

False
sigma_range `float`

defines range of upper bound - lower bound in terms of standard deviation (sigma). For example, if sigma_range = 4, the bounds represent 4 * sigma. Default is 4.0, representing approximately 95% confidence of implied normal distribution. This arg is only used if constructing parcov from parameter bounds.

4.0
scale_offset `bool`

flag to apply parameter scale and offset to parameter bounds when calculating prior parameter covariance matrix from bounds. This arg is onlyused if constructing parcov from parameter bounds.Default is True.

True

Note:

Can be used directly, but for prior uncertainty analyses only.

The derived types (`pyemu.Schur`, `pyemu.ErrVar`) are for different
forms of FOMS-based posterior uncertainty analyses.

This class tries hard to not load items until they are needed; all arguments are optional.

The class makes heavy use of property decorator to encapsulated private attributes

Example::

#assumes "my.pst" exists
la = pyemu.LinearAnalysis(jco="my.jco",forecasts=["fore1","fore2"])
print(la.prior_forecast)

adj_par_names property

adjustable parameter names

Returns:

Type Description

['str`]: list of adjustable parameter names

Note

if LinearAnalysis.pst is None, returns LinearAnalysis.jco.col_names

fehalf property

Karhunen-Loeve scaling matrix attribute.

Returns:

Type Description

pyemu.Matrix: the Karhunen-Loeve scaling matrix based on the prior

parameter covariance matrix

forecast_names property

get the forecast (aka prediction) names

Returns:

Type Description
[`str`]

list of forecast names

forecasts property

the forecast sentivity vectors attribute

Returns:

Type Description

pyemu.Matrix: a matrix of forecast (prediction) sensitivity vectors (column wise)

forecasts_iter property

forecast (e.g. prediction) sensitivity vectors iterator

Returns:

Type Description

iterator: iterator on forecasts (e.g. predictions) sensitivity vectors (matrix)

Note

This is used for processing huge numbers of predictions

This is a synonym for LinearAnalysis.predictions_iter()

jco property

the jacobian matrix attribute

Returns:

Type Description

pyemu.Jco: the jacobian matrix attribute

mle_covariance property

maximum likelihood parameter covariance matrix.

Returns:

Type Description

pyemu.Matrix: maximum likelihood parameter covariance matrix

mle_parameter_estimate property

maximum likelihood parameter estimate.

Returns:

Type Description

pandas.Series: the maximum likelihood parameter estimates

nnz_obs_names property

non-zero-weighted observation names

Returns:

Type Description

['str`]: list of non-zero-weighted observation names

Note

if LinearAnalysis.pst is None, returns LinearAnalysis.jco.row_names

obscov property

get the observation noise covariance matrix attribute

Returns:

Type Description

pyemu.Cov: a reference to the LinearAnalysis.obscov attribute

parcov property

get the prior parameter covariance matrix attribute

Returns:

Type Description

pyemu.Cov: a reference to the LinearAnalysis.parcov attribute

predictions property

the prediction (aka forecast) sentivity vectors attribute

Returns:

Type Description

pyemu.Matrix: a matrix of prediction sensitivity vectors (column wise)

predictions_iter property

prediction sensitivity vectors iterator

Returns:

Type Description

iterator: iterator on prediction sensitivity vectors (matrix)

Note

this is used for processing huge numbers of predictions

prior_forecast property

prior forecast (e.g. prediction) variances

Returns:

Type Description

dict: a dictionary of forecast name, prior variance pairs

prior_parameter property

prior parameter covariance matrix

Returns:

Type Description

pyemu.Cov: prior parameter covariance matrix

prior_prediction property

prior prediction (e.g. forecast) variances

Returns:

Type Description

dict: a dictionary of prediction name, prior variance pairs

pst property

the pst attribute

Returns:

Type Description

pyemu.Pst: the pst attribute

qhalf property

square root of the cofactor matrix attribute. Create the attribute if it has not yet been created

Returns:

Type Description

pyemu.Matrix: square root of the cofactor matrix

qhalfx property

half normal matrix attribute.

Returns:

Type Description

pyemu.Matrix: half normal matrix attribute

xtqx property

normal matrix attribute.

Returns:

Type Description

pyemu.Matrix: normal matrix attribute

__fromfile(filename, astype=None)

a private method to deduce and load a filename into a matrix object. Uses extension: 'jco' or 'jcb': binary, 'mat','vec' or 'cov': ASCII, 'unc': pest uncertainty file.

__load_jco()

private method to set the jco attribute from a file or a matrix object

__load_obscov()

private method to set the obscov attribute from: a pest control file (observation weights) a pst object a matrix object an uncert file an ascii matrix file

__load_parcov()

private method to set the parcov attribute from: a pest control file (parameter bounds) a pst object a matrix object an uncert file an ascii matrix file

__load_predictions()

private method set the predictions attribute from

mixed list of row names, matrix files and ndarrays a single row name an ascii file

can be none if only interested in parameters.

__load_pst()

private method set the pst attribute

adjust_obscov_resfile(resfile=None)

reset the elements of obscov by scaling the implied weights based on the phi components in res_file so that the total phi is equal to the number of non-zero weights.

Parameters:

Name Type Description Default
resfile `str`

residual file to use. If None, residual file with case name is sought. default is None

None
Note

calls pyemu.Pst.adjust_weights_resfile()

apply_karhunen_loeve_scaling()

apply karhuene-loeve scaling to the jacobian matrix.

Note:

This scaling is not necessary for analyses using Schur's
complement, but can be very important for error variance
analyses.  This operation effectively transfers prior knowledge
specified in the parcov to the jacobian and reset parcov to the
identity matrix.

clean()

drop regularization and prior information observation from the jco

drop_prior_information()

drop the prior information from the jco and pst attributes

get(par_names=None, obs_names=None, astype=None)

method to get a new LinearAnalysis class using a subset of parameters and/or observations

Parameters:

Name Type Description Default
par_names [`'str`]

par names for new object

None
obs_names [`'str`]

obs names for new object

None
astype `pyemu.Schur` or `pyemu.ErrVar`

type to cast the new object. If None, return type is same as self

None

Returns:

Type Description

LinearAnalysis: new instance

get_cso_dataframe()

get a dataframe of composite observation sensitivity, as returned by PEST in the seo file.

Returns:

Type Description

pandas.DataFrame: dataframe of observation names and composite observation

sensitivity

Note

That this formulation deviates slightly from the PEST documentation in that the values are divided by (npar-1) rather than by (npar).

The equation is cso_j = ((Q^1/2JJ^T*Q^1/2)^1/2)_jj/(NPAR-1)

get_obs_competition_dataframe()

get the observation competition stat a la PEST utility

Returns:

Type Description

pandas.DataFrame: a dataframe of observation names by

observation names with values equal to the PEST

competition statistic

get_par_css_dataframe()

get a dataframe of composite scaled sensitivities. Includes both PEST-style and Hill-style.

Returns:

Type Description

pandas.DataFrame: a dataframe of parameter names, PEST-style and

Hill-style composite scaled sensitivity

reset_obscov(arg=None)

reset the obscov attribute to None

Parameters:

Name Type Description Default
arg `str` or `pyemu.Matrix`

the value to assign to the obscov attribute. If None, the private __obscov attribute is cleared but not reset

None

reset_parcov(arg=None)

reset the parcov attribute to None

Parameters:

Name Type Description Default
arg `str` or `pyemu.Matrix`

the value to assign to the parcov attribute. If None, the private __parcov attribute is cleared but not reset

None

reset_pst(arg)

reset the LinearAnalysis.pst attribute

Parameters:

Name Type Description Default
arg `str` or `pyemu.Pst`

the value to assign to the pst attribute

required

Matrix

Bases: object

Easy linear algebra in the PEST(++) realm

Parameters:

Name Type Description Default
x `numpy.ndarray`

numeric values

None
row_names [`str`]

list of row names

[]
col_names [str]

list of column names

[]
isdigonal `bool`

flag if the Matrix is diagonal

required
autoalign `bool`

flag to control the autoalignment of Matrix during linear algebra operations

True

Example::

data = pyemu.en.rng.random((10,10))
row_names = ["row_{0}".format(i) for i in range(10)]
col_names = ["col_{0}".format(j) for j in range(10)]
mat = pyemu.Matrix(x=data,row_names=row_names,col_names=col_names)
mat.to_binary("mat.jco")

# load an existing jacobian matrix
jco = pyemu.Jco.from_binary("pest.jco")
# form an observation noise covariance matrix from weights
obscov = pyemu.Cov.from_observation_data(pst)
# form the normal matrix, aligning rows and cols on-the-fly
xtqx = jco * obscov.inv * jco.T
Note

this class makes heavy use of property decorators to encapsulate private attributes

T property

wrapper function for Matrix.transpose() method

Returns:

Type Description

Matrix: transpose of Matrix

Note

returns a copy of self

A syntatic-sugar overload of Matrix.transpose()

Example::

jcb = pyemu.Jco.from_binary("pest.jcb")
jcbt = jcb.T

as_2d property

get a 2D numeric representation of Matrix.x. If not isdiagonal, simply return reference to Matrix.x, otherwise, constructs and returns a 2D, diagonal ndarray

Returns:

Type Description

numpy.ndarray : numpy.ndarray

Example::

# A diagonal cov
cov = pyemu.Cov.from_parameter_data
x2d = cov.as_2d # a numpy 2d array
print(cov.shape,cov.x.shape,x2d.shape)

full_s property

Get the full singular value matrix

Returns:

Type Description

Matrix: full singular value matrix. Shape is (max(Matrix.shape),max(Matrix.shape))

with zeros along the diagonal from min(Matrix.shape) to max(Matrix.shape)

Example::

jco = pyemu.Jco.from_binary("pest.jcb")
jco.full_s.to_ascii("full_sing_val_matrix.mat")

inv property

inversion operation of Matrix

Returns:

Type Description

Matrix: inverse of Matrix

Note

uses numpy.linalg.inv for the inversion

Example::

mat = pyemu.Matrix.from_binary("my.jco")
mat_inv = mat.inv
mat_inv.to_binary("my_inv.jco")

ncol property

length of second dimension

Returns:

Type Description

int: number of columns

newx property

return a copy of Matrix.x attribute

Returns:

Type Description

numpy.ndarray: a copy Matrix.x

nrow property

length of first dimension

Returns:

Type Description

int: number of rows

s property

the singular value (diagonal) Matrix

Returns:

Type Description

Matrix: singular value matrix. shape is (min(Matrix.shape),min(Matrix.shape))

Example::

# plot the singular spectrum of the jacobian
import matplotlib.pyplot as plt
jco = pyemu.Jco.from_binary("pest.jcb")
plt.plot(jco.s.x)
plt.show()

shape property

get the implied, 2D shape of Matrix

Returns:

Type Description

int: length of 2 tuple

Example::

jco = pyemu.Jco.from_binary("pest.jcb")
shape = jco.shape
nrow,ncol = shape #unpack to ints

sqrt property

element-wise square root operation

Returns:

Type Description

Matrix: element-wise square root of Matrix

Note

uses numpy.sqrt

Example::

cov = pyemu.Cov.from_uncfile("my.unc")
sqcov = cov.sqrt
sqcov.to_uncfile("sqrt_cov.unc")

transpose property

transpose operation of self

Returns:

Type Description

Matrix: transpose of Matrix

Example::

jcb = pyemu.Jco.from_binary("pest.jcb")
jcbt = jcb.T

u property

the left singular vector Matrix

Returns:

Type Description

Matrix: left singular vectors. Shape is (Matrix.shape[0], Matrix.shape[0])

Example::

jco = pyemu.Jco.from_binary("pest.jcb")
jco.u.to_binary("u.jcb")

v property

the right singular vector Matrix

Returns:

Type Description

Matrix: right singular vectors. Shape is (Matrix.shape[1], Matrix.shape[1])

Example::

jco = pyemu.Jco.from_binary("pest.jcb")
jco.v.to_binary("v.jcb")

x property

return a reference to Matrix.x

Returns:

Type Description

numpy.ndarray: reference to Matrix.x

zero2d property

get an 2D instance of self with all zeros

Returns:

Type Description

Matrix: Matrix of zeros

Example::

jco = pyemu.Jco.from_binary("pest.jcb")
zero_jco = jco.zero2d

__add__(other)

Overload of numpy.ndarray.add() - elementwise addition. Tries to speedup by checking for scalars of diagonal matrices on either side of operator

Parameters:

Name Type Description Default
other

(int,float,numpy.ndarray,Matrix): the thing to add

required

Returns:

Type Description

Matrix: the result of addition

Note

if Matrix and other (if applicable) have autoalign set to True, both Matrix and other are aligned based on row and column names. If names are not common between the two, this may result in a smaller returned Matrix. If no names are shared, an exception is raised. The addition of a scalar does not expand non-zero elements.

Example::

m1 = pyemu.Cov.from_parameter_data(pst)
m1_plus_on1 = m1 + 1.0 #add 1.0 to all non-zero elements
m2 = m1.copy()
m2_plus_m1 = m1 + m2

__getitem__(item)

a very crude overload of object.getitem().

Parameters:

Name Type Description Default
item `object`

something that can be used as an index

required

Returns:

Type Description

Matrix: an object that is a sub-matrix of Matrix

__mul__(other)

Dot product multiplication overload. Tries to speedup by checking for scalars or diagonal matrices on either side of operator

Parameters:

Name Type Description Default
other

(int,float,numpy.ndarray,Matrix): the thing to dot product

required

Returns:

Type Description

Matrix: the result of dot product

Note

if Matrix and other (if applicable) have autoalign set to True, both Matrix and other are aligned based on row and column names. If names are not common between the two, this may result in a smaller returned Matrix. If not common elements are found, an exception is raised

Example::

jco = pyemu.Jco.from_binary("pest.jcb")
cov = pyemu.Cov.from_parmaeter_data(pst)
# get the forecast prior covariance matrix
forecast_cov = jco.get(pst.forecast_names).T * cov * jco.get(pst.forecast_names)

__pow__(power)

overload of numpy.ndarray.pow() operator

Parameters:

Name Type Description Default
power `float`

interpreted as follows: -1 = inverse of self, -0.5 = sqrt of inverse of self, 0.5 = sqrt of self. All other positive ints = elementwise self raised to power

required

Returns:

Type Description

Matrix: a new Matrix object raised to the power power

Example::

cov = pyemu.Cov.from_uncfile("my.unc")
sqcov = cov**2

__rmul__(other)

Reverse order Dot product multiplication overload.

Parameters:

Name Type Description Default
other

(int,float,numpy.ndarray,Matrix): the thing to dot product

required

Returns:

Type Description

Matrix: the result of dot product

Note

if Matrix and other (if applicable) have autoalign set to True, both Matrix and other are aligned based on row and column names. If names are not common between the two, this may result in a smaller returned Matrix. If not common elements are found, an exception is raised

Example::

# multiply by a scalar
jco = pyemu.Jco.from_binary("pest.jcb")
jco_times_10 = 10 * jco

__set_svd()

private method to set SVD components.

Note: this should not be called directly

__str__()

overload of object.str()

Returns:

Type Description

str: string representation

__sub__(other)

numpy.ndarray.sub() overload. Tries to speedup by checking for scalars of diagonal matrices on either side of operator

Parameters:

Name Type Description Default
other

(int,float,numpy.ndarray,Matrix): the thing to subtract

required

Returns:

Type Description

Matrix: the result of subtraction

Note

if Matrix and other (if applicable) have autoalign set to True, both Matrix and other are aligned based on row and column names. If names are not common between the two, this may result in a smaller returned Matrix. If no names are shared, an exception is raised

Example::

jco1 = pyemu.Jco.from_binary("pest.1.jcb")
jco2 = pyemu.Jco.from_binary("pest.2.jcb")
diff = jco1 - jco2

align(names, axis=None)

reorder Matrix by names in place. If axis is None, reorder both indices

Parameters:

Name Type Description Default
names [str]

names in Matrix.row_names and or Matrix.col_names

required
axis `int`

the axis to reorder. if None, reorder both axes

None
Note

Works in place. Is called programmatically during linear algebra operations

Example::

# load a jco that has more obs (rows) than non-zero weighted obs
# in the control file
jco = pyemu.Jco.from_binary("pest.jcb")
# get an obs noise cov matrix
obscov = pyemu.Cov.from_observation_data(pst)
jco.align(obscov.row_names,axis=0)

copy()

get a copy of Matrix

Returns:

Type Description

Matrix: copy of this Matrix

df()

wrapper of Matrix.to_dataframe()

drop(names, axis)

drop elements from Matrix in place

Parameters:

Name Type Description Default
names [str]

list of names to drop

required
axis `int`

the axis to drop from. must be in [0,1]

required

element_isaligned(other)

check if matrices are aligned for element-wise operations

Parameters:

Name Type Description Default
other `Matrix`

the other matrix to check for alignment with

required

Returns:

Type Description

bool: True if aligned, False if not aligned

extend(other)

extend Matrix with the elements of other, returning a new matrix.

Args: other (Matrix): the Matrix to extend self by

Returns:

Type Description

Matrix: new, extended Matrix

Note

No row or column names can be shared between self and other

Example::

jco1 = pyemu.Jco.from_binary("pest_history.jco")
jco2 = pyemu.Jco.from_binary("pest_forecast.jco")

jco_ext = jco1.extend(jco2)

extract(row_names=None, col_names=None)

wrapper method that Matrix.gets() then Matrix.drops() elements. one of row_names or col_names must be not None.

Parameters:

Name Type Description Default
row_names [str]

row_names to extract. If None, all row_names are retained.

None
col_names [str]

col_names to extract. If None, all col_names are retained.

None

Returns:

Type Description

Matrix: the extract sub-matrix defined by row_names and/or col_names

Example::

cov = pyemu.Cov.from_parameter_data(pst)
hk_cov = cov.extract(row_names=["hk1","hk2","hk3"])

find_rowcol_indices(names, row_names, col_names, axis=None) staticmethod

fast(er) look of row and column names indices

Parameters:

Name Type Description Default
names [`str`]

list of names to look for in row_names and/or col_names names

required
row_names [`str`]

list of row names

required
col_names [`str`]

list of column names

required
axis `int`

axis to search along. If None, search both. Default is None

None

Returns:

Type Description

numpy.ndarray: array of (integer) index locations. If axis is

None, a 2 numpy.ndarrays of both row and column name indices is returned

from_ascii(filename) classmethod

load a PEST-compatible ASCII matrix/vector file into a Matrix instance

Parameters:

Name Type Description Default
filename `str`

name of the file to read

required

Returns:

Type Description

Matrix: Matrix loaded from ASCII file

Example::

mat = pyemu.Matrix.from_ascii("my.mat")
cov = pyemi.Cov.from_ascii("my.cov")

from_binary(filename, forgive=False) classmethod

class method load from PEST-compatible binary file into a Matrix instance

Parameters:

Name Type Description Default
filename `str`

filename to read

required
forgive `bool`

flag to forgive incomplete data records. Only applicable to dense binary format. Default is False

False

Returns:

Type Description

Matrix: Matrix loaded from binary file

Example::

mat = pyemu.Matrix.from_binary("my.jco")
cov = pyemu.Cov.from_binary("large_cov.jcb")

from_dataframe(df) classmethod

class method to create a new Matrix instance from a pandas.DataFrame

Parameters:

Name Type Description Default
df `pandas.DataFrame`

dataframe

required

Returns:

Type Description

Matrix: Matrix instance derived from df.

Example::

df = pd.read_csv("my.csv")
mat = pyemu.Matrix.from_dataframe(df)

from_fortranfile(filename) staticmethod

a binary load method to accommodate one of the many bizarre fortran binary writing formats

Parameters:

Name Type Description Default
filename `str`

name of the binary matrix file

required

Returns:

Type Description

tuple containing

  • numpy.ndarray: the numeric values in the file
  • ['str']: list of row names
  • [str]: list of col_names

from_names(row_names, col_names, isdiagonal=False, autoalign=True, random=False) classmethod

class method to create a new Matrix instance from row names and column names, filled with trash

Parameters:

Name Type Description Default
row_names [str]

row names for the new Matrix

required
col_names [str]

col_names for the new matrix

required
isdiagonal `bool`

flag for diagonal matrix. Default is False

False
autoalign `bool`

flag for autoaligning new matrix during linear algebra calcs. Default is True

True
random `bool`

flag for contents of the trash matrix. If True, fill with random numbers, if False, fill with zeros Default is False

False

Returns:

Type Description

Matrix: the new Matrix instance

Example::

row_names = ["row_1","row_2"]
col_names = ["col_1,"col_2"]
m = pyemu.Matrix.from_names(row_names,col_names)

get(row_names=None, col_names=None, drop=False)

get a new Matrix instance ordered on row_names or col_names

Parameters:

Name Type Description Default
row_names [str]

row_names for new Matrix. If None, all row_names are used.

None
col_names [str]

col_names for new Matrix. If None, all col_names are used.

None
drop `bool`

flag to remove row_names and/or col_names from this Matrix

False

Returns:

Type Description

Matrix: a new Matrix

Example:: # load a jco that has more obs (rows) than non-zero weighted obs # in the control file jco = pyemu.Jco.from_binary("pest.jcb") # get an obs noise cov matrix obscov = pyemu.Cov.from_observation_data(pst) nnz_jco = jco.get(row_names = obscov.row_names)

get_dense_binary_info(filename) staticmethod

read the header and row and offsets for a dense binary file.

Parameters
filename (`str`): dense binary filename

Returns:

Type Description

tuple containing

  • ['str']: list of row names
  • ['int']: list of row offsets
  • [str]: list of col names
  • bool: flag indicating successful reading of all records found

get_diagonal_vector(col_name='diag')

Get a new Matrix instance that is the diagonal of self. The shape of the new matrix is (self.shape[0],1). Self must be square

Parameters:

Name Type Description Default
col_name `str`

the name of the single column in the new Matrix

'diag'

Returns:

Type Description

Matrix: vector-shaped Matrix instance of the diagonal of this Matrix

Example::

cov = pyemu.Cov.from_unc_file("my.unc")
cov_diag = cov.get_diagonal_vector()
print(cov_diag.col_names)

get_maxsing(eigthresh=1e-05)

Get the number of singular components with a singular value ratio greater than or equal to eigthresh

Args: eigthresh (float): the ratio of smallest to largest singular value to retain. Since it is assumed that s is sorted from largest to smallest, once a singular value is reached that yields a ratio with the first (largest) singular value, the index of this singular is returned.

Returns:

Type Description

int: the index of the singular value who's ratio with the

first singular value is less than or equal to eigthresh

Note

this method calls the static method Matrix.get_maxsing_from_s() with Matrix.s.x

Example::

jco = pyemu.Jco.from_binary("pest.jco")
max_sing = jco.get_maxsing(eigthresh=pst.svd_data.eigthresh)

get_maxsing_from_s(s, eigthresh=1e-05) staticmethod

static method to work out the maxsing for a given singular spectrum

Parameters:

Name Type Description Default
s `numpy.ndarray`

1-D array of singular values. This array should come from calling either numpy.linalg.svd or from the pyemu.Matrix.s.x attribute

required
eigthresh `float`

the ratio of smallest to largest singular value to retain. Since it is assumed that s is sorted from largest to smallest, once a singular value is reached that yields a ratio with the first (largest) singular value, the index of this singular is returned.

1e-05

Returns:

Type Description

int: the index of the singular value who's ratio with the

first singular value is less than or equal to eigthresh

Example::

jco = pyemu.Jco.from_binary("pest.jco")
max_sing = pyemu.Matrix.get_maxsing_from_s(jco.s,eigthresh=pst.svd_data.eigthresh)

hadamard_product(other)

Overload of numpy.ndarray.mult(): element-wise multiplication. Tries to speedup by checking for scalars of diagonal matrices on either side of operator

Parameters:

Name Type Description Default
other

(int,float,numpy.ndarray,Matrix): the thing to multiply

required

Returns:

Type Description

Matrix: the result of multiplication

Note

if Matrix and other (if applicable) have autoalign set to True, both Matrix and other are aligned based on row and column names. If names are not common between the two, this may result in a smaller returned Matrix. If not common elements are shared, an exception is raised

Example::

cov = pyemu.Cov.from_parameter_data(pst)
cov2 = cov * 10
cov3 = cov * cov2

indices(names, axis=None)

get the row and col indices of names. If axis is None, two ndarrays are returned, corresponding the indices of names for each axis

Parameters:

Name Type Description Default
names [`str`]

list of names to look for in row_names and/or col_names names

required
row_names [`str`]

list of row names

required
col_names [`str`]

list of column names

required
axis `int`

axis to search along. If None, search both. Default is None

None

Returns:

Type Description

numpy.ndarray: array of (integer) index locations. If axis is

None, a 2 numpy.ndarrays of both row and column name indices is returned

Note

thin wrapper around Matrix.find_rowcol_indices static method

mult_isaligned(other)

check if matrices are aligned for dot product multiplication

Parameters:

Name Type Description Default
other `Matrix`

the other matrix to check for alignment with

required

Returns:

Type Description

bool: True if aligned, False if not aligned

pseudo_inv(maxsing=None, eigthresh=1e-05)

The pseudo inverse of self. Formed using truncated singular value decomposition and Matrix.pseudo_inv_components

Parameters:

Name Type Description Default
maxsing `int`

the number of singular components to use. If None, maxsing is calculated using Matrix.get_maxsing() and eigthresh

None
`eigthresh`

(float, optional): the ratio of largest to smallest singular components to use for truncation. Ignored if maxsing is not None. Default is 1.0e-5

required

Returns:

Type Description

Matrix: the truncated-SVD pseudo inverse of Matrix (V_1 * s_1^-1 * U^T)

Example::

jco = pyemu.Jco.from_binary("pest.jcb")
jco_psinv = jco.pseudo_inv(pst.svd_data.maxsing,pst.svd_data.eigthresh)
jco_psinv.to_binary("pseudo_inv.jcb")

pseudo_inv_components(maxsing=None, eigthresh=1e-05, truncate=True)

Get the (optionally) truncated SVD components

Parameters:

Name Type Description Default
maxsing `int`

the number of singular components to use. If None, maxsing is calculated using Matrix.get_maxsing() and eigthresh

None
`eigthresh`

(float, optional): the ratio of largest to smallest singular components to use for truncation. Ignored if maxsing is not None. Default is 1.0e-5

required
truncate `bool`

flag to truncate components. If False, U, s, and V will be zeroed out at locations greater than maxsing instead of truncated. Default is True

True

Returns:

Type Description

tuple containing

  • Matrix: (optionally truncated) left singular vectors
  • Matrix: (optionally truncated) singular value matrix
  • Matrix: (optionally truncated) right singular vectors

Example::

mat = pyemu.Matrix.from_binary("my.jco")
u1,s1,v1 = mat.pseudo_inv_components(maxsing=10)
resolution_matrix = v1 * v1.T
resolution_matrix.to_ascii("resol.mat")

read_ascii(filename) staticmethod

read a PEST-compatible ASCII matrix/vector file

Parameters:

Name Type Description Default
filename `str`

file to read from

required

Returns:

Type Description

tuple containing

  • numpy.ndarray: numeric values
  • ['str']: list of row names
  • [str]: list of column names
  • bool: diagonal flag

read_binary(filename, forgive=False) staticmethod

static method to read PEST-format binary files

Parameters:

Name Type Description Default
filename `str`

filename to read

required
forgive `bool`

flag to forgive incomplete data records. Only applicable to dense binary format. Default is False

False

Returns:

Type Description

tuple containing

  • numpy.ndarray: the numeric values in the file
  • ['str']: list of row names
  • [str]: list of col_names

read_binary_header(filename) staticmethod

read the first elements of a PEST(++)-style binary file to get format and dimensioning information.

Parameters:

Name Type Description Default
filename `str`

the filename of the binary file

required

Returns:

Type Description

tuple containing

  • int: the itemp1 value
  • int: the itemp2 value
  • int: the icount value

read_dense(filename, forgive=False, close=True, only_rows=None) staticmethod

read a dense-format binary file.

Parameters:

Name Type Description Default
filename `str`

the filename or the open filehandle

required
forgive `bool`

flag to forgive incomplete records. If True and an incomplete record is encountered, only the previously read records are returned. If False, an exception is raised for an incomplete record

False
close `bool`

flag to close the filehandle. Default is True

True
only_rows `iterable`

rows to read. If None, all rows are read

None

Returns:

Type Description

tuple containing

  • numpy.ndarray: the numeric values in the file
  • ['str']: list of row names
  • [str]: list of col_names

reset_x(x, copy=True)

reset self.__x private attribute

Parameters:

Name Type Description Default
x `numpy.ndarray`

the new numeric data

required
copy `bool`

flag to make a copy of 'x'. Default is True

True

Returns:

Type Description

None

Note

operates in place

to_2d()

get a 2D Matrix representation of Matrix. If not Matrix.isdiagonal, simply return a copy of Matrix, otherwise, constructs and returns a new Matrix instance that is stored as diagonal

Returns:

Type Description

Martrix: non-diagonal form of Matrix

Example::

# A diagonal cov
cov = pyemu.Cov.from_parameter_data
cov2d = cov.as_2d # a numpy 2d array
print(cov.shape,cov.x.shape,cov2d.shape,cov2d.x.shape)

to_ascii(filename, icode=2)

write a PEST-compatible ASCII Matrix/vector file

Parameters:

Name Type Description Default
filename `str`

filename to write to

required
icode `int`

PEST-style info code for matrix style. Default is 2.

2

Note: if icode == -1, a 1-d vector is written that represents a diagonal matrix. An exception is raised if icode == -1 and isdiagonal is False

to_binary(filename, droptol=None, chunk=None)

write a PEST-compatible binary file. The format is the same as the format used to storage a PEST Jacobian matrix

Parameters:

Name Type Description Default
filename `str`

filename to save binary file

required
droptol `float`

absolute value tolerance to make values smaller droptol than zero. Default is None (no dropping)

None
chunk `int`

number of elements to write in a single pass. Default is None, which writes the entire numeric part of the Matrix at once. This is faster but requires more memory.

None

to_coo(filename, droptol=None, chunk=None)

write an extended PEST-format binary file. The data format is [int,int,float] for i,j,value. It is autodetected during the read with Matrix.from_binary().

Parameters:

Name Type Description Default
filename `str` or `Path`

filename to save binary file

required
droptol `float`

absolute value tolerance to make values smaller droptol than zero. Default is None (no dropping)

None
chunk `int`

number of elements to write in a single pass. Default is None, which writes the entire numeric part of the Matrix at once. This is faster but requires more memory.

None
Note

This method is needed when the number of dimensions times 2 is larger than the max value for a 32-bit integer. happens! This method is used by pyemu.Ensemble.to_binary()

to_dataframe()

return a pandas.DataFrame representation of Matrix

Returns:

Type Description

pandas.DataFrame: a dataframe derived from Matrix

Note

if self.isdiagonal is True, the full matrix is used to fill the dataframe - lots of zeros.

to_dense(filename, close=True)

experimental new dense matrix storage format to support faster I/O with ensembles

Parameters:

Name Type Description Default
filename `str`

the filename to save to

required
close `bool`

flag to close the filehandle after saving

True

Returns:

Name Type Description
f `file`

the file handle. Only returned if close is False

Note

calls Matrix.write_dense()

write_dense(filename, row_names, col_names, data, close=False) staticmethod

experimental new dense matrix storage format to support faster I/O with ensembles

Parameters:

Name Type Description Default
filename `str` or `file`

the file to write to

required
row_names [`str`]

row names of the matrix

required
col_names [`str`]

col names of the matrix

required
data `np.ndarray`

matrix elements

required
close `bool`

flag to close the file after writing

False

Returns:

Name Type Description
f `file`

the file handle. Only returned if close is False

ObservationEnsemble

Bases: Ensemble

Observation noise ensemble in the PEST(++) realm

Parameters:

Name Type Description Default
pst `pyemu.Pst`

a control file instance

required
df `pandas.DataFrame`

a pandas dataframe. Columns should be observation names. Index is treated as realization names

required
istransformed `bool`

flag to indicate parameter values are in log space. Not used for ObservationEnsemble

False

Example::

pst = pyemu.Pst("my.pst")
oe = pyemu.ObservationEnsemble.from_gaussian_draw(pst)

istransformed property

the parameter transformation status

Returns:

Type Description

bool: flag to indicate whether or not the ParameterEnsemble is

transformed with respect to log_{10}. Not used for (and has no effect

on) ObservationEnsemble.

Note

parameter transformation status is only related to log_{10} and does not include the effects of scale and/or offset

nonzero property

get a new ObservationEnsemble of just non-zero weighted observations

Returns:

Type Description

ObservationEnsemble: non-zero weighted observation ensemble.

Note

The pst attribute of the returned ObservationEnsemble also only includes non-zero weighted observations (and is therefore not valid for running with PEST or PEST++)

phi_vector property

vector of L2 norm (phi) for the realizations (rows) of Ensemble.

Returns:

Type Description

pandas.Series: series of realization name (Ensemble.index) and phi values

Note

The ObservationEnsemble.pst.weights can be updated prior to calling this method to evaluate new weighting strategies

pst = pst instance-attribute

pyemu.Pst: control file instance

add_base()

add the control file obsval values as a realization

Note

replaces the last realization with the current ObservationEnsemble.pst.observation_data.obsval values as a new realization named "base"

the PEST++ enemble tools will add this realization also if you dont wanna fool with it here...

as_pyemu_matrix(typ=None)

get a pyemu.Matrix instance of Ensemble

Parameters:

Name Type Description Default
typ `pyemu.Matrix` or `pyemu.Cov`

the type of matrix to return. Default is pyemu.Matrix

None

Returns:

Type Description

pyemu.Matrix: a matrix instance

Example::

oe = pyemu.ObservationEnsemble.from_gaussian_draw(pst=pst,num_reals=100)
dev_mat = oe.get_deviations().as_pyemu_matrix(typ=pyemu.Cov)
obscov = dev_mat.T * dev_mat

back_transform()

back transform parameters with respect to partrans value.

Note

operates in place (None is returned).

Parameter transform is only related to log_{10} and does not include the effects of scale and/or offset

Ensemble.back_transform() is only provided for inheritance purposes. It only changes theEnsemble._transformed` flag

copy()

get a copy of Ensemble

Returns:

Type Description

Ensemble: copy of this Ensemble

Note

copies both Ensemble.pst and Ensemble._df: can be expensive

covariance_matrix(localizer=None, center_on=None)

get a empirical covariance matrix implied by the correlations between realizations

Parameters:

Name Type Description Default
localizer `pyemu.Matrix`

a matrix to localize covariates in the resulting covariance matrix. Default is None

None
center_on `str`

a realization name to use as the centering point in ensemble space. If None, the mean vector is treated as the centering point. Default is None

None

Returns:

Type Description

pyemu.Cov: the empirical (and optionally localized) covariance matrix

draw_new_ensemble(num_reals, include_noise=True, noise_reals=None, rng=None)

Draw a new (potentially larger) ObservationEnsemble instance using the realizations in self.

Parameters:

Name Type Description Default
num_reals (int)

number of realizations to generate

required
include_noise varies

a bool or a float the describes the standard deviation of noise to add to the new realizations. This is to help with the issue of under-varied new realizations resulting from npar >> nreals in self. If True, The standard devation is set to one over the square root on number of reals in self.

True
noise_reals ObservationEnsemble

other existing realizations (likely prior realizations) that are used as noise realizations in place of IID noise that is used if include_noise is True and noise_reals is None.

None
rng RandomState

random number generator if not using default from pyemu.en

None

Returns ObservationEnsemble

Note

any zero weighted observations in self are omitted in the returned ObservationEnsemble

dropna(*args, **kwargs)

override of pandas.DataFrame.dropna()

Parameters:

Name Type Description Default
*args ([`object`]

positional arguments to pass to pandas.DataFrame.dropna().

required
**kwargs ({`str`

object}): keyword arguments to pass to pandas.DataFrame.dropna().

required

from_binary(pst, filename) classmethod

create an Ensemble from a PEST-style binary file

Parameters:

Name Type Description Default
pst `pyemu.Pst`

a control file instance

required
filename `str`

filename containing binary ensemble

required

Returns:

Type Description

Ensemble: the ensemble loaded from the binary file

Example::

pst = pyemu.Pst("my.pst")
oe = pyemu.ObservationEnsemble.from_binary("obs.jcb")

from_csv(pst, filename, *args, **kwargs) classmethod

create an Ensemble from a CSV file

Parameters:

Name Type Description Default
pst `pyemu.Pst`

a control file instance

required
filename `str`

filename containing CSV ensemble

required
*args ([`object`]

positional arguments to pass to pandas.read_csv().

required
**kwargs ({`str`

object}): keyword arguments to pass to pandas.read_csv().

required

Returns: Ensemble Note: uses pandas.read_csv() to load numeric values from CSV file

Example::

pst = pyemu.Pst("my.pst")
oe = pyemu.ObservationEnsemble.from_csv("obs.csv")

from_gaussian_draw(pst, cov=None, num_reals=100, by_groups=True, fill=False, factor='cholesky', rng=None) classmethod

generate an ObservationEnsemble from a (multivariate) gaussian distribution

Parameters:

Name Type Description Default
pst `pyemu.Pst`

a control file instance.

required
cov `pyemu.Cov`

a covariance matrix describing the second moment of the gaussian distribution. If None, cov is generated from the non-zero-weighted observation weights in pst. Only observations listed in cov are sampled. Other observations are assigned the obsval value from pst.

None
num_reals `int`

number of stochastic realizations to generate. Default is 100

100
by_groups `bool`

flag to generate realzations be observation group. This assumes no correlation (covariates) between observation groups.

True
fill `bool`

flag to fill in zero-weighted observations with control file values. Default is False.

False
factor `str`

how to factorize cov to form the projection matrix. Can be "eigen", "svd", or "cholesky. The "cholesky" option is default and is faster. But for (nearly) singular cov matrices (such as those generated empirically from ensembles), "svd" and/or "eigen" might be required. Ignored for diagonal cov.

'cholesky'
rng `numpy.random.RandomState`

random number generator if not using default from pyemu.en

None

Returns:

Type Description

ObservationEnsemble: the realized ObservationEnsemble instance

Note

Only observations named in cov are sampled. Additional, cov is processed prior to sampling to only include non-zero-weighted observations depending on the value of fill. So users must take care to make sure observations have been assigned non-zero weights even if cov is being passed

The default cov is generated from pyemu.Cov.from_observation_data, which assumes observation noise standard deviations are the inverse of the weights listed in pst

Example::

pst = pyemu.Pst("my.pst")
# the easiest way - just relying on weights in pst
oe1 = pyemu.ObservationEnsemble.from_gaussian_draw(pst)

# generate the cov explicitly
cov = pyemu.Cov.from_observation_data(pst)
oe2 = pyemu.ObservationEnsemble.from_gaussian_draw(pst,cov=cov)

# give all but one observation zero weight.  This will
# result in an oe with only one randomly sampled observation noise
# vector since the cov is processed to remove any zero-weighted
# observations before sampling
pst.observation_data.loc[pst.nnz_obs_names[1:],"weight] = 0.0
oe3 = pyemu.ObservationEnsemble.from_gaussian_draw(pst,cov=cov)

get_deviations(center_on=None)

get the deviations of the realizations around a certain point in ensemble space

Parameters:

Name Type Description Default
center_on `str`

a realization name to use as the centering point in ensemble space. If None, the mean vector is treated as the centering point. Default is None

None

Returns:

Type Description

Ensemble: an ensemble of deviations around the centering point

Note

deviations are the Euclidean distances from the center_on value to realized values for each column

center_on=None yields the classic ensemble smoother/ensemble Kalman filter deviations from the mean vector

Deviations respect log-transformation status.

Example::

pst = pyemu.Pst("my.pst")
oe = pyemu.ObservationEnsemble.from_gaussian_draw(pst)
oe.add_base()
oe_dev = oe.get_deviations(center_on="base")
oe.to_csv("obs_base_devs.csv")

plot(bins=10, facecolor='0.5', plot_cols=None, filename='ensemble.pdf', func_dict=None, **kwargs)

plot ensemble histograms to multipage pdf

Parameters:

Name Type Description Default
bins `int`

number of bins for the histograms

10
facecolor `str`

matplotlib color (e.g. r,g, etc)

'0.5'
plot_cols [`str`]

list of subset of ensemble columns to plot. If None, all are plotted. Default is None

None
filename `str`

multipage pdf filename. Default is "ensemble.pdf"

'ensemble.pdf'
func_dict `dict`

a dict of functions to apply to specific columns. For example: {"par1": np.log10}

None
**kwargs `dict`

addkeyword args to pass to pyemu.plot_utils.ensemble_helper()

{}

Example::

pst = pyemu.Pst("my.pst")
pe = pyemu.ParameterEnsemble.from_gaussian_draw(pst)
pe.transform() # plot log space (if needed)
pe.plot(bins=30)

reseed() staticmethod

reset the pyemu.en.rng local random generator

Note

reseeds using the pyemu.en.SEED global variable

The pyemu.en.SEED value is used to initialize the rng on import, so make sure you know what you are doing if you call this method...

to_binary(filename)

write Ensemble to a PEST-style binary file

Parameters:

Name Type Description Default
filename `str` or `Path`

file to write

required

Returns:

Type Description

str: the filename written (may be modified from input)

Example::

pst = pyemu.Pst("my.pst")
oe = pyemu.ObservationEnsemble.from_gaussian_draw(pst)
oe.to_binary("obs.jcb")
Note

back transforms ParameterEnsemble before writing so that values are in arithmetic space

to_csv(filename, *args, **kwargs)

write Ensemble to a CSV file

Parameters:

Name Type Description Default
filename `str`

file to write

required
*args ([`object`]

positional arguments to pass to pandas.DataFrame.to_csv().

required
**kwargs ({`str`

object}): keyword arguments to pass to pandas.DataFrame.to_csv().

required

Example::

pst = pyemu.Pst("my.pst")
oe = pyemu.ObservationEnsemble.from_gaussian_draw(pst)
oe.to_csv("obs.csv")
Note

back transforms ParameterEnsemble before writing so that values are in arithmetic space

to_dense(filename)

write Ensemble to a dense-format binary file

Parameters:

Name Type Description Default
filename `str` or `Path`

file to write

required

Example::

pst = pyemu.Pst("my.pst")
oe = pyemu.ObservationEnsemble.from_gaussian_draw(pst)
oe.to_dense("obs.bin")
Note

back transforms ParameterEnsemble before writing so that values are in arithmetic space

transform()

transform parameters with respect to partrans value.

Note

operates in place (None is returned).

Parameter transform is only related to log_{10} and does not include the effects of scale and/or offset

Ensemble.transform() is only provided for inheritance purposes. It only changes theEnsemble._transformed` flag

ParameterEnsemble

Bases: Ensemble

Parameter ensembles in the PEST(++) realm

Parameters:

Name Type Description Default
pst `pyemu.Pst`

a control file instance

required
df `pandas.DataFrame`

a pandas dataframe. Columns should be parameter names. Index is treated as realization names

required
istransformed `bool`

flag to indicate parameter values are in log space (if partrans is "log" in pst)

False

Example::

pst = pyemu.Pst("my.pst")
pe = pyemu.ParameterEnsemble.from_gaussian_draw(pst)

adj_names property

the names of adjustable parameters in ParameterEnsemble

Returns:

Type Description

[str]: adjustable parameter names

fixed_indexer property

boolean indexer for non-adjustable parameters

Returns:

Type Description

numpy.ndarray(bool): boolean array indicating which parameters have

partrans equal to "log" or "fixed"

istransformed property

the parameter transformation status

Returns:

Type Description

bool: flag to indicate whether or not the ParameterEnsemble is

transformed with respect to log_{10}. Not used for (and has no effect

on) ObservationEnsemble.

Note

parameter transformation status is only related to log_{10} and does not include the effects of scale and/or offset

lbnd property

the lower bound vector while respecting current log transform status

Returns:

Type Description

pandas.Series: (log-transformed) lower parameter bounds listed in

ParameterEnsemble.pst.parameter_data.parlbnd

log_indexer property

boolean indexer for log transform

Returns:

Type Description

numpy.ndarray(bool): boolean array indicating which parameters are log

transformed

pst = pst instance-attribute

pyemu.Pst: control file instance

ubnd property

the upper bound vector while respecting current log transform status

Returns:

Type Description

pandas.Series: (log-transformed) upper parameter bounds listed in

ParameterEnsemble.pst.parameter_data.parubnd

add_base()

add the control file obsval values as a realization

Note

replaces the last realization with the current ParameterEnsemble.pst.parameter_data.parval1 values as a new realization named "base"

The PEST++ ensemble tools will add this realization also if you dont wanna fool with it here...

as_pyemu_matrix(typ=None)

get a pyemu.Matrix instance of Ensemble

Parameters:

Name Type Description Default
typ `pyemu.Matrix` or `pyemu.Cov`

the type of matrix to return. Default is pyemu.Matrix

None

Returns:

Type Description

pyemu.Matrix: a matrix instance

Example::

oe = pyemu.ObservationEnsemble.from_gaussian_draw(pst=pst,num_reals=100)
dev_mat = oe.get_deviations().as_pyemu_matrix(typ=pyemu.Cov)
obscov = dev_mat.T * dev_mat

back_transform()

back transform parameters with respect to partrans value.

Note

operates in place (None is returned).

Parameter transform is only related to log_{10} and does not include the effects of scale and/or offset

copy()

get a copy of Ensemble

Returns:

Type Description

Ensemble: copy of this Ensemble

Note

copies both Ensemble.pst and Ensemble._df: can be expensive

covariance_matrix(localizer=None, center_on=None)

get a empirical covariance matrix implied by the correlations between realizations

Parameters:

Name Type Description Default
localizer `pyemu.Matrix`

a matrix to localize covariates in the resulting covariance matrix. Default is None

None
center_on `str`

a realization name to use as the centering point in ensemble space. If None, the mean vector is treated as the centering point. Default is None

None

Returns:

Type Description

pyemu.Cov: the empirical (and optionally localized) covariance matrix

draw_new_ensemble(num_reals, include_noise=True, noise_reals=None, rng=None)

Draw a new (potentially larger) ParameterEnsemble instance using the realizations in self.

Parameters:

Name Type Description Default
num_reals (int)

number of realizations to generate

required
include_noise varies

a bool or a float the describes the standard deviation of noise to add to the new realizations. This is to help with the issue of under-varied new realizations resulting from npar >> nreals in self. If True, The standard devation is set to one over the square root on number of reals in self.

True
noise_reals ParameterEnsemble

other existing realizations (likely prior realizations) that are used as noise realizations in place of IID noise that is used if include_noise is True and noise_reals is None.

None
rng `numpy.random.RandomState`

random number generator if not using default from pyemu.en

None

Returns ParameterEnsemble

Note

any fixed and/or tied parameters in self are omitted in the returned ParameterEnsemble

dropna(*args, **kwargs)

override of pandas.DataFrame.dropna()

Parameters:

Name Type Description Default
*args ([`object`]

positional arguments to pass to pandas.DataFrame.dropna().

required
**kwargs ({`str`

object}): keyword arguments to pass to pandas.DataFrame.dropna().

required

enforce(how='reset', bound_tol=0.0)

entry point for bounds enforcement.

Parameters:

Name Type Description Default
enforce_bounds `str`

can be 'reset' to reset offending values or 'drop' to drop offending realizations. Default is "reset"

required
Note

In very high dimensions, the "drop" and "scale" how types will result in either very few realizations or very short realizations.

Example::

pst = pyemu.Pst("my.pst")
pe = pyemu.ParameterEnsemble.from_gaussian_draw()
pe.enforce(how="scale")
pe.to_csv("par.csv")

from_binary(pst, filename) classmethod

create an Ensemble from a PEST-style binary file

Parameters:

Name Type Description Default
pst `pyemu.Pst`

a control file instance

required
filename `str`

filename containing binary ensemble

required

Returns:

Type Description

Ensemble: the ensemble loaded from the binary file

Example::

pst = pyemu.Pst("my.pst")
oe = pyemu.ObservationEnsemble.from_binary("obs.jcb")

from_csv(pst, filename, *args, **kwargs) classmethod

create an Ensemble from a CSV file

Parameters:

Name Type Description Default
pst `pyemu.Pst`

a control file instance

required
filename `str`

filename containing CSV ensemble

required
*args ([`object`]

positional arguments to pass to pandas.read_csv().

required
**kwargs ({`str`

object}): keyword arguments to pass to pandas.read_csv().

required

Returns: Ensemble Note: uses pandas.read_csv() to load numeric values from CSV file

Example::

pst = pyemu.Pst("my.pst")
oe = pyemu.ObservationEnsemble.from_csv("obs.csv")

from_gaussian_draw(pst, cov=None, num_reals=100, by_groups=True, fill=True, factor='cholesky', rng=None) classmethod

generate a ParameterEnsemble from a (multivariate) (log) gaussian distribution

Parameters:

Name Type Description Default
pst `pyemu.Pst`

a control file instance.

required
cov `pyemu.Cov`

a covariance matrix describing the second moment of the gaussian distribution. If None, cov is generated from the bounds of the adjustable parameters in pst. the (log) width of the bounds is assumed to represent a multiple of the parameter standard deviation (this is the sigma_range argument that can be passed to pyemu.Cov.from_parameter_data).

None
num_reals `int`

number of stochastic realizations to generate. Default is 100

100
by_groups `bool`

flag to generate realizations be parameter group. This assumes no correlation (covariates) between parameter groups. For large numbers of parameters, this help prevent memories but is slower.

True
fill `bool`

flag to fill in fixed and/or tied parameters with control file values. Default is True.

True
factor `str`

how to factorize cov to form the projection matrix. Can be "eigen", "svd", or "cholesky". The "cholesky" option is default and is faster. But for (nearly) singular cov matrices (such as those generated empirically from ensembles), "svd" and/or "eigen" might be required. Ignored for diagonal cov.

'cholesky'
rng `numpy.random.RandomState`

random number generator if not using default from pyemu.en

None

Returns:

Type Description

ParameterEnsemble: the parameter ensemble realized from the gaussian

distribution

Note:

Only parameters named in `cov` are sampled. Missing parameters are assigned values of
`pst.parameter_data.parval1` along the corresponding columns of `ParameterEnsemble`
according to the value of `fill`.

The default `cov` is generated from `pyemu.Cov.from_observation_data`, which assumes
parameter bounds in `ParameterEnsemble.pst` represent some multiple of parameter
standard deviations.  Additionally, the default Cov only includes adjustable
parameters (`partrans` not "tied" or "fixed").

"tied" parameters are not sampled.

Example::

pst = pyemu.Pst("my.pst")
# the easiest way - just relying on weights in pst
pe1 = pyemu.ParameterEnsemble.from_gaussian_draw(pst)

# generate the cov explicitly with a sigma_range
cov = pyemu.Cov.from_parameter_data(pst,sigma_range=6)
[e2 = pyemu.ParameterEnsemble.from_gaussian_draw(pst,cov=cov)

from_mixed_draws(pst, how_dict, default='gaussian', num_reals=100, cov=None, sigma_range=6, enforce_bounds=True, partial=False, fill=True, rng=None) classmethod

generate a ParameterEnsemble using a mixture of distributions. Available distributions include (log) "uniform", (log) "triangular", and (log) "gaussian". log transformation is respected.

Parameters:

Name Type Description Default
pst `pyemu.Pst`

a control file

required
how_dict `dict`

a dictionary of parameter name keys and "how" values, where "how" can be "uniform","triangular", or "gaussian".

required
default `str`

the default distribution to use for parameter not listed in how_dict. Default is "gaussian".

'gaussian'
num_reals `int`

number of realizations to draw. Default is 100.

100
cov `pyemu.Cov`

an optional Cov instance to use for drawing from gaussian distribution. If None, and "gaussian" is listed in how_dict (and/or default), then a diagonal covariance matrix is constructed from the parameter bounds in pst (with sigma_range). Default is None.

None
sigma_range `float`

the number of standard deviations implied by the parameter bounds in the pst. Only used if "gaussian" is in how_dict (and/or default) and cov is None. Default is 6.

6
enforce_bounds `bool`

flag to enforce parameter bounds in resulting ParameterEnsemble. Only matters if "gaussian" is in values of how_dict. Default is True.

True
partial `bool`

flag to allow a partial ensemble (not all pars included). If True, parameters not name in how_dict will be sampled using the distribution named as default. Default is False.

False
fill `bool`

flag to fill in fixed and/or tied parameters with control file values. Default is True.

True
rng `numpy.random.RandomState`

random number generator if not using default from pyemu.en

None

Example::

pst = pyemu.Pst("pest.pst")
# uniform for the fist 10 pars
how_dict = {p:"uniform" for p in pst.adj_par_names[:10]}
pe = pyemu.ParameterEnsemble(pst,how_dict=how_dict)
pe.to_csv("my_mixed_pe.csv")

from_parfiles(pst, parfile_names, real_names=None) classmethod

create a parameter ensemble from PEST-style parameter value files. Accepts parfiles with less than the parameters in the control (get NaNs in the ensemble) or extra parameters in the parfiles (get dropped)

Parameters:

Name Type Description Default
pst `pyemu.Pst`

control file instance

required
parfile_names `[str`]

par file names

required
real_names `str`

optional list of realization names. If None, a single integer counter is used

None

Returns:

Type Description

ParameterEnsemble: parameter ensemble loaded from par files

from_triangular_draw(pst, num_reals=100, fill=True, rng=None) classmethod

generate a ParameterEnsemble from a (multivariate) (log) triangular distribution

Parameters:

Name Type Description Default
pst `pyemu.Pst`

a control file instance

required
num_reals `int`

number of realizations to generate. Default is 100

100
fill `bool`

flag to fill in fixed and/or tied parameters with control file values. Default is True.

True
rng `numpy.random.RandomState`

random number generator if not using default from pyemu.en

None

Returns:

Type Description

ParameterEnsemble: a parameter ensemble drawn from the multivariate (log) triangular

distribution defined by the parameter upper and lower bounds and initial parameter

values in pst

Note

respects transformation status in pst: fixed and tied parameters are not realized, log-transformed parameters are drawn in log space. The returned ParameterEnsemble is back transformed (not in log space)

uses pyemu.en.rng.triangular

Example::

pst = pyemu.Pst("my.pst")
pe = pyemu.ParameterEnsemble.from_triangular_draw(pst)
pe.to_csv("my_tri_pe.csv")

from_uniform_draw(pst, num_reals, fill=True, rng=None) classmethod

generate a ParameterEnsemble from a (multivariate) (log) uniform distribution

Parameters:

Name Type Description Default
pst `pyemu.Pst`

a control file instance

required
num_reals `int`

number of realizations to generate. Default is 100

required
fill `bool`

flag to fill in fixed and/or tied parameters with control file values. Default is True.

True
rng `numpy.random.RandomState`

random number generator if not using default from pyemu.en

None

Returns:

Type Description

ParameterEnsemble: a parameter ensemble drawn from the multivariate (log) uniform

distribution defined by the parameter upper and lower bounds pst

Note

respects transformation status in pst: fixed and tied parameters are not realized, log-transformed parameters are drawn in log space. The returned ParameterEnsemble is back transformed (not in log space)

uses pyemu.en.rng.uniform

Example::

pst = pyemu.Pst("my.pst")
pe = pyemu.ParameterEnsemble.from_uniform_draw(pst)
pe.to_csv("my_uni_pe.csv")

get_deviations(center_on=None)

get the deviations of the realizations around a certain point in ensemble space

Parameters:

Name Type Description Default
center_on `str`

a realization name to use as the centering point in ensemble space. If None, the mean vector is treated as the centering point. Default is None

None

Returns:

Type Description

Ensemble: an ensemble of deviations around the centering point

Note

deviations are the Euclidean distances from the center_on value to realized values for each column

center_on=None yields the classic ensemble smoother/ensemble Kalman filter deviations from the mean vector

Deviations respect log-transformation status.

Example::

pst = pyemu.Pst("my.pst")
oe = pyemu.ObservationEnsemble.from_gaussian_draw(pst)
oe.add_base()
oe_dev = oe.get_deviations(center_on="base")
oe.to_csv("obs_base_devs.csv")

plot(bins=10, facecolor='0.5', plot_cols=None, filename='ensemble.pdf', func_dict=None, **kwargs)

plot ensemble histograms to multipage pdf

Parameters:

Name Type Description Default
bins `int`

number of bins for the histograms

10
facecolor `str`

matplotlib color (e.g. r,g, etc)

'0.5'
plot_cols [`str`]

list of subset of ensemble columns to plot. If None, all are plotted. Default is None

None
filename `str`

multipage pdf filename. Default is "ensemble.pdf"

'ensemble.pdf'
func_dict `dict`

a dict of functions to apply to specific columns. For example: {"par1": np.log10}

None
**kwargs `dict`

addkeyword args to pass to pyemu.plot_utils.ensemble_helper()

{}

Example::

pst = pyemu.Pst("my.pst")
pe = pyemu.ParameterEnsemble.from_gaussian_draw(pst)
pe.transform() # plot log space (if needed)
pe.plot(bins=30)

project(projection_matrix, center_on=None, log=None, enforce_bounds='reset')

project the ensemble using the null-space Monte Carlo method

Parameters:

Name Type Description Default
projection_matrix `pyemu.Matrix`

null-space projection operator.

required
center_on `str`

the name of the realization to use as the centering point for the null-space differening operation. If center_on is None, the ParameterEnsemble mean vector is used. Default is None

None
log `pyemu.Logger`

for logging progress

None
enforce_bounds `str`

parameter bound enforcement option to pass to ParameterEnsemble.enforce(). Valid options are reset, drop, scale or None. Default is reset.

'reset'

Returns:

Type Description

ParameterEnsemble: untransformed, null-space projected ensemble.

Example::

ev = pyemu.ErrVar(jco="my.jco") #assumes my.pst exists
pe = pyemu.ParameterEnsemble.from_gaussian_draw(ev.pst)
pe_proj = pe.project(ev.get_null_proj(maxsing=25))
pe_proj.to_csv("proj_par.csv")

reseed() staticmethod

reset the pyemu.en.rng local random generator

Note

reseeds using the pyemu.en.SEED global variable

The pyemu.en.SEED value is used to initialize the rng on import, so make sure you know what you are doing if you call this method...

to_binary(filename)

write Ensemble to a PEST-style binary file

Parameters:

Name Type Description Default
filename `str` or `Path`

file to write

required

Returns:

Type Description

str: the filename written (may be modified from input)

Example::

pst = pyemu.Pst("my.pst")
oe = pyemu.ObservationEnsemble.from_gaussian_draw(pst)
oe.to_binary("obs.jcb")
Note

back transforms ParameterEnsemble before writing so that values are in arithmetic space

to_csv(filename, *args, **kwargs)

write Ensemble to a CSV file

Parameters:

Name Type Description Default
filename `str`

file to write

required
*args ([`object`]

positional arguments to pass to pandas.DataFrame.to_csv().

required
**kwargs ({`str`

object}): keyword arguments to pass to pandas.DataFrame.to_csv().

required

Example::

pst = pyemu.Pst("my.pst")
oe = pyemu.ObservationEnsemble.from_gaussian_draw(pst)
oe.to_csv("obs.csv")
Note

back transforms ParameterEnsemble before writing so that values are in arithmetic space

to_dense(filename)

write Ensemble to a dense-format binary file

Parameters:

Name Type Description Default
filename `str` or `Path`

file to write

required

Example::

pst = pyemu.Pst("my.pst")
oe = pyemu.ObservationEnsemble.from_gaussian_draw(pst)
oe.to_dense("obs.bin")
Note

back transforms ParameterEnsemble before writing so that values are in arithmetic space

transform()

transform parameters with respect to partrans value.

Note

operates in place (None is returned).

Parameter transform is only related to log_{10} and does not include the effects of scale and/or offset

Pst

Bases: object

All things PEST(++) control file

Parameters:

Name Type Description Default
filename `str`

the name of the control file

required
load `bool`

flag to load the control file. Default is True

True
resfile `str`

corresponding residual file. If None, a residual file with the control file base name is sought. Default is None

None
Note

This class is the primary mechanism for dealing with PEST control files. Support is provided for constructing new control files as well as manipulating existing control files.

Example::

pst = pyemu.Pst("my.pst")
pst.control_data.noptmax = -1
pst.write("my_new.pst")

adj_par_groups property

get the parameter groups with at least one adjustable parameter

Returns:

Type Description

[str]: a list of parameter groups with

at least one adjustable parameter

adj_par_names property

get the adjustable (not fixed or tied) parameter names

Returns:

Type Description

[str]: list of adjustable (not fixed or tied)

parameter names

control_data = ControlData() instance-attribute

pyemu.pst.pst_controldata.ControlData: '* control data' information.
Access with standard PEST variable names

Example::

pst.control_data.noptmax = 2
pst.control_data.pestmode = "estimation"

estimation property

check if the control_data.pestmode is set to estimation

Returns:

Type Description

bool: True if control_data.pestmode is estmation, False otherwise

forecast_names property

get the forecast names from the pestpp options (if any). Returns None if no forecasts are named

Returns:

Type Description

[str]: a list of forecast names.

greater_than_obs_constraints property

get the names of the observations that are listed as active (non-zero weight) greater than inequality constraints.

Returns:

Type Description

pandas.Series: names observations that are non-zero weighted

greater than constraints (obgnme startsiwth "g_" or "greater")

Note

Zero-weighted obs are skipped

greater_than_pi_constraints property

get the names of the prior information eqs that are listed as active (non-zero weight) greater than inequality constraints.

Returns:

Type Description

pandas.Series names of prior information that are non-zero weighted

greater than constraints (obgnme startsiwth "g_" or "greater")

Note

Zero-weighted pi are skipped

input_files property

list of model input file names

Returns:

Type Description

[str]: a list of model input file names, extracted from Pst.model_input_data.model_file. Returns None if this attribute is None

Note

Use Pst.model_input_data to access the template-input file information for writing/modification

instruction_files property

list of instruction file names Returns: [str]: a list of instruction file names, extracted from Pst.model_output_data.pest_file. Returns None if this attribute is None

Note

Use Pst.model_output_data to access the instruction-output file information for writing/modification

less_than_obs_constraints property

get the names of the observations that are listed as active (non-zero weight) less than inequality constraints.

Returns:

Type Description

pandas.Series: names of observations that are non-zero weighted less

than constraints (obgnme starts with 'l_' or "less")

Note

Zero-weighted obs are skipped

less_than_pi_constraints property

get the names of the prior information eqs that are listed as active (non-zero weight) less than inequality constraints.

Returns:

Type Description

pandas.Series: names of prior information that are non-zero weighted

less than constraints (obgnme starts with "l_" or "less")

Note

Zero-weighted pi are skipped

nnz_obs property

get the number of non-zero weighted observations

Returns:

Type Description

int: the number of non-zeros weighted observations

nnz_obs_groups property

get the observation groups that contain at least one non-zero weighted observation

Returns:

Type Description

[str]: a list of observation groups that contain at

least one non-zero weighted observation

nnz_obs_names property

get the non-zero weight observation names

Returns:

Type Description

[str]: a list of non-zero weighted observation names

nobs property

get the number of observations

Returns:

Type Description

int: the number of observations

npar property

get number of parameters

Returns:

Type Description

int: the number of parameters

npar_adj property

get the number of adjustable parameters (not fixed or tied)

Returns:

Type Description

int: the number of adjustable parameters

nprior property

number of prior information equations

Returns:

Type Description

int: the number of prior info equations

obs_groups property

get the observation groups

Returns:

Type Description

[str]: a list of unique observation groups

obs_names property

get the observation names

Returns:

Type Description

[str]: a list of observation names

observation_data = None instance-attribute

pandas.DataFrame: '* observation data' information. Columns are standard PEST variable names

Example::

pst.observation_data.loc[:,"weight"] = 1.0
pst.observation_data.loc[:,"obgnme"] = "obs_group"

output_files property

list of model output file names Returns: [str]: a list of model output file names, extracted from Pst.model_output_data.model_file. Returns None if this attribute is None

Note

Use Pst.model_output_data to access the instruction-output file information for writing/modification

par_groups property

get the parameter groups

Returns:

Type Description

[str]: a list of parameter groups

par_names property

get the parameter names

Returns:

Type Description

[str]: a list of parameter names

parameter_data = None instance-attribute

pandas.DataFrame: '* parameter data' information. Columns are standard PEST variable names

Example::

pst.parameter_data.loc[:,"partrans"] = "log"
pst.parameter_data.loc[:,"parubnd"] = 10.0

phi property

get the weighted total objective function.

Returns:

Type Description

float: sum of squared residuals

Note

Requires Pst.res (the residuals file) to be available

phi_components property

get the individual components of the total objective function

Returns:

Type Description

dict: dictionary of observation group, contribution to total phi

Note

Requires Pst.res (the residuals file) to be available

phi_components_normalized property

get the individual components of the total objective function normalized to the total PHI being 1.0

Returns:

Type Description

dict: dictionary of observation group,

normalized contribution to total phi

Note

Requires Pst.res (the residuals file) to be available

prior_groups property

get the prior info groups

Returns:

Type Description

[str]: a list of prior information groups

prior_information = None instance-attribute

pandas.DataFrame: '* prior information' data. Columns are standard PEST variable names

prior_names property

get the prior information names

Returns:

Type Description

[str]: a list of prior information names

reg_data = RegData() instance-attribute

pyemu.pst.pst_controldata.RegData: '* regularization' section information. Access with standard PEST variable names.

Example::

pst.reg_data.phimlim = 1.00 #yeah right!

res property

get the residuals dataframe attribute

Returns:

Type Description

pandas.DataFrame: a dataframe containing the

residuals information.

Note

if the Pst.__res attribute has not been loaded, this call loads the res dataframe from a file

Example::

# print the observed and simulated values for non-zero weighted obs
print(pst.res.loc[pst.nnz_obs_names,["modelled","measured"]])

svd_data = SvdData() instance-attribute

pyemu.pst.pst_controldata.SvdData: '* singular value decomposition' section information.
Access with standard PEST variable names

Example::

pst.svd_data.maxsing = 100

template_files property

list of template file names

Returns:

Type Description

[str]: a list of template file names, extracted from Pst.model_input_data.pest_file. Returns None if this attribute is None

Note

Use Pst.model_input_data to access the template-input file information for writing/modification

tied property

list of tied parameter names

Returns:

Type Description

pandas.DataFrame: a dataframe of tied parameter information.

Columns of tied are parnme and partied. Returns None if

no tied parameters are found.

zero_weight_obs_names property

get the zero-weighted observation names

Returns:

Type Description

[str]: a list of zero-weighted observation names

__reset_weights(target_phis, res_idxs, obs_idxs)

private method to reset weights based on target phi values for each group. This method should not be called directly

Parameters:

Name Type Description Default
target_phis `dict`

target phi contribution for groups to reweight

required
res_idxs `dict`

the index positions of each group of interest in the res dataframe

required
obs_idxs `dict`

the index positions of each group of interest in the observation data dataframe

required

add_observations(ins_file, out_file=None, pst_path=None, inschek=True)

add new observations to a control file

Parameters:

Name Type Description Default
ins_file `str`

instruction file with exclusively new observation names

required
out_file `str`

model output file. If None, then ins_file.replace(".ins","") is used. Default is None

None
pst_path `str`

the path to append to the instruction file and out file in the control file. If not None, then any existing path in front of the template or in file is split off and pst_path is prepended. If python is being run in a directory other than where the control file will reside, it is useful to pass pst_path as .. Default is None

None
inschek `bool`

flag to try to process the existing output file using the pyemu.InstructionFile class. If successful, processed outputs are used as obsvals

True

Returns:

Type Description

pandas.DataFrame: the data for the new observations that were added

Note

populates the new observation information with default values

Example::

pst = pyemu.Pst(os.path.join("template", "my.pst"))
pst.add_observations(os.path.join("template","new_obs.dat.ins"), pst_path=".")
pst.write(os.path.join("template", "my_new.pst")

add_parameters(template_file, in_file=None, pst_path=None)

add new parameters to an existing control file

Parameters:

Name Type Description Default
template_file `str`

template file with (possibly) some new parameters

required
in_file `str`

model input file. If None, template_file.replace('.tpl','') is used. Default is None.

None
pst_path `str`

the path to append to the template_file and in_file in the control file. If not None, then any existing path in front of the template or in file is split off and pst_path is prepended. If python is being run in a directory other than where the control file will reside, it is useful to pass pst_path as .. Default is None

None

Returns:

Type Description

pandas.DataFrame: the data for the new parameters that were added.

If no new parameters are in the new template file, returns None

Note

populates the new parameter information with default values

Example::

pst = pyemu.Pst(os.path.join("template","my.pst"))
pst.add_parameters(os.path.join("template","new_pars.dat.tpl",pst_path=".")
pst.write(os.path.join("template","my_new.pst")

add_pars_as_obs(pst_path='.', par_sigma_range=4, name_prefix='')

add all parameter values as observation values by creating a new template and instruction file and adding them to the control file

Parameters:

Name Type Description Default
pst_path str

the path to the control file from where python is running. Default is "." (python is running in the same directory as the control file)

'.'
par_sigma_range int

number of standard deviations implied by the distance between the parameter bounds. Used to set the weights for the range observations

4
name_prefix str

a tag to prepend to the observation names and the cooresponding I/O files

''

Returns: DataFrame: info for the new observations

add_pi_equation(par_names, pilbl=None, rhs=0.0, weight=1.0, obs_group='pi_obgnme', coef_dict={})

a helper to construct a new prior information equation.

Parameters:

Name Type Description Default
par_names [`str`]

parameter names in the equation

required
pilbl `str`

name to assign the prior information equation. If None, a generic equation name is formed. Default is None

None
rhs `float`

the right-hand side of the pi equation

0.0
weight `float`

the weight of the equation

1.0
obs_group `str`

the observation group for the equation. Default is 'pi_obgnme'

'pi_obgnme'
coef_dict `dict`

a dictionary of parameter name, coefficient pairs to assign leading coefficients for one or more parameters in the equation. If a parameter is not listed, 1.0 is used for its coefficients. Default is {}

{}

Example::

pst = pyemu.Pst("pest.pst")
# add a pi equation for the first adjustable parameter
pst.add_pi_equation(pst.adj_par_names[0],pilbl="pi1",rhs=1.0)
# add a pi equation for 1.5 times the 2nd and 3 times the 3rd adj pars to sum together to 2.0
names = pst.adj_par_names[[1,2]]
pst.add_pi_equation(names,coef_dict={names[0]:1.5,names[1]:3})

add_transform_columns(include_offset_and_scale=False)

add transformed values to the Pst.parameter_data attribute

Parameters:

Name Type Description Default
include_offset_and_scale bool

flag to apply the scale and offset values before

False
Note

adds parval1_trans, parlbnd_trans and parubnd_trans to Pst.parameter_data

Example::

pst = pyemu.Pst("pest.pst")
pst.add_transform_columns()
print(pst.parameter_data.parval1_trans

adjust_weights(obs_dict=None, obsgrp_dict=None)

reset the weights of observations or observation groups to contribute a specified amount to the composite objective function

Parameters:

Name Type Description Default
obs_dict `dict`

dictionary of observation name,new contribution pairs

None
obsgrp_dict `dict`

dictionary of obs group name,contribution pairs

None
Notes

If a group is assigned a contribution of 0, all observations in that group will be assigned zero weight.

If a group is assigned a nonzero contribution AND all observations in that group start with zero weight, the observations will be assigned weight of 1.0 to allow balancing.

If groups obsgrp_dict is not passed, all nonzero

Example::

pst = pyemu.Pst("my.pst")

# adjust a single observation
pst.adjust_weights(obs_dict={"obs1":10})

# adjust a single observation group
pst.adjust_weights(obsgrp_dict={"group1":100.0})

# make all non-zero weighted groups have a contribution of 100.0
balanced_groups = {grp:100 for grp in pst.nnz_obs_groups}
pst.adjust_weights(obsgrp_dict=balanced_groups)

adjust_weights_discrepancy(resfile=None, original_ceiling=True, bygroups=False)

adjusts the weights of each non-zero weight observation based on the residual in the pest residual file so each observations contribution to phi is 1.0 (e.g. Mozorov's discrepancy principal)

Parameters:

Name Type Description Default
resfile `str`

residual file name. If None, try to use a residual file with the Pst case name. Default is None

None
original_ceiling `bool`

flag to keep weights from increasing - this is generally a good idea. Default is True

True
bygroups `bool`

flag to adjust weights by groups. If False, the weight of each non-zero weighted observation is adjusted individually. If True, intergroup weighting is preserved (the contribution to each group is used) but this may result in some strangeness if some observations in a group have a really low phi already.

False

Example::

pst = pyemu.Pst("my.pst")
print(pst.phi) #assumes "my.res" is colocated with "my.pst"
pst.adjust_weights_discrepancy()
print(pst.phi) # phi should equal number of non-zero observations

bounds_report(iterations=None)

report how many parameters are at bounds. If ensemble, the base enbsemble member is evaluated

Parameters:

Name Type Description Default
iterations [`int`]

a list of iterations for which a bounds report is requested If None, all iterations for which par files are located are reported. Default is None

None

Returns:

Type Description

df: a pandas DataFrame object with rows being parameter groups and columns _num_at_ub, _num_at_lb, and _total_at_bounds row 0 is total at bounds, subsequent rows correspond with groups

Example

pst = pyemu.Pst("my.pst") df = pst.bound_report(iterations=[0,2,3])

build_increments()

experimental method to calculate parameter increments for use in the finite difference perturbation calculations

Note

user beware!

calculate_perturbations()

experimental method to calculate finite difference parameter perturbations.

Note:

The perturbation values are added to the
`Pst.parameter_data` attribute - user beware!

dialate_par_bounds(dialate_factor, center=True)

increase the distance between the parameter bounds while respecting the log transformation status

Parameters:

Name Type Description Default
dialate_factor varies

a factor to increase the distance between parameter bounds. Can be a float or a dict of str-float pars.

required
center bool

flag to dialate from the center point between the bounds. If False, then the dialation is from the parval1 values. Beware that using center False can have produce the some strange results...

True

drop_observations(ins_file, pst_path=None)

remove observations in an instruction file from the control file

Parameters:

Name Type Description Default
ins_file `str`

instruction file to remove

required
pst_path `str`

the path to append to the instruction file in the control file. If not None, then any existing path in front of the instruction is split off and pst_path is prepended. If python is being run in a directory other than where the control file will reside, it is useful to pass pst_path as .. Default is None

None

Returns:

Type Description

pandas.DataFrame: the observation data for the observations that were removed.

Example::

pst = pyemu.Pst(os.path.join("template", "my.pst"))
pst.remove_observations(os.path.join("template","some_obs.dat.ins"), pst_path=".")
pst.write(os.path.join("template", "my_new_with_less_obs.pst")

drop_parameters(tpl_file, pst_path=None)

remove parameters in a template file from the control file

Parameters:

Name Type Description Default
tpl_file `str`

template file to remove

required
pst_path `str`

the path to append to the template file in the control file. If not None, then any existing path in front of the template or in file is split off and pst_path is prepended. If python is being run in a directory other than where the control file will reside, it is useful to pass pst_path as .. Default is None

None

Returns:

Type Description

pandas.DataFrame: the parameter data for the parameters that were removed.

Note

This method does not check for multiple occurrences of the same parameter name(s) in across template files so if you have the same parameter in multiple template files, this is not the method you are looking for

Example::

pst = pyemu.Pst(os.path.join("template", "my.pst"))
pst.remove_parameters(os.path.join("template","boring_zone_pars.dat.tpl"), pst_path=".")
pst.write(os.path.join("template", "my_new_with_less_pars.pst")

enforce_bounds()

enforce bounds violation

Note

cheap enforcement of simply bringing violators back in bounds

Example::

pst = pyemu.Pst("pest.pst")
pst.parrep("random.par")
pst.enforce_bounds()
pst.write("pest_rando.pst")

from_io_files(tpl_files, in_files, ins_files, out_files, pst_filename=None, pst_path=None) classmethod

create a Pst instance from model interface files.

Parameters:

Name Type Description Default
tpl_files [`str`]

list of template file names

required
in_files [`str`]

list of model input file names (pairs with template files)

required
ins_files [`str`]

list of instruction file names

required
out_files [`str`]

list of model output file names (pairs with instruction files)

required
pst_filename `str`

name of control file to write. If None, no file is written. Default is None

None
pst_path str

the path from the control file to the IO files. For example, if the control will be in the same directory as the IO files, then pst_path should be '.'. Default is None, which doesn't do any path manipulation on the I/O file names

None

Returns:

Type Description

Pst: new control file instance with parameter and observation names

found in tpl_files and ins_files, respectively.

Note

calls pyemu.helpers.pst_from_io_files()

Assigns generic values for parameter info. Tries to use INSCHEK to set somewhat meaningful observation values

all file paths are relatively to where python is running.

Example::

tpl_files = ["my.tpl"]
in_files = ["my.in"]
ins_files = ["my.ins"]
out_files = ["my.out"]
pst = pyemu.Pst.from_io_files(tpl_files,in_files,ins_files,out_files)
pst.control_data.noptmax = 0
pst.write("my.pst)

from_par_obs_names(par_names=['par1'], obs_names=['obs1']) classmethod

construct a shell Pst instance from parameter and observation names

Parameters:

Name Type Description Default
par_names [`str`]

list of parameter names. Default is [par1]

['par1']
obs_names [`str`]

list of observation names. Default is [obs1]

['obs1']
Note

While this method works, it does not make template or instruction files. Users are encouraged to use Pst.from_io_files() for more usefulness

Example::

par_names = ["par1","par2"]
obs_names = ["obs1","obs2"]
pst = pyemu.Pst.from_par_obs_names(par_names,obs_names)

get(par_names=None, obs_names=None)

get a new pst object with subset of parameters and/or observations

Parameters:

Name Type Description Default
par_names [`str`]

a list of parameter names to have in the new Pst instance. If None, all parameters are in the new Pst instance. Default is None

None
obs_names [`str`]

a list of observation names to have in the new Pst instance. If None, all observations are in the new Pst instance. Default is None

None

Returns:

Type Description

Pst: a new Pst instance

Note

passing par_names as None and obs_names as None effectively generates a copy of the current Pst

Does not modify model i/o files - this is just a method for performing pyemu operations

Example::

pst = pyemu.Pst("pest.pst")
new_pst = pst.get(pst.adj_par_names[0],pst.obs_names[:10])

get_adj_pars_at_bounds(frac_tol=0.01)

get list of adjustable parameter at/near bounds

Parameters:

Name Type Description Default
frac_tol 'float`

fractional tolerance of distance to bound. For upper bound, the value parubnd * (1-frac_tol) is used, lower bound uses parlbnd * (1.0 + frac_tol)

0.01

Returns:

Type Description

tuple containing:

  • [str]: list of parameters at/near lower bound
  • [str]: list of parameters at/near upper bound

Example::

pst = pyemu.Pst("pest.pst")
at_lb,at_ub = pst.get_adj_pars_at_bounds()
print("pars at lower bound",at_lb)

get_par_change_limits()

calculate the various parameter change limits used in pest.

Returns:

Type Description

pandas.DataFrame: a copy of Pst.parameter_data

with columns for relative and factor change limits

Note:

does not yet support absolute parameter change limits!

Works in control file values space (not log transformed space).  Also
adds columns for effective upper and lower which account for par bounds and the
value of parchglim

example::

pst = pyemu.Pst("pest.pst")
df = pst.get_par_change_limits()
print(df.chg_lower)

get_res_stats(nonzero=True)

get some common residual stats by observation group.

Parameters:

Name Type Description Default
nonzero `bool`

calculate stats using only nonzero-weighted observations. This may seem obsvious to most users, but you never know....

True

Returns:

Type Description

pd.DataFrame: a dataframe with columns for groups names and indices of statistic name.

Note

Stats are derived from the current obsvals, weights and grouping in Pst.observation_data and the modelled values in Pst.res. The key here is 'current' because if obsval, weights and/or groupings have changed in Pst.observation_data since the residuals file was generated then the current values for obsval, weight and group are used

the normalized RMSE is normalized against the obsval range (max - min)

Example::

pst = pyemu.Pst("pest.pst")
stats_df = pst.get_res_stats()
print(stats_df.loc["mae",:])

load(filename, parse_metadata=True)

entry point load the pest control file.

Parameters:

Name Type Description Default
filename `str`

pst filename

required
Note

This method is called from the Pst constructor unless the load arg is False.

parrep(parfile=None, enforce_bounds=True, real_name=None, noptmax=0, binary_ens_file=False)

replicates the pest parrep util. replaces the parval1 field in the parameter data section dataframe with values in a PEST parameter file or a single realization from an ensemble parameter csv file

Parameters:

Name Type Description Default
parfile `str`

parameter file to use. If None, try to find and use a parameter file that corresponds to the case name. If parfile has extension '.par' a single realization parameter file is used If parfile has extension '.csv' an ensemble parameter file is used which invokes real_name If parfile has extension '.jcb' a binary ensemble parameter file is used which invokes real_name Default is None

None
enforce_bounds `bool`

flag to enforce parameter bounds after parameter values are updated. This is useful because PEST and PEST++ round the parameter values in the par file, which may cause slight bound violations. Default is True

True
real_name `str` or `int`

name of the ensemble realization to use for updating the parval1 value in the parameter data section dataframe. If None, try using "base". If "base" not present, use the real_name with smallest index number. Ignored if parfile is of the PEST parameter file format (e.g. not en ensemble)

None
noptmax `int`

Value with which to update the pst.control_data.noptmax value Default is 0.

0
binary_ens_file `bool`

If True, use binary format to load ensemble file, else assume it's a CSV file

False

Example::

pst = pyemu.Pst("pest.pst")
pst.parrep("pest.1.base.par")
pst.control_data.noptmax = 0
pst.write("pest_1.pst")

plot(kind=None, **kwargs)

method to plot various parts of the control. This is sweet as!

Parameters:

Name Type Description Default
kind `str`

options are 'prior' (prior parameter histograms, '1to1' (line of equality and sim vs res), 'obs_v_sim' (time series using datetime suffix), 'phi_pie' (pie chart of phi components)

None
kwargs `dict`

optional args for plots that are passed to pyemu plot helpers and ultimately to matplotlib

{}
Note

Depending on 'kind' argument, a multipage pdf is written

Example::

pst = pyemu.Pst("my.pst")
pst.plot(kind="1to1") # requires Pst.res
pst.plot(kind="prior")
pst.plot(kind="phi_pie")

process_output_files(pst_path='.')

processing the model output files using the instruction files and existing model output files.

Parameters:

Name Type Description Default
pst_path `str`

relative path from where python is running to where the control file, instruction files and model output files are located. Default is "." (current python directory)

'.'

Returns:

Type Description

pandas.Series: model output values

Note

requires a complete set of model input files at relative path from where python is running to pst_path

Example::

pst = pyemu.Pst("pest.pst")
obsvals = pst.process_output_files()
print(obsvals)

proportional_weights(fraction_stdev=1.0, wmax=100.0, leave_zero=True)

setup weights inversely proportional to the observation value

Parameters:

Name Type Description Default
fraction_stdev `float`

the fraction portion of the observation val to treat as the standard deviation. set to 1.0 for inversely proportional. Default is 1.0

1.0
wmax `float`

maximum weight to allow. Default is 100.0

100.0
leave_zero `bool`

flag to leave existing zero weights. Default is True

True

Example::

pst = pyemu.Pst("pest.pst")
# set the weights of the observations to 20% of the observed value
pst.proportional_weights(fraction_stdev=0.2,wmax=10)
pst.write("pest_propo.pst")

rectify_pgroups()

synchronize parameter groups section with the parameter data section

Note

This method is called during Pst.write() to make sure all parameter groups named in * parameter data are included. This is so users don't have to manually keep this section up. This method can also be called during control file modifications to see what parameter groups are present and prepare for modifying the default values in the * parameter group section

Example::

pst = pyemu.Pst("my.pst")
pst.parameter_data.loc["par1","pargp"] = "new_group"
pst.rectify_groups()
pst.parameter_groups.loc["new_group","derinc"] = 1.0

rectify_pi()

rectify the prior information equation with the current state of the parameter_data dataframe.

Note

Equations that list fixed, tied or missing parameters are removed completely even if adjustable parameters are also listed in the equation. This method is called during Pst.write()

rename_observations(name_dict, pst_path='.', insmap=None)

rename observations in the control and instruction files

Parameters:

Name Type Description Default
name_dict `dict`

mapping of current to new names.

required
pst_path str

the path to the control file from where python is running. Default is "." (python is running in the same directory as the control file)

'.'
Note

This does a lot of string compare, so its gonna be slow as...

Example::

pst = pyemu.Pst(os.path.join("template","pest.pst"))
name_dict = {"obs1":"obs1_better_name"}
pst.rename_observations(name_dict,pst_path="template")

rename_parameters(name_dict, pst_path='.', tplmap=None)

rename parameters in the control and template files

Parameters:

Name Type Description Default
name_dict `dict`

mapping of current to new names.

required
pst_path str

the path to the control file from where python is running. Default is "." (python is running in the same directory as the control file)

'.'
Note

no attempt is made to maintain the length of the marker strings in the template files, so if your model is sensitive to changes in spacing in the template file(s), this is not a method for you

This does a lot of string compare, so its gonna be slow as...

Example::

pst = pyemu.Pst(os.path.join("template","pest.pst"))
name_dict = {"par1":"par1_better_name"}
pst.rename_parameters(name_dict,pst_path="template")

sanity_checks(forgive=False)

some basic check for strangeness

Parameters:

Name Type Description Default
forgive `bool`

flag to forgive (warn) for issues. Default is False

False
Note

checks for duplicate names, at least 1 adjustable parameter and at least 1 non-zero-weighted observation

Not nearly as comprehensive as pestchek

Example::

pst = pyemu.Pst("pest.pst")
pst.sanity_checks()

set_res(res)

reset the private Pst.res attribute.

Parameters:

Name Type Description Default
res

(pandas.DataFrame or str): something to use as Pst.res attribute. If res is str, a dataframe is read from file res

required

try_parse_name_metadata()

try to add meta data columns to parameter and observation data based on item names. Used with the PstFrom process.

Note

metadata is identified in key-value pairs that are separated by a colon. each key-value pair is separated from others by underscore

This works with PstFrom style long names

This method is called programmtically during Pst.load()

write(new_filename, version=None, check_interface=False)

main entry point to write a pest control file.

Parameters:

Name Type Description Default
new_filename `str`

name of the new pest control file

required
version `int`

flag for which version of control file to write (must be 1 or 2). if None, uses the number of pars to decide: if number of pars iis greater than 10,000, version 2 is used.

None
check_interface `bool`

flag to check the control file par and obs names against the names found in the template and instruction files. Default is False

False

Example::

pst = pyemu.Pst("my.pst")
pst.parrep("my.par")
pst.write(my_new.pst")
#write a version 2 control file
pst.write("my_new_v2.pst",version=2)

write_input_files(pst_path='.')

writes model input files using template files and current parval1 values.

Parameters:

Name Type Description Default
pst_path `str`

the path to where control file and template files reside. Default is '.'

'.'
Note

adds "parval1_trans" column to Pst.parameter_data that includes the effect of scale and offset

Example::

pst = pyemu.Pst("my.pst")

# load final parameter values
pst.parrep("my.par")

# write new model input files with final parameter values
pst.write_input_files()

write_obs_summary_table(filename=None, group_names=None)

write a stand alone observation summary latex table or Excel shet filename (str): filename. If None, use .par.tex to write as LaTeX. If filename extension is '.xls' or '.xlsx', tries to write as an Excel file. If filename is "none", no table is written Default is None

Parameters:

Name Type Description Default
filename `str`

filename. If filename is "none", no table is written. If None, use .obs.tex. If filename extension is '.xls' or '.xlsx', tries to write as an Excel file. Default is None

None
group_names `dict`

obs group names : table names. For example {"hds":"simulated groundwater level"}. Default is None

None

Returns:

Type Description

pandas.DataFrame: the summary observation group dataframe

Example::

pst = pyemu.Pst("my.pst")
pst.write_obs_summary_table(filename="obs.tex")

write_par_summary_table(filename=None, group_names=None, sigma_range=4.0, report_in_linear_space=False)

write a stand alone parameter summary latex table or Excel sheet

Parameters:

Name Type Description Default
filename `str`

filename. If None, use .par.tex to write as LaTeX. If filename extension is '.xls' or '.xlsx', tries to write as an Excel file. If filename is "none", no table is written Default is None

None
group_names `dict`

par group names : table names. For example {"w0":"well stress period 1"}. Default is None

None
sigma_range `float`

number of standard deviations represented by parameter bounds. Default is 4.0, implying 95% confidence bounds

4.0
report_in_linear_space `bool`

flag, if True, that reports all logtransformed values in linear space. This renders standard deviation meaningless, so that column is skipped

False

Returns:

Type Description

pandas.DataFrame: the summary parameter group dataframe

Example::

pst = pyemu.Pst("my.pst")
pst.write_par_summary_table(filename="par.tex")

PstFromFlopyModel

Bases: object

a monster helper class to setup a complex PEST interface around an existing MODFLOW-2005-family model.

Parameters:

Name Type Description Default
model `flopy.mbase`

a loaded flopy model instance. If model is an str, it is treated as a MODFLOW nam file (requires org_model_ws)

required
new_model_ws `str`

a directory where the new version of MODFLOW input files and PEST(++) files will be written

required
org_model_ws `str`

directory to existing MODFLOW model files. Required if model argument is an str. Default is None

None
pp_props [[`str`,[`int`]]]

pilot point multiplier parameters for grid-based properties. A nested list of grid-scale model properties to parameterize using name, iterable pairs. For 3D properties, the iterable is zero-based layer indices. For example, ["lpf.hk",[0,1,2,]] would setup pilot point multiplier parameters for layer property file horizontal hydraulic conductivity for model layers 1,2, and 3. For time-varying properties (e.g. recharge), the iterable is for zero-based stress period indices. For example, ["rch.rech",[0,4,10,15]] would setup pilot point multiplier parameters for recharge for stress period 1,5,11,and 16.

[]
const_props [[`str`,[`int`]]]

constant (uniform) multiplier parameters for grid-based properties. A nested list of grid-scale model properties to parameterize using name, iterable pairs. For 3D properties, the iterable is zero-based layer indices. For example, ["lpf.hk",[0,1,2,]] would setup constant (uniform) multiplier parameters for layer property file horizontal hydraulic conductivity for model layers 1,2, and 3. For time-varying properties (e.g. recharge), the iterable is for zero-based stress period indices. For example, ["rch.rech",[0,4,10,15]] would setup constant (uniform) multiplier parameters for recharge for stress period 1,5,11,and 16.

[]
temporal_list_props [[`str`,[`int`]]]

list-type input stress-period level multiplier parameters. A nested list of list-type input elements to parameterize using name, iterable pairs. The iterable is zero-based stress-period indices. For example, to setup multipliers for WEL flux and for RIV conductance, temporal_list_props = [["wel.flux",[0,1,2]],["riv.cond",None]] would setup multiplier parameters for well flux for stress periods 1,2 and 3 and would setup one single river conductance multiplier parameter that is applied to all stress periods

[]
spatial_list_props [[`str`,[`int`]]]

list-type input for spatial multiplier parameters. A nested list of list-type elements to parameterize using names (e.g. [["riv.cond",0],["wel.flux",1] to setup up cell-based parameters for each list-type element listed. These multiplier parameters are applied across all stress periods. For this to work, there must be the same number of entries for all stress periods. If more than one list element of the same type is in a single cell, only one parameter is used to multiply all lists in the same cell.

[]
grid_props [[`str`,[`int`]]]

grid-based (every active model cell) multiplier parameters. A nested list of grid-scale model properties to parameterize using name, iterable pairs. For 3D properties, the iterable is zero-based layer indices (e.g., ["lpf.hk",[0,1,2,]] would setup a multiplier parameter for layer property file horizontal hydraulic conductivity for model layers 1,2, and 3 in every active model cell). For time-varying properties (e.g. recharge), the iterable is for zero-based stress period indices. For example, ["rch.rech",[0,4,10,15]] would setup grid-based multiplier parameters in every active model cell for recharge for stress period 1,5,11,and 16.

[]
sfr_pars `bool`

setup parameters for the stream flow routing modflow package. If list is passed it defines the parameters to set up.

False
grid_geostruct `pyemu.geostats.GeoStruct`

the geostatistical structure to build the prior parameter covariance matrix elements for grid-based parameters. If None, a generic GeoStruct is created using an "a" parameter that is 10 times the max cell size. Default is None

None
pp_space `int`

number of grid cells between pilot points. If None, use the default in pyemu.pp_utils.setup_pilot_points_grid. Default is None

None
zone_props [[`str`,[`int`]]]

zone-based multiplier parameters. A nested list of zone-based model properties to parameterize using name, iterable pairs. For 3D properties, the iterable is zero-based layer indices (e.g., ["lpf.hk",[0,1,2,]] would setup a multiplier parameter for layer property file horizontal hydraulic conductivity for model layers 1,2, and 3 for unique zone values in the ibound array. For time-varying properties (e.g. recharge), the iterable is for zero-based stress period indices. For example, ["rch.rech",[0,4,10,15]] would setup zone-based multiplier parameters for recharge for stress period 1,5,11,and 16.

[]
pp_geostruct `pyemu.geostats.GeoStruct`

the geostatistical structure to use for building the prior parameter covariance matrix for pilot point parameters. If None, a generic GeoStruct is created using pp_space and grid-spacing information. Default is None

None
par_bounds_dict `dict`

a dictionary of model property/boundary condition name, upper-lower bound pairs. For example, par_bounds_dict = {"hk":[0.01,100.0],"flux":[0.5,2.0]} would set the bounds for horizontal hydraulic conductivity to 0.001 and 100.0 and set the bounds for flux parameters to 0.5 and 2.0. For parameters not found in par_bounds_dict, pyemu.helpers.wildass_guess_par_bounds_dict is used to set somewhat meaningful bounds. Default is None

None
temporal_list_geostruct `pyemu.geostats.GeoStruct`

the geostastical structure to build the prior parameter covariance matrix for time-varying list-type multiplier parameters. This GeoStruct express the time correlation so that the 'a' parameter is the length of time that boundary condition multiplier parameters are correlated across. If None, then a generic GeoStruct is created that uses an 'a' parameter of 3 stress periods. Default is None

None
spatial_list_geostruct `pyemu.geostats.GeoStruct`

the geostastical structure to build the prior parameter covariance matrix for spatially-varying list-type multiplier parameters. If None, a generic GeoStruct is created using an "a" parameter that is 10 times the max cell size. Default is None.

None
remove_existing `bool`

a flag to remove an existing new_model_ws directory. If False and new_model_ws exists, an exception is raised. If True and new_model_ws exists, the directory is destroyed - user beware! Default is False.

False
k_zone_dict `dict`

a dictionary of zero-based layer index, zone array pairs. e.g. {lay: np.2darray} Used to override using ibound zones for zone-based parameterization. If None, use ibound values greater than zero as zones. Alternatively a dictionary of dictionaries can be passed to allow different zones to be defined for different parameters. e.g. {"upw.hk" {lay: np.2darray}, "extra.rc11" {lay: np.2darray}} or {"hk" {lay: np.2darray}, "rc11" {lay: np.2darray}}

None
use_pp_zones `bool`

a flag to use ibound zones (or k_zone_dict, see above) as pilot point zones. If False, ibound values greater than zero are treated as a single zone for pilot points. Default is False

False
obssim_smp_pairs ([[`str`,`str`]]

a list of observed-simulated PEST-type SMP file pairs to get observations from and include in the control file. Default is []

required
external_tpl_in_pairs ([[`str`,`str`]]

a list of existing template file, model input file pairs to parse parameters from and include in the control file. Default is []

required
external_ins_out_pairs ([[`str`,`str`]]

a list of existing instruction file, model output file pairs to parse observations from and include in the control file. Default is []

required
extra_pre_cmds [`str`]

a list of preprocessing commands to add to the forward_run.py script commands are executed with os.system() within forward_run.py. Default is None.

None
redirect_forward_output `bool`

flag for whether to redirect forward model output to text files (True) or allow model output to be directed to the screen (False). Default is True

True
extra_post_cmds [`str`]

a list of post-processing commands to add to the forward_run.py script. Commands are executed with os.system() within forward_run.py. Default is None.

None
tmp_files [`str`]

a list of temporary files that should be removed at the start of the forward run script. Default is [].

None
model_exe_name `str`

binary name to run modflow. If None, a default from flopy is used, which is dangerous because of the non-standard binary names (e.g. MODFLOW-NWT_x64, MODFLOWNWT, mfnwt, etc). Default is None.

None
build_prior `bool`

flag to build prior covariance matrix. Default is True

True
sfr_obs `bool`

flag to include observations of flow and aquifer exchange from the sfr ASCII output file

False
hfb_pars `bool`

add HFB parameters. uses pyemu.gw_utils.write_hfb_template(). the resulting HFB pars have parval1 equal to the values in the original file and use the spatial_list_geostruct to build geostatistical covariates between parameters

False
kl_props [[`str`,[`int`]]]

karhunen-loeve based multiplier parameters. A nested list of KL-based model properties to parameterize using name, iterable pairs. For 3D properties, the iterable is zero-based layer indices (e.g., ["lpf.hk",[0,1,2,]] would setup a multiplier parameter for layer property file horizontal hydraulic conductivity for model layers 1,2, and 3 for unique zone values in the ibound array. For time-varying properties (e.g. recharge), the iterable is for zero-based stress period indices. For example, ["rch.rech",[0,4,10,15]] would setup zone-based multiplier parameters for recharge for stress period 1,5,11,and 16.

None
kl_num_eig `int`

the number of KL-based eigenvector multiplier parameters to use for each KL parameter set. default is 100

100
kl_geostruct `pyemu.geostats.Geostruct`

the geostatistical structure to build the prior parameter covariance matrix elements for KL-based parameters. If None, a generic GeoStruct is created using an "a" parameter that is 10 times the max cell size. Default is None

None

Note:

Setup up multiplier parameters for an existing MODFLOW model.

Does all kinds of coolness like building a
meaningful prior, assigning somewhat meaningful parameter groups and
bounds, writes a forward_run.py script with all the calls need to
implement multiplier parameters, run MODFLOW and post-process.

While this class does work, the newer `PstFrom` class is a more pythonic
implementation

build_prior(fmt='ascii', filename=None, droptol=None, chunk=None, sigma_range=6)

build and optionally save the prior parameter covariance matrix.

Parameters:

Name Type Description Default
fmt `str`

the format to save the cov matrix. Options are "ascii","binary","uncfile", "coo". Default is "ascii". If "none" (lower case string, not None), then no file is created.

'ascii'
filename `str`

the filename to save the prior cov matrix to. If None, the name is formed using model nam_file name. Default is None.

None
droptol `float`

tolerance for dropping near-zero values when writing compressed binary. Default is None.

None
chunk `int`

chunk size to write in a single pass - for binary only. Default is None (no chunking).

None
sigma_range `float`

number of standard deviations represented by the parameter bounds. Default is 6.

6

Returns:

Type Description

pyemu.Cov: the full prior parameter covariance matrix, generated by processing parameters by

groups

build_pst(filename=None)

build the pest control file using the parameters and observations.

Parameters:

Name Type Description Default
filename `str`

the filename to save the control file to. If None, the name if formed from the model namfile name. Default is None. The control is saved in the PstFromFlopy.m.model_ws directory.

None

Note:

calls pyemu.Pst.from_io_files

calls PESTCHEK

draw(num_reals=100, sigma_range=6, use_specsim=False, scale_offset=True)

draw from the geostatistically-implied parameter covariance matrix

Parameters:

Name Type Description Default
num_reals `int`

number of realizations to generate. Default is 100

100
sigma_range `float`

number of standard deviations represented by the parameter bounds. Default is 6.

6
use_specsim `bool`

flag to use spectral simulation for grid-based parameters. Requires a regular grid but is wicked fast. Default is False

False
scale_offset `bool`

flag to apply scale and offset to parameter bounds when calculating variances - this is passed through to pyemu.Cov.from_parameter_data. Default is True.

True
Note

operates on parameters by groups to avoid having to construct a very large covariance matrix for problems with more the 30K parameters.

uses helpers.geostatitical_draw()

Returns:

Type Description

pyemu.ParameterEnsemble: The realized parameter ensemble

write_forward_run()

write the forward run script forward_run.py

Note

This method can be called repeatedly, especially after any changed to the pre- and/or post-processing routines.

Schur

Bases: LinearAnalysis

FOSM-based uncertainty and data-worth analysis

Parameters:

Name Type Description Default
jco varies

something that can be cast or loaded into a pyemu.Jco. Can be a str for a filename or pyemu.Matrix/pyemu.Jco object.

required
pst varies

something that can be cast into a pyemu.Pst. Can be an str for a filename or an existing pyemu.Pst. If None, a pst filename is sought with the same base name as the jco argument (if passed)

required
parcov varies

prior parameter covariance matrix. If str, a filename is assumed and the prior parameter covariance matrix is loaded from a file using the file extension (".jcb"/".jco" for binary, ".cov"/".mat" for PEST-style ASCII matrix, or ".unc" for uncertainty files). If None, the prior parameter covariance matrix is constructed from the parameter bounds in LinearAnalysis.pst. Can also be a pyemu.Cov instance

required
obscov varies

observation noise covariance matrix. If str, a filename is assumed and the noise covariance matrix is loaded from a file using the file extension (".jcb"/".jco" for binary, ".cov"/".mat" for PEST-style ASCII matrix, or ".unc" for uncertainty files). If None, the noise covariance matrix is constructed from the observation weights in LinearAnalysis.pst. Can also be a pyemu.Cov instance

required
forecasts varies

forecast sensitivity vectors. If str, first an observation name is assumed (a row in LinearAnalysis.jco). If that is not found, a filename is assumed and predictions are loaded from a file using the file extension. If [str], a list of observation names is assumed. Can also be a pyemu.Matrix instance, a numpy.ndarray or a collection. Note if the PEST++ option "++forecasts()" is set in the pest control file (under the pyemu.Pst.pestpp_options dictionary), then there is no need to pass this argument (unless you want to analyze different forecasts) of pyemu.Matrix or numpy.ndarray.

required
ref_var float

reference variance. Default is 1.0

required
verbose `bool`

controls screen output. If str, a filename is assumed and and log file is written.

required
sigma_range `float`

defines range of upper bound - lower bound in terms of standard deviation (sigma). For example, if sigma_range = 4, the bounds represent 4 * sigma. Default is 4.0, representing approximately 95% confidence of implied normal distribution. This arg is only used if constructing parcov from parameter bounds.

required
scale_offset `bool`

flag to apply parameter scale and offset to parameter bounds when calculating prior parameter covariance matrix from bounds. This arg is onlyused if constructing parcov from parameter bounds.Default is True.

required
Note

This class is the primary entry point for FOSM-based uncertainty and dataworth analyses

This class replicates and extends the behavior of the PEST PREDUNC utilities.

Example::

#assumes "my.pst" exists
sc = pyemu.Schur(jco="my.jco",forecasts=["fore1","fore2"])
print(sc.get_forecast_summary())
print(sc.get_parameter_contribution())

adj_par_names property

adjustable parameter names

Returns:

Type Description

['str`]: list of adjustable parameter names

Note

if LinearAnalysis.pst is None, returns LinearAnalysis.jco.col_names

fehalf property

Karhunen-Loeve scaling matrix attribute.

Returns:

Type Description

pyemu.Matrix: the Karhunen-Loeve scaling matrix based on the prior

parameter covariance matrix

forecast_names property

get the forecast (aka prediction) names

Returns:

Type Description
[`str`]

list of forecast names

forecasts property

the forecast sentivity vectors attribute

Returns:

Type Description

pyemu.Matrix: a matrix of forecast (prediction) sensitivity vectors (column wise)

forecasts_iter property

forecast (e.g. prediction) sensitivity vectors iterator

Returns:

Type Description

iterator: iterator on forecasts (e.g. predictions) sensitivity vectors (matrix)

Note

This is used for processing huge numbers of predictions

This is a synonym for LinearAnalysis.predictions_iter()

jco property

the jacobian matrix attribute

Returns:

Type Description

pyemu.Jco: the jacobian matrix attribute

mle_covariance property

maximum likelihood parameter covariance matrix.

Returns:

Type Description

pyemu.Matrix: maximum likelihood parameter covariance matrix

mle_parameter_estimate property

maximum likelihood parameter estimate.

Returns:

Type Description

pandas.Series: the maximum likelihood parameter estimates

nnz_obs_names property

non-zero-weighted observation names

Returns:

Type Description

['str`]: list of non-zero-weighted observation names

Note

if LinearAnalysis.pst is None, returns LinearAnalysis.jco.row_names

obscov property

get the observation noise covariance matrix attribute

Returns:

Type Description

pyemu.Cov: a reference to the LinearAnalysis.obscov attribute

parcov property

get the prior parameter covariance matrix attribute

Returns:

Type Description

pyemu.Cov: a reference to the LinearAnalysis.parcov attribute

posterior_forecast property

posterior forecast (e.g. prediction) variance(s)

Returns:

Type Description

dict: dictionary of forecast names and FOSM-estimated posterior

variances

Note

Sames as LinearAnalysis.posterior_prediction

See Schur.get_forecast_summary() for a dataframe-based container of prior and posterior variances

posterior_parameter property

posterior parameter covariance matrix.

Returns:

Type Description

pyemu.Cov: the posterior parameter covariance matrix

Example::

sc = pyemu.Schur(jco="my.jcb")
post_cov = sc.posterior_parameter
post_cov.to_ascii("post.cov")

posterior_prediction property

posterior prediction (e.g. forecast) variance estimate(s)

Returns:

Type Description

dict: dictionary of forecast names and FOSM-estimated posterior

variances

Note: sames as LinearAnalysis.posterior_forecast

 See `Schur.get_forecast_summary()` for a dataframe-based container of prior and posterior
 variances

predictions property

the prediction (aka forecast) sentivity vectors attribute

Returns:

Type Description

pyemu.Matrix: a matrix of prediction sensitivity vectors (column wise)

predictions_iter property

prediction sensitivity vectors iterator

Returns:

Type Description

iterator: iterator on prediction sensitivity vectors (matrix)

Note

this is used for processing huge numbers of predictions

prior_forecast property

prior forecast (e.g. prediction) variances

Returns:

Type Description

dict: a dictionary of forecast name, prior variance pairs

prior_parameter property

prior parameter covariance matrix

Returns:

Type Description

pyemu.Cov: prior parameter covariance matrix

prior_prediction property

prior prediction (e.g. forecast) variances

Returns:

Type Description

dict: a dictionary of prediction name, prior variance pairs

pst property

the pst attribute

Returns:

Type Description

pyemu.Pst: the pst attribute

qhalf property

square root of the cofactor matrix attribute. Create the attribute if it has not yet been created

Returns:

Type Description

pyemu.Matrix: square root of the cofactor matrix

qhalfx property

half normal matrix attribute.

Returns:

Type Description

pyemu.Matrix: half normal matrix attribute

xtqx property

normal matrix attribute.

Returns:

Type Description

pyemu.Matrix: normal matrix attribute

__contribution_from_parameters(parameter_names)

private method get the prior and posterior uncertainty reduction as a result of some parameter becoming perfectly known

__fromfile(filename, astype=None)

a private method to deduce and load a filename into a matrix object. Uses extension: 'jco' or 'jcb': binary, 'mat','vec' or 'cov': ASCII, 'unc': pest uncertainty file.

__load_jco()

private method to set the jco attribute from a file or a matrix object

__load_obscov()

private method to set the obscov attribute from: a pest control file (observation weights) a pst object a matrix object an uncert file an ascii matrix file

__load_parcov()

private method to set the parcov attribute from: a pest control file (parameter bounds) a pst object a matrix object an uncert file an ascii matrix file

__load_predictions()

private method set the predictions attribute from

mixed list of row names, matrix files and ndarrays a single row name an ascii file

can be none if only interested in parameters.

__load_pst()

private method set the pst attribute

adjust_obscov_resfile(resfile=None)

reset the elements of obscov by scaling the implied weights based on the phi components in res_file so that the total phi is equal to the number of non-zero weights.

Parameters:

Name Type Description Default
resfile `str`

residual file to use. If None, residual file with case name is sought. default is None

None
Note

calls pyemu.Pst.adjust_weights_resfile()

apply_karhunen_loeve_scaling()

apply karhuene-loeve scaling to the jacobian matrix.

Note:

This scaling is not necessary for analyses using Schur's
complement, but can be very important for error variance
analyses.  This operation effectively transfers prior knowledge
specified in the parcov to the jacobian and reset parcov to the
identity matrix.

clean()

drop regularization and prior information observation from the jco

drop_prior_information()

drop the prior information from the jco and pst attributes

get(par_names=None, obs_names=None, astype=None)

method to get a new LinearAnalysis class using a subset of parameters and/or observations

Parameters:

Name Type Description Default
par_names [`'str`]

par names for new object

None
obs_names [`'str`]

obs names for new object

None
astype `pyemu.Schur` or `pyemu.ErrVar`

type to cast the new object. If None, return type is same as self

None

Returns:

Type Description

LinearAnalysis: new instance

get_added_obs_group_importance(reset_zero_weight=1.0)

A dataworth method to analyze the posterior uncertainty as a result of gaining existing observations, tested by observation groups

Returns:

Type Description

pandas.DataFrame: A dataframe with index of observation group names and columns

of forecast names. The values in the dataframe are the posterior

variances of the forecasts resulting from gaining the information

contained in each observation group. One row in the dataframe is labeled "base" - this is

posterior forecast variance resulting from the notional calibration with the

non-zero-weighed observations in Schur.pst. Conceptually, the forecast variance should

either not change or decrease as a result of gaining new observations. The magnitude

of the decrease represents the worth of the potential new observation(s) in each group

being tested.

Note

Observations in Schur.pst with zero weights are not included in the analysis unless reset_zero_weight is a float greater than zero. In most cases, users will want to reset zero-weighted observations as part dataworth testing process.

Example::

sc = pyemu.Schur("my.jco")
df = sc.get_added_obs_group_importance(reset_zero_weight=1.0)

get_added_obs_importance(obslist_dict=None, base_obslist=None, reset_zero_weight=1.0)

A dataworth method to analyze the posterior uncertainty as a result of gathering some additional observations

Parameters:

Name Type Description Default
obslist_dict `dict`

a nested dictionary-list of groups of observations that are to be treated as gained/collected. key values become row labels in returned dataframe. If None, then every zero-weighted observation is tested sequentially. Default is None

None
base_obslist [`str`]

observation names to treat as the "existing" observations. The values of obslist_dict will be added to this list during each test. If None, then the values in each obslist_dict entry will be treated as the entire calibration dataset. That is, there are no existing observations. Default is None. Standard practice would be to pass this argument as pyemu.Schur.pst.nnz_obs_names so that existing, non-zero-weighted observations are accounted for in evaluating the worth of new yet-to-be-collected observations.

None

Returns:

Type Description

pandas.DataFrame: a dataframe with row labels (index) of obslist_dict.keys() and

columns of forecast names. The values in the dataframe are the

posterior variance of the forecasts resulting from notional inversion

using the observations in obslist_dict[key value] plus the observations

in base_obslist (if any). One row in the dataframe is labeled "base" - this is

posterior forecast variance resulting from the notional calibration with the

observations in base_obslist (if base_obslist is None, then the "base" row is the

prior forecast variance). Conceptually, the forecast variance should either not change or

decrease as a result of gaining additional observations. The magnitude of the decrease

represents the worth of the potential new observation(s) being tested.

Note

Observations listed in base_obslist is required to only include observations with weight not equal to zero. If zero-weighted observations are in base_obslist an exception will be thrown. In most cases, users will want to reset zero-weighted observations as part dataworth testing process. If reset_zero_weights == 0, no weights adjustments will be made - this is most appropriate if different weights are assigned to the added observation values in Schur.pst

Example::

sc = pyemu.Schur("my.jco")
obslist_dict = {"hds":["head1","head2"],"flux":["flux1","flux2"]}
df = sc.get_added_obs_importance(obslist_dict=obslist_dict,
                                 base_obslist=sc.pst.nnz_obs_names,
                                 reset_zero_weight=1.0)

get_conditional_instance(parameter_names)

get a new pyemu.Schur instance that includes conditional update from some parameters becoming known perfectly

Parameters:

Name Type Description Default
parameter_names [`str`]

list of parameters that are to be treated as notionally perfectly known

required

Returns:

Type Description

pyemu.Schur: a new Schur instance conditional on perfect knowledge

of some parameters. The new instance has an updated parcov that is less

the names listed in parameter_names.

Note

This method is primarily for use by the LinearAnalysis.get_parameter_contribution() dataworth method.

get_cso_dataframe()

get a dataframe of composite observation sensitivity, as returned by PEST in the seo file.

Returns:

Type Description

pandas.DataFrame: dataframe of observation names and composite observation

sensitivity

Note

That this formulation deviates slightly from the PEST documentation in that the values are divided by (npar-1) rather than by (npar).

The equation is cso_j = ((Q^1/2JJ^T*Q^1/2)^1/2)_jj/(NPAR-1)

get_forecast_summary()

summary of the FOSM-based forecast uncertainty (variance) estimate(s)

Returns:

Type Description

pandas.DataFrame: dataframe of prior,posterior variances and percent

uncertainty reduction of each forecast (e.g. prediction)

Note

This is the primary entry point for accessing forecast uncertainty estimates "precent_reduction" column in dataframe is calculated as 100.0 * (1.0 - (posterior variance / prior variance)

Example::

sc = pyemu.Schur(jco="my.jcb",forecasts=["fore1","fore2"])
df = sc.get_parameter_summary()
df.loc[:,["prior","posterior"]].plot(kind="bar")
plt.show()
df.percent_reduction.plot(kind="bar")
plt.show()

get_obs_competition_dataframe()

get the observation competition stat a la PEST utility

Returns:

Type Description

pandas.DataFrame: a dataframe of observation names by

observation names with values equal to the PEST

competition statistic

get_obs_group_dict()

get a dictionary of observations grouped by observation group name

Returns:

Type Description

dict: a dictionary of observations grouped by observation group name.

Useful for dataworth processing in pyemu.Schur

Note

only includes observations that are listed in Schur.jco.row_names

Example::

sc = pyemu.Schur("my.jco")
obsgrp_dict = sc.get_obs_group_dict()
df = sc.get_removed_obs_importance(obsgrp_dict=obsgrp_dict)

get_par_contribution(parlist_dict=None, include_prior_results=False)

A dataworth method to get a dataframe the prior and posterior uncertainty reduction as a result of some parameter becoming perfectly known

Parameters:

Name Type Description Default
parlist_dict

(dict, optional): a nested dictionary-list of groups of parameters that are to be treated as perfectly known. key values become row labels in returned dataframe. If None, each adjustable parameter is sequentially treated as known and the returned dataframe has row labels for each adjustable parameter

required
include_prior_results `bool`

flag to return a multi-indexed dataframe with both conditional prior and posterior forecast uncertainty estimates. This is because the notional learning about parameters potentially effects both the prior and posterior forecast uncertainty estimates. If False, only posterior results are returned. Default is False

False

Returns:

Type Description

pandas.DataFrame: a dataframe that summarizes the parameter contribution

dataworth analysis. The dataframe has index (row labels) of the keys in parlist_dict

and a column labels of forecast names. The values in the dataframe

are the posterior variance of the forecast conditional on perfect

knowledge of the parameters in the values of parlist_dict. One row in the

dataframe will be labeled base - this is the forecast uncertainty estimates

that include the effects of all adjustable parameters. Percent decreases in

forecast uncertainty can be calculated by differencing all rows against the

"base" row. Varies depending on include_prior_results.

Note

This is the primary dataworth method for assessing the contribution of one or more parameters to forecast uncertainty.

Example::

sc = pyemu.Schur(jco="my.jco")
parlist_dict = {"hk":["hk1","hk2"],"rech"["rech1","rech2"]}
df = sc.get_par_contribution(parlist_dict=parlist_dict)

get_par_css_dataframe()

get a dataframe of composite scaled sensitivities. Includes both PEST-style and Hill-style.

Returns:

Type Description

pandas.DataFrame: a dataframe of parameter names, PEST-style and

Hill-style composite scaled sensitivity

get_par_group_contribution(include_prior_results=False)

A dataworth method to get the forecast uncertainty contribution from each parameter group

Parameters:

Name Type Description Default
include_prior_results `bool`

flag to return a multi-indexed dataframe with both conditional prior and posterior forecast uncertainty estimates. This is because the notional learning about parameters potentially effects both the prior and posterior forecast uncertainty estimates. If False, only posterior results are returned. Default is False

False

Returns:

`pandas.DataFrame`: a dataframe that summarizes the parameter contribution analysis.
The dataframe has index (row labels) that are the parameter group names
and a column labels of forecast names.  The values in the dataframe
are the posterior variance of the forecast conditional on perfect
knowledge of the adjustable parameters in each parameter group.  One
row is labelled "base" - this is the variance of the forecasts that includes
the effects of all adjustable parameters. Varies depending on `include_prior_results`.
Note

This method is just a thin wrapper around get_contribution_dataframe() - this method automatically constructs the parlist_dict argument where the keys are the group names and the values are the adjustable parameters in the groups

Example::

sc = pyemu.Schur(jco="my.jco")
df = sc.get_par_group_contribution()

get_parameter_summary()

summary of the FOSM-based parameter uncertainty (variance) estimate(s)

Returns:

Type Description

pandas.DataFrame: dataframe of prior,posterior variances and percent

uncertainty reduction of each parameter

Note

This is the primary entry point for accessing parameter uncertainty estimates

The "Prior" column in dataframe is the diagonal of LinearAnalysis.parcov "precent_reduction" column in dataframe is calculated as 100.0 * (1.0 - (posterior variance / prior variance)

Example::

sc = pyemu.Schur(jco="my.jcb",forecasts=["fore1","fore2"])
df = sc.get_parameter_summary()
df.loc[:,["prior","posterior"]].plot(kind="bar")
plt.show()
df.percent_reduction.plot(kind="bar")
plt.show()

get_removed_obs_group_importance(reset_zero_weight=None)

A dataworth method to analyze the posterior uncertainty as a result of losing existing observations, tested by observation groups

Returns:

Type Description

pandas.DataFrame: A dataframe with index of observation group names and columns

of forecast names. The values in the dataframe are the posterior

variances of the forecasts resulting from losing the information

contained in each observation group. One row in the dataframe is labeled "base" - this is

posterior forecast variance resulting from the notional calibration with the

non-zero-weighed observations in Schur.pst. Conceptually, the forecast variance should

either not change or increase as a result of losing existing observations. The magnitude

of the increase represents the worth of the existing observation(s) in each group being tested.

Example::

sc = pyemu.Schur("my.jco")
df = sc.get_removed_obs_group_importance()

get_removed_obs_importance(obslist_dict=None, reset_zero_weight=None)

A dataworth method to analyze the posterior uncertainty as a result of losing some existing observations

Parameters:

Name Type Description Default
obslist_dict `dict`

a nested dictionary-list of groups of observations that are to be treated as lost. key values become row labels in returned dataframe. If None, then every zero-weighted observation is tested sequentially. Default is None

None

Returns:

Name Type Description

pandas.DataFrame: A dataframe with index of obslist_dict.keys() and columns

of forecast names. The values in the dataframe are the posterior

variances of the forecasts resulting from losing the information

contained in obslist_dict[key value]. One row in the dataframe is labeled "base" - this is

posterior forecast variance resulting from the notional calibration with the

non-zero-weighed observations in Schur.pst. Conceptually, the forecast variance should

either not change or increase as a result of losing existing observations. The magnitude

of the increase represents the worth of the existing observation(s) being tested.

Note

All observations that may be evaluated as removed must have non-zero weight

Example::

sc = pyemu.Schur("my.jco")
df = sc.get_removed_obs_importance()

next_most_important_added_obs(forecast=None, niter=3, obslist_dict=None, base_obslist=None, reset_zero_weight=1.0)

find the most important observation(s) by sequentially evaluating the importance of the observations in obslist_dict.

Parameters:

Name Type Description Default
forecast `str`

name of the forecast to use in the ranking process. If more than one forecast has been listed, this argument is required. This is because the data worth must be ranked with respect to the variance reduction for a single forecast

None
niter `int`

number of sequential dataworth testing iterations. Default is 3

3
obslist_dict `dict`

a nested dictionary-list of groups of observations that are to be treated as gained/collected. key values become row labels in returned dataframe. If None, then every zero-weighted observation is tested sequentially. Default is None

None
base_obslist [`str`]

observation names to treat as the "existing" observations. The values of obslist_dict will be added to this list during each test. If None, then the values in each obslist_dict entry will be treated as the entire calibration dataset. That is, there are no existing observations. Default is None. Standard practice would be to pass this argument as pyemu.Schur.pst.nnz_obs_names so that existing, non-zero-weighted observations are accounted for in evaluating the worth of new yet-to-be-collected observations.

None

Returns:

Type Description

pandas.DataFrame: a dataFrame with columns of obslist_dict key for each iteration

the yields the largest variance reduction for the named forecast. Columns are forecast

variance percent reduction for each iteration (percent reduction compared to initial "base"

case with all non-zero weighted observations included in the notional calibration)

Note

The most important observations from each iteration is added to base_obslist and removed obslist_dict for the next iteration. In this way, the added observation importance values include the conditional information from the last iteration.

Example::

sc = pyemu.Schur(jco="my.jco")
df = sc.next_most_important_added_obs(forecast="fore1",base_obslist=sc.pst.nnz_obs_names)

next_most_par_contribution(niter=3, forecast=None, parlist_dict=None)

find the parameter(s) contributing most to posterior forecast by sequentially evaluating the contribution of parameters in parlist_dict.

Parameters:

Name Type Description Default
forecast `str`

name of the forecast to use in the ranking process. If more than one forecast has been listed, this argument is required. This is because the data worth must be ranked with respect to the variance reduction for a single forecast

None
niter `int`

number of sequential dataworth testing iterations. Default is 3

3
parlist_dict

dict a nested dictionary-list of groups of parameters that are to be treated as perfectly known. key values become row labels in dataframe

required
parlist_dict `dict`

a nested dictionary-list of groups of parameters that are to be treated as perfectly known (zero uncertainty). key values become row labels in returned dataframe. If None, then every adustable parameter is tested sequentially. Default is None. Conceptually, the forecast variance should either not change or decrease as a result of knowing parameter perfectly. The magnitude of the decrease represents the worth of gathering information about the parameter(s) being tested.

None
Note

The largest contributing parameters from each iteration are treated as known perfectly for the remaining iterations. In this way, the next iteration seeks the next most influential group of parameters.

Returns:

Type Description

pandas.DataFrame: a dataframe with index of iteration number and columns

of parlist_dict.keys(). The values are the results of the knowing

each parlist_dict entry expressed as posterior variance reduction

reset_obscov(arg=None)

reset the obscov attribute to None

Parameters:

Name Type Description Default
arg `str` or `pyemu.Matrix`

the value to assign to the obscov attribute. If None, the private __obscov attribute is cleared but not reset

None

reset_parcov(arg=None)

reset the parcov attribute to None

Parameters:

Name Type Description Default
arg `str` or `pyemu.Matrix`

the value to assign to the parcov attribute. If None, the private __parcov attribute is cleared but not reset

None

reset_pst(arg)

reset the LinearAnalysis.pst attribute

Parameters:

Name Type Description Default
arg `str` or `pyemu.Pst`

the value to assign to the pst attribute

required

SpatialReference

Bases: object

a class to locate a structured model grid in x-y space. Lifted wholesale from Flopy, and preserved here... ...maybe slightly over-engineered for here

Args:

delr (`numpy ndarray`): the model discretization delr vector (An array of spacings along a row)
delc (`numpy ndarray`): the model discretization delc vector (An array of spacings along a column)
lenuni (`int`): the length units flag from the discretization package. Default is 2.
xul (`float`): The x coordinate of the upper left corner of the grid. Enter either xul and yul or xll and yll.
yul (`float`): The y coordinate of the upper left corner of the grid. Enter either xul and yul or xll and yll.
xll (`float`): The x coordinate of the lower left corner of the grid. Enter either xul and yul or xll and yll.
yll (`float`): The y coordinate of the lower left corner of the grid. Enter either xul and yul or xll and yll.
rotation (`float`): The counter-clockwise rotation (in degrees) of the grid
proj4_str (`str`): a PROJ4 string that identifies the grid in space. warning: case sensitive!
units (`string`): Units for the grid.  Must be either "feet" or "meters"
epsg (`int`): EPSG code that identifies the grid in space. Can be used in lieu of
    proj4. PROJ4 attribute will auto-populate if there is an internet
    connection(via get_proj4 method).
    See https://www.epsg-registry.org/ or spatialreference.org
length_multiplier (`float`): multiplier to convert model units to spatial reference units.
    delr and delc above will be multiplied by this value. (default=1.)

bounds property

Return bounding box in shapely order.

length_multiplier property

Attempt to identify multiplier for converting from model units to sr units, defaulting to 1.

vertices property

Returns a list of vertices for

get_extent()

Get the extent of the rotated and offset grid

get_grid_lines()

Get the grid lines as a list

get_ij(x, y)

Return the row and column of a point or sequence of points in real-world coordinates.

Parameters:

Name Type Description Default
x `float`

scalar or sequence of x coordinates

required
y `float`

scalar or sequence of y coordinates

required

Returns:

Type Description

tuple of

  • int : row or sequence of rows (zero-based)
  • int : column or sequence of columns (zero-based)

get_vertices(i, j)

Get vertices for a single cell or sequence if i, j locations.

get_xcenter_array()

Return a numpy one-dimensional float array that has the cell center x coordinate for every column in the grid in model space - not offset or rotated.

get_xedge_array()

Return a numpy one-dimensional float array that has the cell edge x coordinates for every column in the grid in model space - not offset or rotated. Array is of size (ncol + 1)

get_ycenter_array()

Return a numpy one-dimensional float array that has the cell center x coordinate for every row in the grid in model space - not offset of rotated.

get_yedge_array()

Return a numpy one-dimensional float array that has the cell edge y coordinates for every row in the grid in model space - not offset or rotated. Array is of size (nrow + 1)

load(namefile=None, reffile='usgs.model.reference') staticmethod

Attempts to load spatial reference information from the following files (in order): 1) usgs.model.reference 2) NAM file (header comment) 3) SpatialReference.default dictionary

read_usgs_model_reference_file(reffile='usgs.model.reference') staticmethod

read spatial reference info from the usgs.model.reference file https://water.usgs.gov/ogw/policy/gw-model/modelers-setup.html

rotate(x, y, theta, xorigin=0.0, yorigin=0.0) staticmethod

Given x and y array-like values calculate the rotation about an arbitrary origin and then return the rotated coordinates. theta is in degrees.

set_spatialreference(xul=None, yul=None, xll=None, yll=None, rotation=0.0)

set spatial reference - can be called from model instance

transform(x, y, inverse=False)

Given x and y array-like values, apply rotation, scale and offset, to convert them from model coordinates to real-world coordinates.

write_gridspec(filename)

write a PEST-style grid specification file

apply_array_pars(arr_par='arr_pars.csv', arr_par_file=None, chunk_len=50)

a function to apply array-based multiplier parameters.

Parameters:

Name Type Description Default
arr_par `str` or `pandas.DataFrame`

if type str,

'arr_pars.csv'
chunk_len (`int`)

the number of files to process per chunk with multiprocessing - applies to both fac2real and process_ input_files. Default is 50.

required
Note

Used to implement the parameterization constructed by PstFromFlopyModel during a forward run

This function should be added to the forward_run.py script but can be called on any correctly formatted csv

This function using multiprocessing, spawning one process for each model input array (and optionally pp files). This speeds up execution time considerably but means you need to make sure your forward run script uses the proper multiprocessing idioms for freeze support and main thread handling (PstFrom does this for you).

apply_list_pars()

a function to apply boundary condition multiplier parameters.

Note

Used to implement the parameterization constructed by PstFromFlopyModel during a forward run

Requires either "temporal_list_pars.csv" or "spatial_list_pars.csv"

Should be added to the forward_run.py script (called programmaticlly by the PstFrom forward run script)

apply_temporal_diff_obs(config_file)

process an instruction-output file pair and formulate difference observations.

Parameters:

Name Type Description Default
config_file `str`

configuration file written by pyemu.helpers.setup_temporal_diff_obs.

required

Returns:

Type Description

diff_df (pandas.DataFrame) : processed difference observations

Note

Writes config_file.replace(".config",".processed") output file that can be read with the instruction file that is created by pyemu.helpers.setup_temporal_diff_obs().

This is the companion function of helpers.setup_setup_temporal_diff_obs().

geostatistical_draws(pst, struct_dict, num_reals=100, sigma_range=4, verbose=True, scale_offset=True, subset=None, rng=None)

construct a parameter ensemble from a prior covariance matrix implied by geostatistical structure(s) and parameter bounds.

Parameters:

Name Type Description Default
pst `pyemu.Pst`

a control file (or the name of control file). The parameter bounds in pst are used to define the variance of each parameter group.

required
struct_dict `dict`

a dict of GeoStruct (or structure file), and list of pilot point template files pairs. If the values in the dict are pd.DataFrames, then they must have an 'x','y', and 'parnme' column. If the filename ends in '.csv', then a pd.DataFrame is loaded, otherwise a pilot points file is loaded.

required
num_reals `int`

number of realizations to draw. Default is 100

100
sigma_range `float`

a float representing the number of standard deviations implied by parameter bounds. Default is 4.0, which implies 95% confidence parameter bounds.

4
verbose `bool`

flag to control output to stdout. Default is True. flag for stdout.

True
scale_offset `bool`,optional

flag to apply scale and offset to parameter bounds when calculating variances - this is passed through to pyemu.Cov.from_parameter_data(). Default is True.

True
subset `array-like`

list, array, set or pandas index defining subset of parameters for draw.

None
rng `numpy.random.RandomState`

random number generator if not using default from pyemu.en

None

Returns pyemu.ParameterEnsemble: the realized parameter ensemble.

Note

Parameters are realized by parameter group.

The variance of each parameter is used to scale the resulting geostatistical covariance matrix Therefore, the sill of the geostatistical structures in struct_dict should be 1.0

Example::

pst = pyemu.Pst("my.pst")
sd = {"struct.dat":["hkpp.dat.tpl","vka.dat.tpl"]}
pe = pyemu.helpers.geostatistical_draws(pst,struct_dict=sd}
pe.to_csv("my_pe.csv")

kl_setup(num_eig, sr, struct, prefixes, factors_file='kl_factors.dat', islog=True, basis_file=None, tpl_dir='.')

setup a karhuenen-Loeve based parameterization for a given geostatistical structure.

Parameters:

Name Type Description Default
num_eig `int`

the number of basis vectors to retain in the reduced basis

required
sr `flopy.reference.SpatialReference`

a spatial reference instance

required
struct `str`

a PEST-style structure file. Can also be a pyemu.geostats.Geostruct instance.

required
prefixes [`str`]

a list of parameter prefixes to generate KL parameterization for.

required
factors_file `str`

name of the PEST-style interpolation factors file to write (can be processed with FAC2REAL). Default is "kl_factors.dat".

'kl_factors.dat'
islog `bool`

flag to indicate if the parameters are log transformed. Default is True

True
basis_file `str`

the name of the PEST-style binary (e.g. jco) file to write the reduced basis vectors to. Default is None (not saved).

None
tpl_dir `str`

the directory to write the resulting template files to. Default is "." (current directory).

'.'

Returns:

Type Description

pandas.DataFrame: a dataframe of parameter information.

Note

This is the companion function to helpers.apply_kl()

Example::

m = flopy.modflow.Modflow.load("mymodel.nam")
prefixes = ["hk","vka","ss"]
df = pyemu.helpers.kl_setup(10,m.sr,"struct.dat",prefixes)

setup_temporal_diff_obs(pst, ins_file, out_file=None, include_zero_weight=False, include_path=False, sort_by_name=True, long_names=True, prefix='dif')

a helper function to setup difference-in-time observations based on an existing set of observations in an instruction file using the observation grouping in the control file

Parameters:

Name Type Description Default
pst `pyemu.Pst`

existing control file

required
ins_file `str`

an existing instruction file

required
out_file `str`

an existing model output file that corresponds to the instruction file. If None, ins_file.replace(".ins","") is used

None
include_zero_weight `bool`

flag to include zero-weighted observations in the difference observation process. Default is False so that only non-zero weighted observations are used.

False
include_path `bool`

flag to setup the binary file processing in directory where the hds_file is located (if different from where python is running). This is useful for setting up the process in separate directory for where python is running.

False
sort_by_name `bool`,optional

flag to sort observation names in each group prior to setting up the differencing. The order of the observations matters for the differencing. If False, then the control file order is used. If observation names have a datetime suffix, make sure the format is year-month-day to use this sorting. Default is True

True
long_names `bool`

flag to use long, descriptive names by concatenating the two observation names that are being differenced. This will produce names that are too long for traditional PEST(_HP). Default is True.

True
prefix `str`

prefix to prepend to observation names and group names. Default is "dif".

'dif'

Returns:

Type Description

tuple containing

  • str: the forward run command to execute the binary file process during model runs.
  • pandas.DataFrame: a dataframe of observation information for use in the pest control file
Note

This is the companion function of helpers.apply_temporal_diff_obs().

write_const_tpl(name, tpl_file, suffix, zn_array=None, shape=None, longnames=False)

write a constant (uniform) template file for a 2-D array

Parameters:

Name Type Description Default
name `str`

the base parameter name

required
tpl_file `str`

the template file to write

required
zn_array `numpy.ndarray`

an array used to skip inactive cells, and optionally get shape info.

None
shape `tuple`

tuple nrow and ncol. Either zn_array or shape must be passed

None
longnames `bool`

flag to use longer names that exceed 12 chars in length. Default is False.

False

Returns:

Type Description

pandas.DataFrame: a dataframe with parameter information

Note

This function is used during the PstFrom setup process

write_grid_tpl(name, tpl_file, suffix, zn_array=None, shape=None, spatial_reference=None, longnames=False)

write a grid-based template file for a 2-D array

Parameters:

Name Type Description Default
name `str`

the base parameter name

required
tpl_file `str`

the template file to write - include path

required
zn_array `numpy.ndarray`

zone array to identify inactive cells. Default is None

None
shape `tuple`

a length-two tuple of nrow and ncol. Either zn_array or shape must be passed.

None
spatial_reference `flopy.utils.SpatialReference`

a spatial reference instance. If longnames is True, then spatial_reference is used to add spatial info to the parameter names.

None
longnames `bool`

flag to use longer names that exceed 12 chars in length. Default is False.

False

Returns:

Type Description

pandas.DataFrame: a dataframe with parameter information

Note

This function is used during the PstFrom setup process

Example::

pyemu.helpers.write_grid_tpl("hk_layer1","hk_Layer_1.ref.tpl","gr",
                             zn_array=ib_layer_1,shape=(500,500))

write_zone_tpl(name, tpl_file, suffix='', zn_array=None, shape=None, longnames=False, fill_value='1.0')

write a zone-based template file for a 2-D array

Parameters:

Name Type Description Default
name `str`

the base parameter name

required
tpl_file `str`

the template file to write

required
suffix `str`

suffix to add to parameter names. Only used if longnames=True

''
zn_array `numpy.ndarray`

an array used to skip inactive cells, and optionally get shape info. zn_array values less than 1 are given fill_value

None
shape `tuple`

tuple nrow and ncol. Either zn_array or shape must be passed

None
longnames `bool`

flag to use longer names that exceed 12 chars in length. Default is False.

False
fill_value `str`

value to fill locations where zn_array is zero or less. Default is "1.0".

'1.0'

Returns:

Type Description

pandas.DataFrame: a dataframe with parameter information

Note

This function is used during the PstFrom setup process