utils
pyemu utils module contains lots of useful functions and classes, including support for geostatistical interpolation and covariance matrices, pilot point setup and processing and functionality dedicated to wrapping MODFLOW models into the PEST(++) model independent framework
Cov
Bases: Matrix
Diagonal and/or dense Covariance matrices
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
x
|
`numpy.ndarray`
|
numeric values |
None
|
names
|
[`str`]
|
list of row and column names |
[]
|
isdigonal
|
`bool`
|
flag if the Matrix is diagonal |
required |
autoalign
|
`bool`
|
flag to control the autoalignment of Matrix during linear algebra operations |
True
|
Example::
data = pyemu.en.rng.random((10,10))
names = ["par_{0}".format(i) for i in range(10)]
mat = pyemu.Cov(x=data,names=names)
mat.to_binary("mat.jco")
Note
row_names and col_names args are supported in the constructor
so support inheritance. However, users should only pass names
T
property
wrapper function for Matrix.transpose() method
Returns:
| Type | Description |
|---|---|
|
|
Note
returns a copy of self
A syntatic-sugar overload of Matrix.transpose()
Example::
jcb = pyemu.Jco.from_binary("pest.jcb")
jcbt = jcb.T
as_2d
property
get a 2D numeric representation of Matrix.x. If not isdiagonal, simply
return reference to Matrix.x, otherwise, constructs and returns
a 2D, diagonal ndarray
Returns:
| Type | Description |
|---|---|
|
|
Example::
# A diagonal cov
cov = pyemu.Cov.from_parameter_data
x2d = cov.as_2d # a numpy 2d array
print(cov.shape,cov.x.shape,x2d.shape)
full_s
property
Get the full singular value matrix
Returns:
| Type | Description |
|---|---|
|
|
|
|
with zeros along the diagonal from |
Example::
jco = pyemu.Jco.from_binary("pest.jcb")
jco.full_s.to_ascii("full_sing_val_matrix.mat")
identity
property
get an identity Cov of the same shape
Returns:
| Type | Description |
|---|---|
|
|
Note
the returned identity matrix has the same row-col names as self
inv
property
inversion operation of Matrix
Returns:
| Type | Description |
|---|---|
|
|
Note
uses numpy.linalg.inv for the inversion
Example::
mat = pyemu.Matrix.from_binary("my.jco")
mat_inv = mat.inv
mat_inv.to_binary("my_inv.jco")
names
property
wrapper for getting row_names. row_names == col_names for Cov
Returns:
| Type | Description |
|---|---|
|
[ |
ncol
property
length of second dimension
Returns:
| Type | Description |
|---|---|
|
|
newx
property
return a copy of Matrix.x attribute
Returns:
| Type | Description |
|---|---|
|
|
nrow
property
length of first dimension
Returns:
| Type | Description |
|---|---|
|
|
s
property
the singular value (diagonal) Matrix
Returns:
| Type | Description |
|---|---|
|
|
Example::
# plot the singular spectrum of the jacobian
import matplotlib.pyplot as plt
jco = pyemu.Jco.from_binary("pest.jcb")
plt.plot(jco.s.x)
plt.show()
shape
property
get the implied, 2D shape of Matrix
Returns:
| Type | Description |
|---|---|
|
|
Example::
jco = pyemu.Jco.from_binary("pest.jcb")
shape = jco.shape
nrow,ncol = shape #unpack to ints
sqrt
property
element-wise square root operation
Returns:
| Type | Description |
|---|---|
|
|
Note
uses numpy.sqrt
Example::
cov = pyemu.Cov.from_uncfile("my.unc")
sqcov = cov.sqrt
sqcov.to_uncfile("sqrt_cov.unc")
transpose
property
transpose operation of self
Returns:
| Type | Description |
|---|---|
|
|
Example::
jcb = pyemu.Jco.from_binary("pest.jcb")
jcbt = jcb.T
u
property
the left singular vector Matrix
Returns:
| Type | Description |
|---|---|
|
|
Example::
jco = pyemu.Jco.from_binary("pest.jcb")
jco.u.to_binary("u.jcb")
v
property
the right singular vector Matrix
Returns:
| Type | Description |
|---|---|
|
|
Example::
jco = pyemu.Jco.from_binary("pest.jcb")
jco.v.to_binary("v.jcb")
x
property
return a reference to Matrix.x
Returns:
| Type | Description |
|---|---|
|
|
zero
property
get a diagonal instance of Cov with all zeros on the diagonal
Returns:
| Type | Description |
|---|---|
|
|
zero2d
property
get an 2D instance of self with all zeros
Returns:
| Type | Description |
|---|---|
|
|
Example::
jco = pyemu.Jco.from_binary("pest.jcb")
zero_jco = jco.zero2d
__add__(other)
Overload of numpy.ndarray.add() - elementwise addition. Tries to speedup by checking for scalars of diagonal matrices on either side of operator
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
other
|
( |
required |
Returns:
| Type | Description |
|---|---|
|
|
Note
if Matrix and other (if applicable) have autoalign set to True,
both Matrix and other are aligned based on row and column names.
If names are not common between the two, this may result in a smaller
returned Matrix. If no names are shared, an exception is raised.
The addition of a scalar does not expand non-zero elements.
Example::
m1 = pyemu.Cov.from_parameter_data(pst)
m1_plus_on1 = m1 + 1.0 #add 1.0 to all non-zero elements
m2 = m1.copy()
m2_plus_m1 = m1 + m2
__getitem__(item)
a very crude overload of object.getitem().
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
item
|
`object`
|
something that can be used as an index |
required |
Returns:
| Type | Description |
|---|---|
|
|
__mul__(other)
Dot product multiplication overload. Tries to speedup by checking for scalars or diagonal matrices on either side of operator
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
other
|
( |
required |
Returns:
| Type | Description |
|---|---|
|
|
Note
if Matrix and other (if applicable) have autoalign set to True,
both Matrix and other are aligned based on row and column names.
If names are not common between the two, this may result in a smaller
returned Matrix. If not common elements are found, an exception is raised
Example::
jco = pyemu.Jco.from_binary("pest.jcb")
cov = pyemu.Cov.from_parmaeter_data(pst)
# get the forecast prior covariance matrix
forecast_cov = jco.get(pst.forecast_names).T * cov * jco.get(pst.forecast_names)
__pow__(power)
overload of numpy.ndarray.pow() operator
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
power
|
`float`
|
interpreted as follows: -1 = inverse of self, -0.5 = sqrt of inverse of self, 0.5 = sqrt of self. All other positive ints = elementwise self raised to power |
required |
Returns:
| Type | Description |
|---|---|
|
|
Example::
cov = pyemu.Cov.from_uncfile("my.unc")
sqcov = cov**2
__rmul__(other)
Reverse order Dot product multiplication overload.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
other
|
( |
required |
Returns:
| Type | Description |
|---|---|
|
|
Note
if Matrix and other (if applicable) have autoalign set to True,
both Matrix and other are aligned based on row and column names.
If names are not common between the two, this may result in a smaller
returned Matrix. If not common elements are found, an exception is raised
Example::
# multiply by a scalar
jco = pyemu.Jco.from_binary("pest.jcb")
jco_times_10 = 10 * jco
__set_svd()
private method to set SVD components.
Note: this should not be called directly
__str__()
overload of object.str()
Returns:
| Type | Description |
|---|---|
|
|
__sub__(other)
numpy.ndarray.sub() overload. Tries to speedup by checking for scalars of diagonal matrices on either side of operator
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
other
|
( |
required |
Returns:
| Type | Description |
|---|---|
|
|
Note
if Matrix and other (if applicable) have autoalign set to True,
both Matrix and other are aligned based on row and column names.
If names are not common between the two, this may result in a smaller
returned Matrix. If no names are shared, an exception is raised
Example::
jco1 = pyemu.Jco.from_binary("pest.1.jcb")
jco2 = pyemu.Jco.from_binary("pest.2.jcb")
diff = jco1 - jco2
align(names, axis=None)
reorder Matrix by names in place. If axis is None, reorder both indices
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
names
|
[str]
|
names in |
required |
axis
|
`int`
|
the axis to reorder. if None, reorder both axes |
None
|
Note
Works in place. Is called programmatically during linear algebra operations
Example::
# load a jco that has more obs (rows) than non-zero weighted obs
# in the control file
jco = pyemu.Jco.from_binary("pest.jcb")
# get an obs noise cov matrix
obscov = pyemu.Cov.from_observation_data(pst)
jco.align(obscov.row_names,axis=0)
condition_on(conditioning_elements)
get a new Covariance object that is conditional on knowing some elements. uses Schur's complement for conditional Covariance propagation
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
conditioning_elements
|
[str]
|
list of names of elements to condition on |
required |
Returns:
| Type | Description |
|---|---|
|
|
Example::
prior_cov = pyemu.Cov.from_parameter_data(pst)
now_known_pars = pst.adj_par_names[:5]
post_cov = prior_cov.condition_on(now_known_pars)
copy()
get a copy of Matrix
Returns:
| Type | Description |
|---|---|
|
|
df()
wrapper of Matrix.to_dataframe()
drop(names, axis)
drop elements from Matrix in place
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
names
|
[str]
|
list of names to drop |
required |
axis
|
`int`
|
the axis to drop from. must be in [0,1] |
required |
element_isaligned(other)
check if matrices are aligned for element-wise operations
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
other
|
`Matrix`
|
the other matrix to check for alignment with |
required |
Returns:
| Type | Description |
|---|---|
|
|
extend(other)
extend Matrix with the elements of other, returning a new matrix.
Args:
other (Matrix): the Matrix to extend self by
Returns:
| Type | Description |
|---|---|
|
|
Note
No row or column names can be shared between self and other
Example::
jco1 = pyemu.Jco.from_binary("pest_history.jco")
jco2 = pyemu.Jco.from_binary("pest_forecast.jco")
jco_ext = jco1.extend(jco2)
extract(row_names=None, col_names=None)
wrapper method that Matrix.gets() then Matrix.drops() elements.
one of row_names or col_names must be not None.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
row_names
|
[str]
|
row_names to extract. If |
None
|
col_names
|
[str]
|
col_names to extract. If |
None
|
Returns:
| Type | Description |
|---|---|
|
|
Example::
cov = pyemu.Cov.from_parameter_data(pst)
hk_cov = cov.extract(row_names=["hk1","hk2","hk3"])
find_rowcol_indices(names, row_names, col_names, axis=None)
staticmethod
fast(er) look of row and column names indices
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
names
|
[`str`]
|
list of names to look for in |
required |
row_names
|
[`str`]
|
list of row names |
required |
col_names
|
[`str`]
|
list of column names |
required |
axis
|
`int`
|
axis to search along. If None, search both.
Default is |
None
|
Returns:
| Type | Description |
|---|---|
|
|
|
|
|
from_ascii(filename)
classmethod
load a PEST-compatible ASCII matrix/vector file into a
Matrix instance
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
filename
|
`str`
|
name of the file to read |
required |
Returns:
| Type | Description |
|---|---|
|
|
Example::
mat = pyemu.Matrix.from_ascii("my.mat")
cov = pyemi.Cov.from_ascii("my.cov")
from_binary(filename, forgive=False)
classmethod
class method load from PEST-compatible binary file into a Matrix instance
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
filename
|
`str`
|
filename to read |
required |
forgive
|
`bool`
|
flag to forgive incomplete data records. Only
applicable to dense binary format. Default is |
False
|
Returns:
| Type | Description |
|---|---|
|
|
Example::
mat = pyemu.Matrix.from_binary("my.jco")
cov = pyemu.Cov.from_binary("large_cov.jcb")
from_dataframe(df)
classmethod
class method to create a new Matrix instance from a
pandas.DataFrame
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
df
|
`pandas.DataFrame`
|
dataframe |
required |
Returns:
| Type | Description |
|---|---|
|
|
Example::
df = pd.read_csv("my.csv")
mat = pyemu.Matrix.from_dataframe(df)
from_fortranfile(filename)
staticmethod
a binary load method to accommodate one of the many bizarre fortran binary writing formats
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
filename
|
`str`
|
name of the binary matrix file |
required |
Returns:
| Type | Description |
|---|---|
|
tuple containing |
|
|
|
|
|
|
from_names(row_names, col_names, isdiagonal=False, autoalign=True, random=False)
classmethod
class method to create a new Matrix instance from row names and column names, filled with trash
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
row_names
|
[str]
|
row names for the new |
required |
col_names
|
[str]
|
col_names for the new matrix |
required |
isdiagonal
|
`bool`
|
flag for diagonal matrix. Default is False |
False
|
autoalign
|
`bool`
|
flag for autoaligning new matrix during linear algebra calcs. Default is True |
True
|
random
|
`bool`
|
flag for contents of the trash matrix. If True, fill with random numbers, if False, fill with zeros Default is False |
False
|
Returns:
| Type | Description |
|---|---|
|
|
Example::
row_names = ["row_1","row_2"]
col_names = ["col_1,"col_2"]
m = pyemu.Matrix.from_names(row_names,col_names)
from_observation_data(pst)
classmethod
instantiates a Cov from pyemu.Pst.observation_data
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
pst
|
`pyemu.Pst`
|
control file instance |
required |
Returns:
| Type | Description |
|---|---|
|
|
|
|
weights in the pest control file. Zero-weighted observations |
|
|
are included with a weight of 1.0e-30 |
Example::
obscov = pyemu.Cov.from_observation_data(pst)
from_obsweights(pst_file)
classmethod
instantiates a Cov instance from observation weights in
a PEST control file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
pst_file
|
`str`
|
pest control file name |
required |
Returns:
| Type | Description |
|---|---|
|
|
|
|
weights in the pest control file. Zero-weighted observations |
|
|
are included with a weight of 1.0e-30 |
Note
Calls Cov.from_observation_data()
Example::
obscov = pyemu.Cov.from_obsweights("my.pst")
from_parameter_data(pst, sigma_range=4.0, scale_offset=True, subset=None)
classmethod
Instantiates a Cov from a pest control file parameter data section using
parameter bounds as a proxy for uncertainty.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
pst_file
|
`str`
|
pest control file name |
required |
sigma_range
|
`float`
|
defines range of upper bound - lower bound in terms of standard deviation (sigma). For example, if sigma_range = 4, the bounds represent 4 * sigma. Default is 4.0, representing approximately 95% confidence of implied normal distribution |
4.0
|
scale_offset
|
`bool`
|
flag to apply scale and offset to parameter upper and lower bounds before calculating variance. In some cases, not applying scale and offset can result in undefined (log) variance. Default is True. |
True
|
subset
|
`list`-like
|
Subset of parameters to draw |
None
|
Returns:
| Type | Description |
|---|---|
|
|
Note
Calls Cov.from_parameter_data()
from_parbounds(pst_file, sigma_range=4.0, scale_offset=True)
classmethod
Instantiates a Cov from a pest control file parameter data section using
parameter bounds as a proxy for uncertainty.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
pst_file
|
`str`
|
pest control file name |
required |
sigma_range
|
`float`
|
defines range of upper bound - lower bound in terms of standard deviation (sigma). For example, if sigma_range = 4, the bounds represent 4 * sigma. Default is 4.0, representing approximately 95% confidence of implied normal distribution |
4.0
|
scale_offset
|
`bool`
|
flag to apply scale and offset to parameter upper and lower bounds before calculating variance. In some cases, not applying scale and offset can result in undefined (log) variance. Default is True. |
True
|
Returns:
| Type | Description |
|---|---|
|
|
Note
Calls Cov.from_parameter_data()
from_uncfile(filename, pst=None)
classmethod
instaniates a Cov from a PEST-compatible uncertainty file
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
filename
|
`str`
|
uncertainty file name |
required |
pst
|
'pyemu.Pst`
|
a control file instance. this is needed if "first_parameter" and "last_parameter" keywords. Default is None |
None
|
Returns:
| Type | Description |
|---|---|
|
|
Example::
cov = pyemu.Cov.from_uncfile("my.unc")
get(row_names=None, col_names=None, drop=False)
get a new Matrix instance ordered on row_names or col_names
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
row_names
|
[str]
|
row_names for new Matrix. If |
None
|
col_names
|
[str]
|
col_names for new Matrix. If |
None
|
drop
|
`bool`
|
flag to remove row_names and/or col_names from this |
False
|
Returns:
| Type | Description |
|---|---|
|
|
Example:: # load a jco that has more obs (rows) than non-zero weighted obs # in the control file jco = pyemu.Jco.from_binary("pest.jcb") # get an obs noise cov matrix obscov = pyemu.Cov.from_observation_data(pst) nnz_jco = jco.get(row_names = obscov.row_names)
get_dense_binary_info(filename)
staticmethod
read the header and row and offsets for a dense binary file.
Parameters
filename (`str`): dense binary filename
Returns:
| Type | Description |
|---|---|
|
tuple containing |
|
|
|
|
|
|
|
|
get_diagonal_vector(col_name='diag')
Get a new Matrix instance that is the diagonal of self. The shape of the new matrix is (self.shape[0],1). Self must be square
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
col_name
|
`str`
|
the name of the single column in the new Matrix |
'diag'
|
Returns:
| Type | Description |
|---|---|
|
|
Example::
cov = pyemu.Cov.from_unc_file("my.unc")
cov_diag = cov.get_diagonal_vector()
print(cov_diag.col_names)
get_maxsing(eigthresh=1e-05)
Get the number of singular components with a singular value ratio greater than or equal to eigthresh
Args:
eigthresh (float): the ratio of smallest to largest
singular value to retain. Since it is assumed that
s is sorted from largest to smallest, once a singular value
is reached that yields a ratio with the first (largest)
singular value, the index of this singular is returned.
Returns:
| Type | Description |
|---|---|
|
|
|
|
first singular value is less than or equal to |
Note
this method calls the static method Matrix.get_maxsing_from_s()
with Matrix.s.x
Example::
jco = pyemu.Jco.from_binary("pest.jco")
max_sing = jco.get_maxsing(eigthresh=pst.svd_data.eigthresh)
get_maxsing_from_s(s, eigthresh=1e-05)
staticmethod
static method to work out the maxsing for a given singular spectrum
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
s
|
`numpy.ndarray`
|
1-D array of singular values. This
array should come from calling either |
required |
eigthresh
|
`float`
|
the ratio of smallest to largest
singular value to retain. Since it is assumed that
|
1e-05
|
Returns:
| Type | Description |
|---|---|
|
|
|
|
first singular value is less than or equal to |
Example::
jco = pyemu.Jco.from_binary("pest.jco")
max_sing = pyemu.Matrix.get_maxsing_from_s(jco.s,eigthresh=pst.svd_data.eigthresh)
hadamard_product(other)
Overload of numpy.ndarray.mult(): element-wise multiplication. Tries to speedup by checking for scalars of diagonal matrices on either side of operator
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
other
|
( |
required |
Returns:
| Type | Description |
|---|---|
|
|
Note
if Matrix and other (if applicable) have autoalign set to True,
both Matrix and other are aligned based on row and column names.
If names are not common between the two, this may result in a smaller
returned Matrix. If not common elements are shared, an exception is raised
Example::
cov = pyemu.Cov.from_parameter_data(pst)
cov2 = cov * 10
cov3 = cov * cov2
identity_like(other)
classmethod
Get an identity matrix Cov instance like other Cov
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
other
|
`Matrix`
|
other matrix - must be square |
required |
Returns:
| Type | Description |
|---|---|
|
|
Note
the returned identity cov matrix is treated as non-diagonal
indices(names, axis=None)
get the row and col indices of names. If axis is None, two ndarrays are returned, corresponding the indices of names for each axis
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
names
|
[`str`]
|
list of names to look for in |
required |
row_names
|
[`str`]
|
list of row names |
required |
col_names
|
[`str`]
|
list of column names |
required |
axis
|
`int`
|
axis to search along. If None, search both.
Default is |
None
|
Returns:
| Type | Description |
|---|---|
|
|
|
|
|
Note
thin wrapper around Matrix.find_rowcol_indices static method
mult_isaligned(other)
check if matrices are aligned for dot product multiplication
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
other
|
`Matrix`
|
the other matrix to check for alignment with |
required |
Returns:
| Type | Description |
|---|---|
|
|
pseudo_inv(maxsing=None, eigthresh=1e-05)
The pseudo inverse of self. Formed using truncated singular
value decomposition and Matrix.pseudo_inv_components
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
maxsing
|
`int`
|
the number of singular components to use. If None,
|
None
|
`eigthresh`
|
( |
required |
Returns:
| Type | Description |
|---|---|
|
|
Example::
jco = pyemu.Jco.from_binary("pest.jcb")
jco_psinv = jco.pseudo_inv(pst.svd_data.maxsing,pst.svd_data.eigthresh)
jco_psinv.to_binary("pseudo_inv.jcb")
pseudo_inv_components(maxsing=None, eigthresh=1e-05, truncate=True)
Get the (optionally) truncated SVD components
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
maxsing
|
`int`
|
the number of singular components to use. If None,
|
None
|
`eigthresh`
|
( |
required | |
truncate
|
`bool`
|
flag to truncate components. If False, U, s, and V will be
zeroed out at locations greater than |
True
|
Returns:
| Type | Description |
|---|---|
|
tuple containing |
|
|
|
|
|
|
Example::
mat = pyemu.Matrix.from_binary("my.jco")
u1,s1,v1 = mat.pseudo_inv_components(maxsing=10)
resolution_matrix = v1 * v1.T
resolution_matrix.to_ascii("resol.mat")
read_ascii(filename)
staticmethod
read a PEST-compatible ASCII matrix/vector file
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
filename
|
`str`
|
file to read from |
required |
Returns:
| Type | Description |
|---|---|
|
tuple containing |
|
|
|
|
|
|
|
|
read_binary(filename, forgive=False)
staticmethod
static method to read PEST-format binary files
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
filename
|
`str`
|
filename to read |
required |
forgive
|
`bool`
|
flag to forgive incomplete data records. Only
applicable to dense binary format. Default is |
False
|
Returns:
| Type | Description |
|---|---|
|
tuple containing |
|
|
|
|
|
|
read_binary_header(filename)
staticmethod
read the first elements of a PEST(++)-style binary file to get format and dimensioning information.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
filename
|
`str`
|
the filename of the binary file |
required |
Returns:
| Type | Description |
|---|---|
|
tuple containing |
|
|
|
|
|
|
read_dense(filename, forgive=False, close=True, only_rows=None)
staticmethod
read a dense-format binary file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
filename
|
`str`
|
the filename or the open filehandle |
required |
forgive
|
`bool`
|
flag to forgive incomplete records. If True and an incomplete record is encountered, only the previously read records are returned. If False, an exception is raised for an incomplete record |
False
|
close
|
`bool`
|
flag to close the filehandle. Default is True |
True
|
only_rows
|
`iterable`
|
rows to read. If None, all rows are read |
None
|
Returns:
| Type | Description |
|---|---|
|
tuple containing |
|
|
|
|
|
|
replace(other)
replace elements in the covariance matrix with elements from other.
if other is not diagonal, then this Cov becomes non diagonal
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
`Cov`
|
the Cov to replace elements in this |
required |
Note
operates in place. Other must have the same row-col names as self
reset_x(x, copy=True)
reset self.__x private attribute
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
x
|
`numpy.ndarray`
|
the new numeric data |
required |
copy
|
`bool`
|
flag to make a copy of 'x'. Default is True |
True
|
Returns:
| Type | Description |
|---|---|
|
None |
Note
operates in place
to_2d()
get a 2D Matrix representation of Matrix. If not Matrix.isdiagonal, simply
return a copy of Matrix, otherwise, constructs and returns a new Matrix
instance that is stored as diagonal
Returns:
| Type | Description |
|---|---|
|
|
Example::
# A diagonal cov
cov = pyemu.Cov.from_parameter_data
cov2d = cov.as_2d # a numpy 2d array
print(cov.shape,cov.x.shape,cov2d.shape,cov2d.x.shape)
to_ascii(filename, icode=2)
write a PEST-compatible ASCII Matrix/vector file
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
filename
|
`str`
|
filename to write to |
required |
icode
|
`int`
|
PEST-style info code for matrix style. Default is 2. |
2
|
Note:
if icode == -1, a 1-d vector is written that represents a diagonal matrix. An
exception is raised if icode == -1 and isdiagonal is False
to_binary(filename, droptol=None, chunk=None)
write a PEST-compatible binary file. The format is the same as the format used to storage a PEST Jacobian matrix
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
filename
|
`str`
|
filename to save binary file |
required |
droptol
|
`float`
|
absolute value tolerance to make values
smaller |
None
|
chunk
|
`int`
|
number of elements to write in a single pass.
Default is |
None
|
to_coo(filename, droptol=None, chunk=None)
write an extended PEST-format binary file. The data format is
[int,int,float] for i,j,value. It is autodetected during
the read with Matrix.from_binary().
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
filename
|
`str` or `Path`
|
filename to save binary file |
required |
droptol
|
`float`
|
absolute value tolerance to make values
smaller |
None
|
chunk
|
`int`
|
number of elements to write in a single pass.
Default is |
None
|
Note
This method is needed when the number of dimensions times 2 is larger than the max value for a 32-bit integer. happens! This method is used by pyemu.Ensemble.to_binary()
to_dataframe()
return a pandas.DataFrame representation of Matrix
Returns:
| Type | Description |
|---|---|
|
|
Note
if self.isdiagonal is True, the full matrix is used to fill
the dataframe - lots of zeros.
to_dense(filename, close=True)
experimental new dense matrix storage format to support faster I/O with ensembles
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
filename
|
`str`
|
the filename to save to |
required |
close
|
`bool`
|
flag to close the filehandle after saving |
True
|
Returns:
| Name | Type | Description |
|---|---|---|
f |
`file`
|
the file handle. Only returned if |
Note
calls Matrix.write_dense()
to_pearson()
Convert Cov instance to Pearson correlation coefficient matrix
Returns:
| Type | Description |
|---|---|
|
|
|
|
on purpose so that it is clear the returned instance is not a Cov |
Example::
# plot the posterior parameter correlation matrix
import matplotlib.pyplot as plt
cov = pyemu.Cov.from_ascii("pest.post.cov")
cc = cov.to_pearson()
cc.x[cc.x==1.0] = np.nan
plt.imshow(cc)
to_uncfile(unc_file, covmat_file='cov.mat', var_mult=1.0, include_path=False)
write a PEST-compatible uncertainty file
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
unc_file
|
`str`
|
filename of the uncertainty file |
required |
covmat_file
|
`str`
|
covariance matrix filename. Default is
"Cov.mat". If None, and Cov.isdiaonal, then a standard deviation
form of the uncertainty file is written. Exception raised if |
'cov.mat'
|
var_mult
|
`float`
|
variance multiplier for the covmat_file entry |
1.0
|
include_path
|
`bool`
|
flag to include the path of |
False
|
Example::
cov = pyemu.Cov.from_parameter_data(pst)
cov.to_uncfile("my.unc")
write_dense(filename, row_names, col_names, data, close=False)
staticmethod
experimental new dense matrix storage format to support faster I/O with ensembles
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
filename
|
`str` or `file`
|
the file to write to |
required |
row_names
|
[`str`]
|
row names of the matrix |
required |
col_names
|
[`str`]
|
col names of the matrix |
required |
data
|
`np.ndarray`
|
matrix elements |
required |
close
|
`bool`
|
flag to close the file after writing |
False
|
Returns:
| Name | Type | Description |
|---|---|---|
f |
`file`
|
the file handle. Only returned if |
ExpVario
Bases: Vario2d
Exponential variogram derived type
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
contribution
|
float
|
sill of the variogram |
required |
a
|
`float`
|
(practical) range of correlation |
required |
anisotropy
|
`float`
|
Anisotropy ratio. Default is 1.0 |
1.0
|
bearing
|
( |
required | |
name
|
`str`, optinoal
|
name of the variogram. Default is "var1" |
'var1'
|
Example::
v = pyemu.utils.geostats.ExpVario(a=1000,contribution=1.0)
bearing_rads
property
get the bearing of the Vario2d in radians
Returns:
| Type | Description |
|---|---|
|
|
rotation_coefs
property
get the rotation coefficients in radians
Returns:
| Type | Description |
|---|---|
|
[ |
__str__()
get the str representation of Vario2d
Returns:
| Type | Description |
|---|---|
|
|
covariance(pt0, pt1)
get the covariance between two points implied by Vario2d
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
pt0
|
([ |
required | |
pt1
|
([ |
required |
Returns:
| Type | Description |
|---|---|
|
|
covariance_matrix(x, y, names=None, cov=None)
build a pyemu.Cov instance implied by Vario2d
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
x
|
[`float`]
|
x-coordinate locations |
required |
y
|
[`float`]
|
y-coordinate locations |
required |
names
|
[`str`]
|
names of locations. If None, cov must not be None |
None
|
cov
|
`pyemu.Cov`
|
an existing Cov instance. Vario2d contribution is added to cov |
None
|
Returns:
| Type | Description |
|---|---|
|
|
Note
either names or cov must not be None.
covariance_points(x0, y0, xother, yother)
get the covariance between base point (x0,y0) and
other points xother,yother implied by Vario2d
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
x0
|
`float`
|
x-coordinate |
required |
y0
|
`float`
|
y-coordinate |
required |
xother
|
[`float`]
|
x-coordinates of other points |
required |
yother
|
[`float`]
|
y-coordinates of other points |
required |
Returns:
| Type | Description |
|---|---|
|
|
|
|
points contained in xother, yother. len(cov) = len(xother) = |
|
|
len(yother) |
inv_h(h)
the inverse of the h_function. Used for plotting
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
h
|
`float`
|
the value of h_function to invert |
required |
Returns:
| Type | Description |
|---|---|
|
|
plot(**kwargs)
get a cheap plot of the Vario2d
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
**kwargs
|
`dict`
|
keyword arguments to use for plotting |
{}
|
Returns:
| Type | Description |
|---|---|
|
|
Note
optional arguments in kwargs include
"ax" (existing matplotlib.pyplot.axis). Other
kwargs are passed to matplotlib.pyplot.plot()
to_struct_file(f)
write the Vario2d to a PEST-style structure file
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
f
|
`str`
|
filename to write to. |
required |
GauVario
Bases: Vario2d
Gaussian variogram derived type
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
contribution
|
float
|
sill of the variogram |
required |
a
|
`float`
|
(practical) range of correlation |
required |
anisotropy
|
`float`
|
Anisotropy ratio. Default is 1.0 |
1.0
|
bearing
|
( |
required | |
name
|
`str`, optinoal
|
name of the variogram. Default is "var1" |
'var1'
|
Example::
v = pyemu.utils.geostats.GauVario(a=1000,contribution=1.0)
Note
the Gaussian variogram can be unstable (not invertible) for long ranges.
bearing_rads
property
get the bearing of the Vario2d in radians
Returns:
| Type | Description |
|---|---|
|
|
rotation_coefs
property
get the rotation coefficients in radians
Returns:
| Type | Description |
|---|---|
|
[ |
__str__()
get the str representation of Vario2d
Returns:
| Type | Description |
|---|---|
|
|
covariance(pt0, pt1)
get the covariance between two points implied by Vario2d
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
pt0
|
([ |
required | |
pt1
|
([ |
required |
Returns:
| Type | Description |
|---|---|
|
|
covariance_matrix(x, y, names=None, cov=None)
build a pyemu.Cov instance implied by Vario2d
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
x
|
[`float`]
|
x-coordinate locations |
required |
y
|
[`float`]
|
y-coordinate locations |
required |
names
|
[`str`]
|
names of locations. If None, cov must not be None |
None
|
cov
|
`pyemu.Cov`
|
an existing Cov instance. Vario2d contribution is added to cov |
None
|
Returns:
| Type | Description |
|---|---|
|
|
Note
either names or cov must not be None.
covariance_points(x0, y0, xother, yother)
get the covariance between base point (x0,y0) and
other points xother,yother implied by Vario2d
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
x0
|
`float`
|
x-coordinate |
required |
y0
|
`float`
|
y-coordinate |
required |
xother
|
[`float`]
|
x-coordinates of other points |
required |
yother
|
[`float`]
|
y-coordinates of other points |
required |
Returns:
| Type | Description |
|---|---|
|
|
|
|
points contained in xother, yother. len(cov) = len(xother) = |
|
|
len(yother) |
inv_h(h)
the inverse of the h_function. Used for plotting
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
h
|
`float`
|
the value of h_function to invert |
required |
Returns:
| Type | Description |
|---|---|
|
|
plot(**kwargs)
get a cheap plot of the Vario2d
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
**kwargs
|
`dict`
|
keyword arguments to use for plotting |
{}
|
Returns:
| Type | Description |
|---|---|
|
|
Note
optional arguments in kwargs include
"ax" (existing matplotlib.pyplot.axis). Other
kwargs are passed to matplotlib.pyplot.plot()
to_struct_file(f)
write the Vario2d to a PEST-style structure file
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
f
|
`str`
|
filename to write to. |
required |
GeoStruct
Bases: object
a geostatistical structure object that mimics the behavior of a PEST geostatistical structure. The object contains variogram instances and (optionally) nugget information.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
nugget
|
`float` (optional)
|
nugget contribution. Default is 0.0 |
0.0
|
variograms
|
([ |
required | |
name
|
`str` (optional)
|
name to assign the structure. Default is "struct1". |
'struct1'
|
transform
|
`str` (optional)
|
the transformation to apply to
the GeoStruct. Can be "none" or "log", depending on the
transformation of the property being represented by the |
'none'
|
Example::
v = pyemu.utils.geostats.ExpVario(a=1000,contribution=1.0)
gs = pyemu.utils.geostats.GeoStruct(variograms=v,nugget=0.5)
gs.plot()
# get a covariance matrix implied by the geostruct for three points
px = [0,1000,2000]
py = [0,0,0]
pnames ["p1","p2","p3"]
cov = gs.covariance_matrix(px,py,names=pnames)
nugget = float(nugget)
instance-attribute
float: the nugget effect contribution
sill
property
get the sill of the GeoStruct
Returns:
| Type | Description |
|---|---|
|
|
|
|
nugget and contribution from each variogram |
transform = transform
instance-attribute
str: the transformation of the GeoStruct. Can be 'log' or 'none'
variograms = variograms
instance-attribute
[pyemu.utils.geostats.Vario2d]: a list of variogram instances
__str__()
the str representation of the GeoStruct
Returns:
| Type | Description |
|---|---|
|
|
covariance(pt0, pt1)
get the covariance between two points implied by the GeoStruct.
This is used during the ordinary kriging process to get the RHS
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
pt0
|
[`float`]
|
xy-pair |
required |
pt1
|
[`float`]
|
xy-pair |
required |
Returns:
| Type | Description |
|---|---|
|
|
|
|
by the GeoStruct |
Example::
p1 = [0,0]
p2 = [1,1]
v = pyemu.geostats.ExpVario(a=0.1,contribution=1.0)
gs = pyemu.geostats.Geostruct(variograms=v)
c = gs.covariance(p1,p2)
covariance_matrix(x, y, names=None, cov=None)
build a pyemu.Cov instance from GeoStruct
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
x
|
[`floats`]
|
x-coordinate locations |
required |
y
|
[`float`]
|
y-coordinate locations |
required |
names
|
[`str`] (optional)
|
names of location. If None, cov must not be None. Default is None. |
None
|
cov
|
`pyemu.Cov`
|
an existing Cov instance. The contribution of this GeoStruct is added to cov. If cov is None, names must not be None. Default is None |
None
|
Returns:
| Type | Description |
|---|---|
|
|
|
|
GeoStruct for the x,y pairs. |
|
|
names supplied by the names argument unless the "cov" |
|
|
argument was passed. |
Note
either "names" or "cov" must be passed. If "cov" is passed, cov.shape must equal len(x) and len(y).
Example::
pp_df = pyemu.pp_utils.pp_file_to_dataframe("hkpp.dat")
cov = gs.covariance_matrix(pp_df.x,pp_df.y,pp_df.name)
cov.to_binary("cov.jcb")
covariance_points(x0, y0, xother, yother)
Get the covariance between point (x0,y0) and the points contained in xother, yother.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
x0
|
`float`
|
x-coordinate |
required |
y0
|
`float`
|
y-coordinate |
required |
xother
|
[`float`]
|
x-coordinates of other points |
required |
yother
|
[`float`]
|
y-coordinates of other points |
required |
Returns:
| Type | Description |
|---|---|
|
|
|
|
points contained in xother, yother. len(cov) = len(xother) = |
|
|
len(yother) |
Example::
x0,y0 = 1,1
xother = [2,3,4,5]
yother = [2,3,4,5]
v = pyemu.geostats.ExpVario(a=0.1,contribution=1.0)
gs = pyemu.geostats.Geostruct(variograms=v)
c = gs.covariance_points(x0,y0,xother,yother)
plot(**kwargs)
make a cheap plot of the GeoStruct
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
**kwargs
|
(dict) keyword arguments to use for plotting. |
required |
Returns:
matplotlib.pyplot.axis: the axis with the GeoStruct plot
Note
optional arguments include "ax" (an existing axis), "individuals" (plot each variogram on a separate axis), "legend" (add a legend to the plot(s)). All other kwargs are passed to matplotlib.pyplot.plot()
Example::
v = pyemu.geostats.ExpVario(a=0.1,contribution=1.0)
gs = pyemu.geostats.Geostruct(variograms=v)
gs.plot()
same_as_other(other)
compared to geostructs for similar attributes
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
other
|
`pyemu.geostats.Geostruct`
|
the other one |
required |
Returns:
| Name | Type | Description |
|---|---|---|
same |
`bool`
|
True is the |
to_struct_file(f)
write a PEST-style structure file
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
f
|
`str`
|
file to write the GeoStruct information in to. Can also be an open file handle |
required |
GsfReader
a helper class to read a standard modflow-usg gsf file
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
gsffilename
|
`str`
|
filename |
required |
get_node_coordinates(zcoord=False, zero_based=False)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
zcoord
|
`bool`
|
flag to add z coord to coordinates. Default is False |
False
|
zero_based
|
`bool`
|
flag to subtract one from the node numbers in the returned node_coords dict. This is needed to support PstFrom. Default is False |
False
|
Returns:
| Name | Type | Description |
|---|---|---|
node_coords |
Dictionary containing x and y coordinates for each node |
get_node_data()
Returns:
| Name | Type | Description |
|---|---|---|
nodedf |
a pd.DataFrame containing Node information; Node, X, Y, Z, layer, numverts, vertidx |
get_vertex_coordinates()
Returns:
| Type | Description |
|---|---|
|
Dictionary containing list of x, y and z coordinates for each vertex |
Logger
Bases: object
a basic class for logging events during the linear analysis calculations if filename is passed, then a file handle is opened.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
filename
|
`str`
|
Filename to write logged events to. If False, no file will be created, and logged events will be displayed on standard out. |
required |
echo
|
`bool`
|
Flag to cause logged events to be echoed to the screen. |
False
|
log(phrase)
log something that happened.
Arg
phrase (str): statement to log
Notes
The first time phrase is passed the start time is saved. The second time the phrase is logged, the elapsed time is written
lraise(message)
log an exception, close the log file, then raise the exception
Arg
phrase (str): exception statement to log and raise
statement(phrase)
log a one-time statement
Arg
phrase (str): statement to log
warn(message)
write a warning to the log file.
Arg
phrase (str): warning statement to log
OrdinaryKrigeOrg
Bases: object
Ordinary Kriging using Pandas and Numpy.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
geostruct
|
`GeoStruct`
|
a pyemu.geostats.GeoStruct to use for the kriging |
required |
point_data
|
`pandas.DataFrame`
|
the conditioning points to use for kriging.
|
required |
Note
if point_data is an str, then it is assumed to be a pilot points file
and is loaded as such using pyemu.pp_utils.pp_file_to_dataframe()
If zoned interpolation is used for grid-based interpolation, then
point_data must also contain a "zone" column
Example::
import pyemu
v = pyemu.utils.geostats.ExpVario(a=1000,contribution=1.0)
gs = pyemu.utils.geostats.GeoStruct(variograms=v,nugget=0.5)
pp_df = pyemu.pp_utils.pp_file_to_dataframe("hkpp.dat")
ok = pyemu.utils.geostats.OrdinaryKrige(gs,pp_df)
calc_factors(x, y, minpts_interp=1, maxpts_interp=20, search_radius=10000000000.0, verbose=False, pt_zone=None, forgive=False, num_threads=1, idx_vals=None, remove_negative_factors=True)
Calculate ordinary kriging factors (weights) for the points represented by arguments x and y
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
x
|
[`float`]
|
x-coordinates to calculate kriging factors for |
required |
y
|
([`float`]
|
y-coordinates to calculate kriging factors for |
required |
minpts_interp
|
`int`
|
minimum number of point_data entries to use for interpolation at
a given x,y interplation point. interpolation points with less
than |
1
|
maxpts_interp
|
`int`
|
maximum number of point_data entries to use for interpolation at
a given x,y interpolation point. A larger |
20
|
search_radius
|
`float`
|
the size of the region around a given x,y
interpolation point to search for |
10000000000.0
|
verbose
|
`bool`
|
a flag to echo process to stdout during the interpolatino process. Default is False |
False
|
pt_zone
|
int
|
Optional zone value to slice point_data for zoned factor calcs. |
None
|
forgive
|
`bool`
|
flag to continue if inversion of the kriging matrix fails at one or more
interpolation points. Inversion usually fails if the kriging matrix is singular,
resulting from |
False
|
num_threads
|
`int`
|
number of multiprocessing workers to use to try to speed up kriging in python. Default is 1. |
1
|
idx_vals
|
iterable of `int`
|
optional index values to use in the interpolation dataframe. This is used to set the proper node number in the factors file for unstructured grids. |
None
|
remove_negative_factors
|
`bool`
|
option to remove negative Kriging factors, following the method of Deutsch (1996) https://doi.org/10.1016/0098-3004(96)00005-2. Default is True |
True
|
Returns:
pandas.DataFrame: a dataframe with information summarizing the ordinary kriging
process for each interpolation points
Note
this method calls either OrdinaryKrige.calc_factors_org() or
OrdinaryKrige.calc_factors_mp() depending on the value of num_threads
Example::
v = pyemu.utils.geostats.ExpVario(a=1000,contribution=1.0)
gs = pyemu.utils.geostats.GeoStruct(variograms=v,nugget=0.5)
pp_df = pyemu.pp_utils.pp_file_to_dataframe("hkpp.dat")
ok = pyemu.utils.geostats.OrdinaryKrige(gs,pp_df)
x = np.arange(100)
y = np.ones_like(x)
zone_array = y.copy()
zone_array[:zone_array.shape[0]/2] = 2
# only calc factors for the points in zone 1
ok.calc_factors(x,y,pt_zone=1)
ok.to_grid_factors_file("zone_1.fac",ncol=x.shape[0])
calc_factors_grid(spatial_reference, zone_array=None, minpts_interp=1, maxpts_interp=20, search_radius=10000000000.0, verbose=False, var_filename=None, forgive=False, num_threads=1)
calculate kriging factors (weights) for a structured grid.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
`flopy.ModelGrid`)
|
a spatial
reference that describes the orientation and
spatial projection of the model grid. Needs attributes:
[ |
required | |
zone_array
|
`numpy.ndarray`
|
an integer array of zones to use for kriging.
If not None, then |
None
|
minpts_interp
|
`int`
|
minimum number of |
1
|
verbose
|
( |
required | |
var_filename
|
Path - like
|
a filename to save the kriging variance for each interpolated grid node. Default is None. |
None
|
forgive
|
`bool`
|
flag to continue if inversion of the kriging matrix fails at one or more
grid nodes. Inversion usually fails if the kriging matrix is singular,
resulting from |
False
|
num_threads
|
`int`
|
number of multiprocessing workers to use to try to speed up kriging in python. Default is 1. |
1
|
Returns:
pandas.DataFrame: a dataframe with information summarizing the ordinary kriging
process for each grid node. Not returned if try_use_ppu is True.
Note
this method calls OrdinaryKrige.calc_factors() this method is the main entry point for grid-based kriging factor generation
Example::
import flopy
v = pyemu.utils.geostats.ExpVario(a=1000,contribution=1.0)
gs = pyemu.utils.geostats.GeoStruct(variograms=v,nugget=0.5)
pp_df = pyemu.pp_utils.pp_file_to_dataframe("hkpp.dat")
ok = pyemu.utils.geostats.OrdinaryKrige(gs,pp_df)
m = flopy.modflow.Modflow.load("mymodel.nam")
df = ok.calc_factors_grid(m.sr,zone_array=m.bas6.ibound[0].array,
var_filename="ok_var.dat")
ok.to_grid_factor_file("factors.dat")
check_point_data_dist(rectify=False)
check for point_data entries that are closer than EPSILON distance - this will cause a singular kriging matrix.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
rectify
|
`bool`
|
flag to fix the problems with point_data by dropping additional points that are closer than EPSILON distance. Default is False |
False
|
Note
this method will issue warnings for points that are closer than EPSILON distance
to_grid_factors_file(filename, points_file='points.junk', zone_file='zone.junk', ncol=None)
write a PEST-style factors file. This file can be used with the fac2real() method to write an interpolated structured or unstructured array
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
filename
|
path - like
|
factor filename |
required |
points_file
|
`str`
|
points filename to add to the header of the factors file. This is not used by the fac2real() method. Default is "points.junk" |
'points.junk'
|
zone_file
|
`str`
|
zone filename to add to the header of the factors file. This is not used by the fac2real() method. Default is "zone.junk" |
'zone.junk'
|
Note
this method should be called after OrdinaryKrige.calc_factors_grid() for structured models or after OrdinaryKrige.calc_factors() for unstructured models.
OrdinaryKrigePPU
Bases: object
Ordinary Kriging using Pypestutils PPU for factor calculation.
Off-loading the much of the computational effort to Pypestutils.
__init__(geostruct, point_data, express=False, auto=False)
Initialize OrdinaryKrigePPU object.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
geostruct None
|
Dummy argument to match OrdinaryKrige signature. |
required | |
point_data
|
`pandas.DataFrame`
|
the conditioning points to use for kriging.
|
required |
auto
|
bool
|
if set to true, use pypestutil's auto krige function, which automatically estimates search radius and variogram parameters based on pilot point density |
False
|
Note
if point_data is an str, then it is assumed to be a pilot points file
and is loaded as such using pyemu.pp_utils.pp_file_to_dataframe()
calc_factors(targets, geostruct, fac_fname='factors.dat', zone_array=None, **kwargs)
PPU based calculation of interpolation factors.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
targets
|
multiple
|
Object from which x and y coordinates can be extracted. This can be a flopy modelgrid object with (xcellcenters, and ycellcenters) attribs. Or it can be a pyemu spatial reference with (xcentergrid and ycentergrid) attribs. Or a DataFrame with ('x', 'y') columns. Or a {'node':(x,y)} dict. Note if type(targets)==dict, the keys will e sorted, zone array will be expected to be in the order of the sorted keys and the output will also be in that order. |
required |
geostruct
|
(Vario2d, Geostructure, dict or path - like)
|
If dict, should have keys 'corrlen' (required), 'aniso', 'bearing', 'vartype'. vartype is an integer with 1 for Spherical, 2 for Exponential (default) and 3 for Gaussian. |
required |
fac_fname
|
path - like
|
Filename to save binary factors file |
'factors.dat'
|
zone_array
|
array - like
|
|
None
|
**kwargs
|
dict
|
Additional keyword args for passing to factor calculations. e.g.: 'minpts_interp', 'maxpts_interp', 'search_dist' and 'search_radius' |
{}
|
Returns:
PortManager
Bases: object
Cross-platform port manager for parallel processes.
__init__(port_range=(4004, 4999), lock_dir=None, max_retries=50, lock_timeout=5, log_level=logging.INFO)
Initialize the port manager. Args: port_range: Tuple of (min_port, max_port) to search within lock_dir: Directory to store lock files (default: system temp dir) max_retries: Maximum attempts to find an available port lock_timeout: Time in seconds after which a lock is considered stale
get_available_port()
Find and reserve an available port. Returns: An available port number. Raises: RuntimeError: If no available port can be found after max_retries.
reserved_port()
Context manager that reserves a port and releases it after use.
PstFrom
Bases: object
construct high-dimensional PEST(++) interfaces with all the bells and whistles
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
original_d
|
`str` or Path
|
the path to a complete set of model input and output files |
required |
new_d
|
`str` or Path
|
the path to where the model files and PEST interface files will be copied/built |
required |
longnames
|
`bool`
|
flag to use longer-than-PEST-likes parameter and observation names. Default is True |
True
|
remove_existing
|
`bool`
|
flag to destroy any existing files and folders
in |
False
|
spatial_reference
|
varies
|
an object that facilitates geo-locating model cells based on index. Default is None |
None
|
zero_based
|
`bool`
|
flag if the model uses zero-based indices, Default is True |
True
|
start_datetime
|
`str` or Timestamp
|
a string that can be case to a datatime instance the represents the starting datetime of the model |
None
|
tpl_subfolder
|
`str`
|
option to write template files to a subfolder
within |
None
|
chunk_len
|
`int`
|
the size of each "chunk" of files to spawn a |
50
|
echo
|
`bool`
|
flag to echo logger messages to the screen. Default is True |
True
|
pp_solve_num_threads
|
`int`
|
number of threads to use for the pyemu very-slow kriging solve for pilot-point type parameters. Default is 10. |
10
|
Note
This is the way...
Example::
pf = PstFrom("path_to_model_files","new_dir_with_pest_stuff",start_datetime="1-1-2020")
pf.add_parameters("hk.dat")
pf.add_observations("heads.csv")
pf.build_pst("pest.pst")
pe = pf.draw(100)
pe.to_csv("prior.csv")
parfile_relations
property
build up a container of parameter file information. Called programmatically...
add_observations(filename, insfile=None, index_cols=None, use_cols=None, use_rows=None, prefix='', ofile_skip=None, ofile_sep=None, rebuild_pst=False, obsgp=None, zone_array=None, includes_header=True)
Add values in output files as observations to PstFrom object
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
filename
|
`str`
|
model output file name(s) to set up
as observations. By default filename should give relative
location from top level of pest template directory
( |
required |
insfile
|
`str`
|
desired instructions file filename |
None
|
index_cols
|
`list`-like or `int`
|
columns to denote are indices for obs |
None
|
use_cols
|
`list`-like or `int`
|
columns to set up as obs. If None,
and |
None
|
use_rows
|
`list`-like or `int`
|
select only specific row of file for obs |
None
|
prefix
|
`str`
|
prefix for obsnmes |
''
|
ofile_skip
|
`int`
|
number of lines to skip in model output file |
None
|
ofile_sep
|
`str`
|
delimiter in output file.
If |
None
|
rebuild_pst
|
`bool`
|
(Re)Construct PstFrom.pst object after adding new obs |
False
|
obsgp
|
`str` of `list`-like
|
observation group name(s). If type
|
None
|
zone_array
|
`np.ndarray`
|
array defining spatial limits or zones for array-style observations. Default is None |
None
|
includes_header
|
`bool`
|
flag indicating that the list-style file includes a header row. Default is True. |
True
|
Returns:
| Type | Description |
|---|---|
|
|
Note
This is the main entry for adding observations to the pest interface
If index_cols and use_cols are both None, then it is assumed that
array-style observations are being requested. In this case,
filenames must be only one filename.
zone_array is only used for array-style observations. Zone values
less than or equal to zero are skipped (using the "dum" option)
Example::
# setup observations for the 2nd thru 5th columns of the csv file
# using the first column as the index
df = pf.add_observations("heads.csv",index_col=0,use_cols=[1,2,3,4],
ofile_sep=",")
# add array-style observations, skipping model cells with an ibound
# value less than or equal to zero
df = pf.add_observations("conce_array.dat,index_col=None,use_cols=None,
zone_array=ibound)
add_observations_from_ins(ins_file, out_file=None, pst_path=None, inschek=True)
add new observations to a control file from an existing instruction file
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
ins_file
|
`str`
|
instruction file with exclusively new
observation names. N.B. if |
required |
out_file
|
`str`
|
model output file. If None, then
ins_file.replace(".ins","") is used. Default is None.
If |
None
|
pst_path
|
`str`
|
the path to append to the instruction file and
out file in the control file. If not None, then any existing
path in front of the template or ins file is split off and
pst_path is prepended. If python is being run in a directory
other than where the control file will reside, it is useful
to pass |
None
|
inschek
|
`bool`
|
flag to try to process the existing output file
using the |
True
|
Returns:
| Type | Description |
|---|---|
|
|
Note
populates the new observation information with default values
Example::
pf = pyemu.PstFrom("temp","template")
pf.add_observations_from_ins(os.path.join("template","new_obs.dat.ins"),
pst_path=".")
add_parameters(filenames, par_type, zone_array=None, dist_type='gaussian', sigma_range=4.0, upper_bound=None, lower_bound=None, transform=None, par_name_base='p', index_cols=None, use_cols=None, use_rows=None, pargp=None, pp_space=None, use_pp_zones=None, num_eig_kl=100, spatial_reference=None, geostruct=None, datetime=None, mfile_fmt='free', mfile_skip=None, mfile_sep=None, ult_ubound=None, ult_lbound=None, rebuild_pst=False, alt_inst_str='inst', comment_char=None, par_style='multiplier', initial_value=None, pp_options=None, apply_order=999, apply_function=None)
Add list or array style model input files to PstFrom object. This method is the main entry point for adding parameters to the pest interface
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
filenames
|
`str`
|
Model input filenames to parameterize. By default filename should give relative
location from top level of pest template directory
( |
required |
par_type
|
`str`
|
One of |
required |
zone_array
|
`np.ndarray`
|
array defining spatial limits or zones for parameterization. |
None
|
dist_type
|
not yet implemented # TODO |
'gaussian'
|
|
sigma_range
|
not yet implemented # TODO |
4.0
|
|
upper_bound
|
`float`
|
PEST parameter upper bound. If |
None
|
lower_bound
|
`float`
|
PEST parameter lower bound. If |
None
|
transform
|
`str`
|
PEST parameter transformation. Must be either "log","none" or "fixed. The "tied" transform
must be used after calling |
None
|
par_name_base
|
`str` or `list`-like
|
basename for parameters that
are set up. If parameter file is tabular list-style file
( |
'p'
|
index_cols
|
`list`-like
|
if not None, will attempt to parameterize
expecting a tabular-style model input file. |
None
|
use_cols
|
`list`-like or `int`
|
for tabular-style model input file, defines the columns to be parameterised |
None
|
use_rows
|
`list` or `tuple`
|
Setup parameters for
only specific rows in list-style model input file.
Action is dependent on the the dimensions of use_rows.
If ndim(use_rows) < 2: use_rows is assumed to represent the row number, index slicer (equiv df.iloc),
for all passed files (after headers stripped). So use_rows=[0,3,5], will parameterise the
1st, 4th and 6th rows of each passed list-like file.
If ndim(use_rows) = 2: use_rows represent the index value to parameterise according to index_cols.
e.g. [(3,5,6)] or [[3,5,6]] would attempt to set parameters where the model file
values for 3 |
None
|
pargp
|
`str`
|
Parameter group to assign pars to. This is PESTs
pargp but is also used to gather correlated parameters set up
using multiple |
None
|
pp_space
|
`float`, `int`,`str` or `pd.DataFrame`
|
Spatial pilot point information. DEPRECATED : use pp_options['pp_space'] instead. |
None
|
use_pp_zones
|
`bool`
|
a flag to use the greater-than-zero values DEPRECATED : use pp_options['use_pp_zones'] instead. |
None
|
num_eig_kl
|
TODO - implement with KL pars |
100
|
|
spatial_reference
|
`pyemu.helpers.SpatialReference`
|
If different
spatial reference required for pilotpoint setup.
If None spatial reference passed to |
None
|
geostruct
|
`pyemu.geostats.GeoStruct()`
|
For specifying correlation geostruct for pilot-points and par covariance. |
None
|
datetime
|
`str`
|
optional %Y%m%d string or datetime object for setting up temporally correlated pars. Where datetime is passed correlation axis for pars will be set to timedelta. |
None
|
mfile_fmt
|
`str`
|
format of model input file - this will be preserved |
'free'
|
mfile_skip
|
`int` or `str`
|
header in model input file to skip
when reading and reapply when writing. Can optionally be |
None
|
mfile_sep
|
`str`
|
separator/delimiter in model input file.
If None, separator will be interpreted from file name extension.
|
None
|
ult_ubound
|
`float`
|
Ultimate upper bound for model input parameter once all mults are applied - ensure physical model par vals. If not passed, it is set to 1.0e+30 |
None
|
ult_lbound
|
`float`
|
Ultimate lower bound for model input parameter once all mults are applied. If not passed, it is set to 1.0e-30 for log transform and -1.0e+30 for non-log transform |
None
|
rebuild_pst
|
`bool`
|
(Re)Construct PstFrom.pst object after adding new parameters |
False
|
alt_inst_str
|
`str`
|
Alternative to default |
'inst'
|
comment_char
|
`str`
|
option to skip comment lines in model file.
This is not additive with |
None
|
par_style
|
`str`
|
either "m"/"mult"/"multiplier", "a"/"add"/"addend", or "d"/"direct" where the former sets up a multiplier and addend parameters process against the existing model input array and the former sets up a template file to write the model input file directly. Default is "multiplier". |
'multiplier'
|
initial_value
|
`float`
|
the value to set for the |
None
|
pp_options
|
`dict`
|
Various options to control pilot point options. Can include:
If If If If If
|
None
|
apply_order
|
`int`
|
the optional order to process this set of parameters at runtime. Default is 999. |
999
|
apply_function
|
`str`
|
a python function to call during the apply process at runtime. Default is None. |
None
|
Returns:
pandas.DataFrame: dataframe with info for new parameters
Example::
# setup grid-scale direct parameters for an array of numbers
df = pf.add_parameters("hk.dat",par_type="grid",par_style="direct")
# setup pilot point multiplier parameters for an array of numbers
# with a pilot point being set in every 5th active model cell
df = pf.add_parameters("recharge.dat",par_type="pilotpoint",pp_space=5,
zone_array="ibound.dat")
# setup a single multiplier parameter for the 4th column
# of a column format (list/tabular type) file
df = pf.add_parameters("wel_list_1.dat",par_type="constant",
index_cols=[0,1,2],use_cols=[3])
add_py_function(file_name, call_str=None, is_pre_cmd=True, function_name=None)
add a python function to the forward run script
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
file_name
|
`str` or `callable`
|
a python source file or function/callable |
required |
call_str
|
`str`
|
the call string for python function in
|
None
|
is_pre_cmd
|
`bool` or `None`
|
flag to include |
True
|
function_name
|
`str`
|
DEPRECATED, used |
None
|
Returns: None
Note
call_str is expected to reference standalone a function
that contains all the imports it needs or these imports
should have been added to the forward run script through the
PstFrom.extra_py_imports list.
This function adds the call_str call to the forward
run script (either as a pre or post command or function not
directly called by main). It is up to users
to make sure call_str is a valid python function call
that includes the parentheses and requisite arguments
This function expects "def " + function_name to be flushed left
at the outer most indentation level
Example::
pf = PstFrom()
# add the function "mult_well_function" from the script file "preprocess.py" as a
# command to run before the model is run
pf.add_py_function("preprocess.py",
"mult_well_function(arg1='userarg')",
is_pre_cmd = True)
# add the post processor function "made_it_good" from the script file "post_processors.py"
pf.add_py_function("post_processors.py","make_it_good()",is_pre_cmd=False)
# add the function "another_func" from the script file "utils.py" as a
# function not called by main
pf.add_py_function("utils.py","another_func()",is_pre_cmd=None)
build_prior(fmt='ascii', filename=None, droptol=None, chunk=None, sigma_range=6)
Build the prior parameter covariance matrix
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
fmt
|
`str`
|
the file format to save to. Default is "ASCII", can be "binary", "coo", or "none" |
'ascii'
|
filename
|
`str`
|
the filename to save the cov to |
None
|
droptol
|
`float`
|
absolute value of prior cov entries that are smaller than |
None
|
chunk
|
`int`
|
number of entries to write to binary/coo at once. Default is None (write all elements at once |
None
|
sigma_range
|
`int`
|
number of standard deviations represented by parameter bounds. Default is 6 (99% confidence). 4 would be approximately 95% confidence bounds |
6
|
Returns:
| Type | Description |
|---|---|
|
|
Note
This method processes parameters by group names
For really large numbers of parameters (>30K), this method
will cause memory errors. Luckily, in most cases, users
only want this matrix to generate a prior parameter ensemble
and the PstFrom.draw() is a better choice...
build_pst(filename=None, update=False, version=1)
Build control file from i/o files in PstFrom object.
Warning: This builds a pest control file from scratch, overwriting
anything already in self.pst object and anything already written to filename
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
filename
|
`str`
|
the filename to save the control file to.
If None, the name is formed from the |
None
|
update
|
`bool`) or (str
|
flag to add to existing Pst object and rewrite. If string {'pars', 'obs'} just update respective components of Pst. Default is False - build from PstFrom components. |
False
|
version
|
`int`
|
control file version to write, Default is 1. If None, option to not write pst to file at pst_build() call -- handy when control file is huge pst object will be modified again before running. |
1
|
Note:
This builds a pest control file from scratch, overwriting anything already
in self.pst object and anything already written to filename
The new pest control file is assigned an NOPTMAX value of 0
draw(num_reals=100, sigma_range=6, use_specsim=False, scale_offset=True, rng=None)
Draw a parameter ensemble from the distribution implied by the initial parameter values in the control file and the prior parameter covariance matrix.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
num_reals
|
`int`
|
the number of realizations to draw |
100
|
sigma_range
|
`int`
|
number of standard deviations represented by parameter bounds. Default is 6 (99% confidence). 4 would be approximately 95% confidence bounds |
6
|
use_specsim
|
`bool`
|
flag to use spectral simulation for grid-scale pars (highly recommended). Default is False |
False
|
scale_offset
|
`bool`
|
flag to apply scale and offset to parameter bounds before calculating prior variance. Dfault is True. If you are using non-default scale and/or offset and you get an exception during draw, try changing this value to False. |
True
|
rng
|
`numpy.random.RandomState`
|
random number generator if not using default from pyemu.en |
None
|
Returns:
| Type | Description |
|---|---|
|
|
Note
This method draws by parameter group
If you are using grid-style parameters, please use spectral simulation (use_specsim=True)
initialize_spatial_reference()
process the spatial reference argument. Called programmatically
parse_kij_args(args, kwargs)
parse args into kij indices. Called programmatically
write_forward_run()
write the forward run script. Called by build_pst()
PstFromFlopyModel
Bases: object
Deprecated Class. Try pyemu.utils.PstFrom() instead.
A legacy version can be accessed from pyemu.legacy, if desperate.
PyPestWorker
Bases: object
a pure python worker for pest++. the pest++ master doesnt even know...
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
pst
|
str or Pst
|
something about a control file |
required |
host
|
str
|
master hostname or IPv4 address |
required |
port
|
int
|
port number that the master is listening on |
required |
timeout
|
float
|
number of seconds to sleep at different points in the process. |
0.25
|
verbose
|
bool
|
flag to echo what's going on to stdout |
True
|
socket_timeout
|
float
|
number of seconds that the socket should wait before giving up. generally, this can be a big number... |
None
|
RunStor
Bases: object
__init__(filename)
access to the pest++ run storage file. Can be used to support usage of the pest++ external run manager
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
filename
|
str
|
the name of a pest++ run storage file (ie pest.rns) |
required |
Example::
rns = pyemu.helpers.RunStor("pest.rns")
# get a dataframe of both parameter and observation
# values for all runs in the file.
df = rns.get_data()
# a function that processes the runs stored
# in df; the observation values in df should
# be updated "in place"
failed_idxs = process_my_model_runs(df)
#mark the failed runs
df.run_status.iloc[failed_idxs] = -99
#update the parameter and observation values
# stored in the rns file
rns.update(df)
file_info(filename)
staticmethod
get information about whats stored in the file
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
filename
|
str
|
the run storage file name |
required |
Returns:
| Name | Type | Description |
|---|---|---|
header |
dict
|
the file header |
par_names |
list
|
parameter names ordered as they occur in the file |
obs_names |
list
|
observation names ordered as they occur in the file |
get_data()
read the contents of the file into a dataframe
Returns:
| Name | Type | Description |
|---|---|---|
df |
DataFrame
|
the file contents |
header_dtype()
staticmethod
the numpy header dtype of the file
status_str(r_status)
staticmethod
update(df)
update the parameter and observation values
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
df (pd.DataFrame)
|
file contents to update. Should be derived from the get_data() method to maintain dtypes and required information. The parameter and observation values for each run are updated "in place" in the file, as is the run_status int flag; this flag should be set to -99 for any runs that "failed". |
required |
SpatialReference
Bases: object
a class to locate a structured model grid in x-y space. Lifted wholesale from Flopy, and preserved here... ...maybe slightly over-engineered for here
Args:
delr (`numpy ndarray`): the model discretization delr vector (An array of spacings along a row)
delc (`numpy ndarray`): the model discretization delc vector (An array of spacings along a column)
lenuni (`int`): the length units flag from the discretization package. Default is 2.
xul (`float`): The x coordinate of the upper left corner of the grid. Enter either xul and yul or xll and yll.
yul (`float`): The y coordinate of the upper left corner of the grid. Enter either xul and yul or xll and yll.
xll (`float`): The x coordinate of the lower left corner of the grid. Enter either xul and yul or xll and yll.
yll (`float`): The y coordinate of the lower left corner of the grid. Enter either xul and yul or xll and yll.
rotation (`float`): The counter-clockwise rotation (in degrees) of the grid
proj4_str (`str`): a PROJ4 string that identifies the grid in space. warning: case sensitive!
units (`string`): Units for the grid. Must be either "feet" or "meters"
epsg (`int`): EPSG code that identifies the grid in space. Can be used in lieu of
proj4. PROJ4 attribute will auto-populate if there is an internet
connection(via get_proj4 method).
See https://www.epsg-registry.org/ or spatialreference.org
length_multiplier (`float`): multiplier to convert model units to spatial reference units.
delr and delc above will be multiplied by this value. (default=1.)
bounds
property
Return bounding box in shapely order.
length_multiplier
property
Attempt to identify multiplier for converting from model units to sr units, defaulting to 1.
vertices
property
Returns a list of vertices for
get_extent()
Get the extent of the rotated and offset grid
get_grid_lines()
Get the grid lines as a list
get_ij(x, y)
Return the row and column of a point or sequence of points in real-world coordinates.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
x
|
`float`
|
scalar or sequence of x coordinates |
required |
y
|
`float`
|
scalar or sequence of y coordinates |
required |
Returns:
| Type | Description |
|---|---|
|
tuple of |
|
|
|
|
get_vertices(i, j)
Get vertices for a single cell or sequence if i, j locations.
get_xcenter_array()
Return a numpy one-dimensional float array that has the cell center x coordinate for every column in the grid in model space - not offset or rotated.
get_xedge_array()
Return a numpy one-dimensional float array that has the cell edge x coordinates for every column in the grid in model space - not offset or rotated. Array is of size (ncol + 1)
get_ycenter_array()
Return a numpy one-dimensional float array that has the cell center x coordinate for every row in the grid in model space - not offset of rotated.
get_yedge_array()
Return a numpy one-dimensional float array that has the cell edge y coordinates for every row in the grid in model space - not offset or rotated. Array is of size (nrow + 1)
load(namefile=None, reffile='usgs.model.reference')
staticmethod
Attempts to load spatial reference information from the following files (in order): 1) usgs.model.reference 2) NAM file (header comment) 3) SpatialReference.default dictionary
read_usgs_model_reference_file(reffile='usgs.model.reference')
staticmethod
read spatial reference info from the usgs.model.reference file https://water.usgs.gov/ogw/policy/gw-model/modelers-setup.html
rotate(x, y, theta, xorigin=0.0, yorigin=0.0)
staticmethod
Given x and y array-like values calculate the rotation about an arbitrary origin and then return the rotated coordinates. theta is in degrees.
set_spatialreference(xul=None, yul=None, xll=None, yll=None, rotation=0.0)
set spatial reference - can be called from model instance
transform(x, y, inverse=False)
Given x and y array-like values, apply rotation, scale and offset, to convert them from model coordinates to real-world coordinates.
write_gridspec(filename)
write a PEST-style grid specification file
SpecSim2d
Bases: object
2-D unconditional spectral simulation for regular grids
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
delx
|
`numpy.ndarray`
|
a 1-D array of x-dimension cell centers (or leading/trailing edges). Only the distance between points is important |
required |
dely
|
`numpy.ndarray`
|
a 1-D array of y-dimension cell centers (or leading/trailing edges). Only the distance between points is important |
required |
geostruct
|
`pyemu.geostats.Geostruct`
|
geostatistical structure instance |
required |
Example::
v = pyemu.utils.geostats.ExpVario(a=100,contribution=1.0)
gs = pyemu.utils.geostats.GeoStruct(variograms=v,nugget=0.5)
delx,dely = np.ones(150), np.ones(50)
ss = pyemu.utils.geostats.SpecSim2d(delx,dely,gs)
arr = np.squeeze(ss.draw_arrays(num_reals=1))*.05 + .08
plt.imshow(arr)
plt.colorbar(shrink=.40)
draw_arrays(num_reals=1, mean_value=1.0, rng=None)
draw realizations
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
num_reals
|
`int`
|
number of realizations to generate |
1
|
mean_value
|
`float`
|
the mean value of the realizations |
1.0
|
rng
|
`numpy.random.RandomState`
|
random number generator if not using default from pyemu.en |
None
|
Returns:
| Type | Description |
|---|---|
|
|
|
|
is (num_reals,self.dely.shape[0],self.delx.shape[0]) |
Note:
log transformation is respected and the returned reals array is
in linear space
draw_conditional(seed, obs_points, sg, base_values_file, local=True, factors_file=None, num_reals=1, mean_value=1.0, R_factor=1.0)
Generate a conditional, correlated random field using the Spec2dSim object, a set of observation points, and a factors file.
The conditional field is made by generating an unconditional correlated random
field that captures the covariance in the variogram and conditioning it by kriging
a second surface using the value of the random field as observations.
This second conditioning surface provides an estimate of uncertainty (kriging error)
away from the observation points. At the observation points, the kriged surface is
equal to (less nugget effects) the observation. The conditioned correlated field
is then generated using: T(x) = Z(x) + [S(x) − S∗(x)]
where T(x) is the conditioned simulation, Z(x) is a kriging estimator of the
unknown field, S(x) is an unconditioned random field with the same covariance
structure as the desired field, and S∗(x) is a kriging estimate of the unconditioned
random field using its values at the observation points (pilot points).
[S(x) − S∗(x)] is an estimate of the kriging error.
This approach makes T(x) match the observed values at the observation points
(x_a, y_z), T(a) = Z(a), and have a structure away from the observation points that
follows the variogram used to generate Z, S, and S∗.
Chiles, J-P, and Delfiner, P., Geostatistics- Modeling Spatial Uncertainty: Wiley,
London, 695 p.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
seed
|
`int`
|
integer used for random seed. If seed is used as a PEST parameter, then passing the same value for seed will yield the same conditioned random fields. This allows runs to be recreated given an ensemble of seeds. |
required |
obs_points
|
`str` or `dataframe`
|
locations for observation points. Either filename in pyemupilot point file format: ["name","x","y","zone","parval1"] ora dataframe with these columns. Note that parval1 is not used. |
required |
base_values_file
|
`str`
|
filename containing 2d array with the base parameter values from which the random field will depart (Z(x)) above. Values of Z(x) are used for conditioning, not parval1 in the observation point file. |
required |
factors_file
|
`str`
|
name of the factors file generated using the locations of the observation points and the target grid. If None this file will be generated and called conditional_factors.dat; but this is a slow step and should not generally be called for every simulation. |
None
|
sg
|
flopy StructuredGrid object |
required | |
local
|
`boolean`
|
whether coordinates in obs_points are in local (model) or map coordinates |
True
|
num_reals
|
`int`
|
number of realizations to generate |
1
|
mean_value
|
`float`
|
the mean value of the realizations |
1.0
|
R_factor
|
`float`
|
a factor to scale the field, sometimes the variation from the geostruct parameters is larger or smaller than desired. |
1.0
|
Returns:
| Type | Description |
|---|---|
|
|
Note:
log transformation is respected and the returned reals
array is in arithmetic space
grid_is_regular(delx, dely, tol=0.0001)
staticmethod
check that a grid is regular using delx and dely vectors
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
delx
|
|
required | |
dely
|
|
required | |
tol
|
|
required |
Returns:
| Type | Description |
|---|---|
|
|
grid_par_ensemble_helper(pst, gr_df, num_reals, sigma_range=6, logger=None, rng=None)
wrapper around SpecSim2d.draw() designed to support PstFromFlopy
and PstFrom grid-based parameters
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
pst
|
`pyemu.Pst`
|
a control file instance |
required |
gr_df
|
`pandas.DataFrame`
|
a dataframe listing |
required |
num_reals
|
`int`
|
number of realizations to generate |
required |
sigma_range
|
`float` (optional)
|
number of standard deviations implied by parameter bounds in control file. Default is 6 |
6
|
logger
|
`pyemu.Logger` (optional)
|
a logger instance for logging |
None
|
rng
|
`numpy.random.RandomState` (optional)
|
random number generator if not using default from pyemu.en |
None
|
Returns:
| Type | Description |
|---|---|
|
|
|
|
realized grid-parameter values |
Note
the method processes each unique pargp value in gr_df and resets the sill of self.geostruct by
the maximum bounds-implied variance of each pargp. This method makes repeated calls to
self.initialize() to deal with the geostruct changes.
initialize()
prepare for spectral simulation.
Note
initialize() prepares for simulation by undertaking
the fast FFT on the wave number matrix and should be called
if the SpecSim2d.geostruct is changed.
This method is called by the constructor.
SphVario
Bases: Vario2d
Spherical variogram derived type
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
contribution
|
float
|
sill of the variogram |
required |
a
|
`float`
|
(practical) range of correlation |
required |
anisotropy
|
`float`
|
Anisotropy ratio. Default is 1.0 |
1.0
|
bearing
|
( |
required | |
name
|
`str`, optinoal
|
name of the variogram. Default is "var1" |
'var1'
|
Example::
v = pyemu.utils.geostats.SphVario(a=1000,contribution=1.0)
bearing_rads
property
get the bearing of the Vario2d in radians
Returns:
| Type | Description |
|---|---|
|
|
rotation_coefs
property
get the rotation coefficients in radians
Returns:
| Type | Description |
|---|---|
|
[ |
__str__()
get the str representation of Vario2d
Returns:
| Type | Description |
|---|---|
|
|
covariance(pt0, pt1)
get the covariance between two points implied by Vario2d
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
pt0
|
([ |
required | |
pt1
|
([ |
required |
Returns:
| Type | Description |
|---|---|
|
|
covariance_matrix(x, y, names=None, cov=None)
build a pyemu.Cov instance implied by Vario2d
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
x
|
[`float`]
|
x-coordinate locations |
required |
y
|
[`float`]
|
y-coordinate locations |
required |
names
|
[`str`]
|
names of locations. If None, cov must not be None |
None
|
cov
|
`pyemu.Cov`
|
an existing Cov instance. Vario2d contribution is added to cov |
None
|
Returns:
| Type | Description |
|---|---|
|
|
Note
either names or cov must not be None.
covariance_points(x0, y0, xother, yother)
get the covariance between base point (x0,y0) and
other points xother,yother implied by Vario2d
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
x0
|
`float`
|
x-coordinate |
required |
y0
|
`float`
|
y-coordinate |
required |
xother
|
[`float`]
|
x-coordinates of other points |
required |
yother
|
[`float`]
|
y-coordinates of other points |
required |
Returns:
| Type | Description |
|---|---|
|
|
|
|
points contained in xother, yother. len(cov) = len(xother) = |
|
|
len(yother) |
inv_h(h)
the inverse of the h_function. Used for plotting
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
h
|
`float`
|
the value of h_function to invert |
required |
Returns:
| Type | Description |
|---|---|
|
|
plot(**kwargs)
get a cheap plot of the Vario2d
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
**kwargs
|
`dict`
|
keyword arguments to use for plotting |
{}
|
Returns:
| Type | Description |
|---|---|
|
|
Note
optional arguments in kwargs include
"ax" (existing matplotlib.pyplot.axis). Other
kwargs are passed to matplotlib.pyplot.plot()
to_struct_file(f)
write the Vario2d to a PEST-style structure file
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
f
|
`str`
|
filename to write to. |
required |
Trie
Regex::Trie in Python. Creates a Trie out of a list of words. The trie can be exported to a Regex pattern. The corresponding Regex should match much faster than a simple Regex union.
Vario2d
Bases: object
base class for 2-D variograms.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
contribution
|
float
|
sill of the variogram |
required |
a
|
`float`
|
(practical) range of correlation |
required |
anisotropy
|
`float`
|
Anisotropy ratio. Default is 1.0 |
1.0
|
bearing
|
( |
required | |
name
|
`str`, optinoal
|
name of the variogram. Default is "var1" |
'var1'
|
Note
This base class should not be instantiated directly as it does not implement an h_function() method.
bearing_rads
property
get the bearing of the Vario2d in radians
Returns:
| Type | Description |
|---|---|
|
|
rotation_coefs
property
get the rotation coefficients in radians
Returns:
| Type | Description |
|---|---|
|
[ |
__str__()
get the str representation of Vario2d
Returns:
| Type | Description |
|---|---|
|
|
covariance(pt0, pt1)
get the covariance between two points implied by Vario2d
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
pt0
|
([ |
required | |
pt1
|
([ |
required |
Returns:
| Type | Description |
|---|---|
|
|
covariance_matrix(x, y, names=None, cov=None)
build a pyemu.Cov instance implied by Vario2d
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
x
|
[`float`]
|
x-coordinate locations |
required |
y
|
[`float`]
|
y-coordinate locations |
required |
names
|
[`str`]
|
names of locations. If None, cov must not be None |
None
|
cov
|
`pyemu.Cov`
|
an existing Cov instance. Vario2d contribution is added to cov |
None
|
Returns:
| Type | Description |
|---|---|
|
|
Note
either names or cov must not be None.
covariance_points(x0, y0, xother, yother)
get the covariance between base point (x0,y0) and
other points xother,yother implied by Vario2d
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
x0
|
`float`
|
x-coordinate |
required |
y0
|
`float`
|
y-coordinate |
required |
xother
|
[`float`]
|
x-coordinates of other points |
required |
yother
|
[`float`]
|
y-coordinates of other points |
required |
Returns:
| Type | Description |
|---|---|
|
|
|
|
points contained in xother, yother. len(cov) = len(xother) = |
|
|
len(yother) |
inv_h(h)
the inverse of the h_function. Used for plotting
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
h
|
`float`
|
the value of h_function to invert |
required |
Returns:
| Type | Description |
|---|---|
|
|
plot(**kwargs)
get a cheap plot of the Vario2d
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
**kwargs
|
`dict`
|
keyword arguments to use for plotting |
{}
|
Returns:
| Type | Description |
|---|---|
|
|
Note
optional arguments in kwargs include
"ax" (existing matplotlib.pyplot.axis). Other
kwargs are passed to matplotlib.pyplot.plot()
to_struct_file(f)
write the Vario2d to a PEST-style structure file
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
f
|
`str`
|
filename to write to. |
required |
add_phi_as_obs(pst_name, pst_path='.')
apply_array_pars(arr_par='arr_pars.csv', arr_par_file=None, chunk_len=50)
a function to apply array-based multiplier parameters.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
arr_par
|
`str` or `pandas.DataFrame`
|
if type |
'arr_pars.csv'
|
chunk_len (`int`)
|
the number of files to process per chunk with multiprocessing - applies to both fac2real and process_ input_files. Default is 50. |
required |
Note
Used to implement the parameterization constructed by PstFromFlopyModel during a forward run
This function should be added to the forward_run.py script but can be called on any correctly formatted csv
This function using multiprocessing, spawning one process for each
model input array (and optionally pp files). This speeds up
execution time considerably but means you need to make sure your
forward run script uses the proper multiprocessing idioms for
freeze support and main thread handling (PstFrom does this for you).
apply_gage_obs(return_obs_file=False)
apply the modflow gage obs post-processor
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
return_obs_file
|
`bool`
|
flag to return the processed
observation file. Default is |
False
|
Note
This is the companion function of gw_utils.setup_gage_obs()
apply_genericlist_pars(df, chunk_len=50)
a function to apply list style mult parameters
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
df
|
DataFrame
|
DataFrame that relates files containing multipliers to model input file names. Required columns include: {"model_file": file name of resulatant model input file, "org_file": file name of original file that multipliers act on, "fmt": format specifier for model input file (currently on 'free' supported), "sep": separator for model input file if 'free' formatted, "head_rows": Number of header rows to transfer from orig file to model file, "index_cols": list of columns (either indexes or strings) to be used to align mults, orig and model files, "use_cols": columns to mults act on, "upper_bound": ultimate upper bound for model input file parameter, "lower_bound": ultimate lower bound for model input file parameter} |
required |
chunk_len
|
`int`
|
number of chunks for each multiprocessing instance to handle. Default is 50. |
50
|
Note
This function is called programmatically during the PstFrom forward run process
apply_hds_obs(hds_file, inact_abs_val=1e+20, precision='single', text='head')
process a modflow head save file. A companion function to
gw_utils.setup_hds_obs() that is called during the forward run process
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
hds_file
|
`str`
|
a modflow head save filename. if hds_file ends with 'ucn', then the file is treated as a UcnFile type. |
required |
inact_abs_val
|
`float`
|
the value that marks the minimum and maximum
active value. values in the headsave file greater than |
1e+20
|
Returns:
pandas.DataFrame: a dataframe with extracted simulated values.
Note:
This is the companion function to gw_utils.setup_hds_obs().
apply_hds_timeseries(config_file=None, postprocess_inact=None)
process a modflow binary file using a previously written configuration file
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
config_file
|
`str`
|
configuration file written by |
None
|
postprocess_inact
|
`float`
|
Inactive value in heads/ucn file e.g. mt.btn.cinit. If |
None
|
Note
This is the companion function of gw_utils.setup_hds_timeseries().
apply_hfb_pars(par_file='hfb6_pars.csv')
a function to apply HFB multiplier parameters.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
par_file
|
`str`
|
the HFB parameter info file.
Default is |
'hfb6_pars.csv'
|
Note
This is the companion function to
gw_utils.write_hfb_zone_multipliers_template()
This is to account for the horrible HFB6 format that differs from other BCs making this a special case
Requires "hfb_pars.csv"
Should be added to the forward_run.py script
apply_list_and_array_pars(arr_par_file='mult2model_info.csv', chunk_len=50)
Apply multiplier parameters to list and array style model files
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
arr_par_file
|
str
|
|
'mult2model_info.csv'
|
chunk_len
|
`int`
|
the number of files to process per multiprocessing chunk in apply_array_pars(). default is 50. |
50
|
Returns:
Note
Used to implement the parameterization constructed by PstFrom during a forward run
Should be added to the forward_run.py script; added programmatically
by PstFrom.build_pst()
apply_mflist_budget_obs(list_filename, flx_filename='flux.dat', vol_filename='vol.dat', start_datetime='1-1-1970', times=None)
process a MODFLOW list file to extract flux and volume water budget entries.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
list_filename
|
`str`
|
path and name of the existing modflow list file |
required |
flx_filename
|
`str`
|
output filename that will contain the budget flux observations. Default is "flux.dat" |
'flux.dat'
|
vol_filename
|
`str`
|
output filename that will contain the budget volume observations. Default is "vol.dat" |
'vol.dat'
|
start_datetime
|
`str`
|
a string that can be parsed into a pandas.TimeStamp. This is used to give budget observations meaningful names. Default is "1-1-1970". |
'1-1-1970'
|
times
|
`np.ndarray`-like or `str`
|
An array of times to
extract from the budget dataframes returned by the flopy
MfListBudget(list_filename).get_dataframe() method. This can be
useful to ensure consistent observation times for PEST.
If type |
None
|
Note:
This is the companion function of gw_utils.setup_mflist_budget_obs().
Returns: tuple containing
- **pandas.DataFrame**: a dataframe with flux budget information
- **pandas.DataFrame**: a dataframe with cumulative budget information
apply_mtlist_budget_obs(list_filename, gw_filename='mtlist_gw.dat', sw_filename='mtlist_sw.dat', start_datetime='1-1-1970')
process an MT3D-USGS list file to extract mass budget entries.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
list_filename
|
`str`
|
the path and name of an existing MT3D-USGS list file |
required |
gw_filename
|
`str`
|
the name of the output file with gw mass budget information. Default is "mtlist_gw.dat" |
'mtlist_gw.dat'
|
sw_filename
|
`str`
|
the name of the output file with sw mass budget information. Default is "mtlist_sw.dat" |
'mtlist_sw.dat'
|
start_datatime
|
`str`
|
an str that can be cast to a pandas.TimeStamp. Used to give observations a meaningful name |
required |
Returns:
| Type | Description |
|---|---|
|
2-element tuple containing |
|
|
|
|
Note
This is the companion function of gw_utils.setup_mtlist_budget_obs().
apply_sfr_obs()
apply the sfr observation process
Returns:
| Type | Description |
|---|---|
|
pandas.DataFrame: a dataframe of aggregated sfr segment aquifer and outflow |
Note
This is the companion function of gw_utils.setup_sfr_obs().
Requires sfr_obs.config.
Writes sfr_out_file+".processed", where sfr_out_file is defined in "sfr_obs.config"
apply_sfr_parameters(seg_pars=True, reach_pars=False)
thin wrapper around gw_utils.apply_sfr_seg_parameters()
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
seg_pars
|
`bool`
|
flag to apply segment-based parameters. Default is True |
True
|
reach_pars
|
`bool`
|
flag to apply reach-based parameters. Default is False |
False
|
Returns:
| Type | Description |
|---|---|
|
flopy.modflow.ModflowSfr: the modified SFR package instance |
Note
Expects "sfr_seg_pars.config" to exist
Expects nam_file +"backup.sfr" to exist
apply_sfr_reach_obs()
apply the sfr reach observation process.
Returns:
| Type | Description |
|---|---|
|
|
Note
This is the companion function of gw_utils.setup_sfr_reach_obs().
Requires sfr_reach_obs.config.
Writes
apply_sfr_seg_parameters(seg_pars=True, reach_pars=False)
apply the SFR segment multiplier parameters.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
seg_pars
|
`bool`
|
flag to apply segment-based parameters. Default is True |
True
|
reach_pars
|
`bool`
|
flag to apply reach-based parameters. Default is False |
False
|
Returns:
| Type | Description |
|---|---|
|
flopy.modflow.ModflowSfr: the modified SFR package instance |
Note
Expects "sfr_seg_pars.config" to exist
Expects nam_file +"backup.sfr" to exist
apply_sft_obs()
process an mt3d-usgs sft ASCII output file using a previous-written config file
Returns:
| Type | Description |
|---|---|
|
pandas.DataFrame: a dataframe of extracted simulated outputs |
Note
This is the companion function to gw_utils.setup_sft_obs().
apply_threshold_pars(csv_file)
apply the thresholding process. everything keys off of csv_file name...
Note: if the standard deviation of the continuous thresholding array is too low, the line search will fail. Currently, if this stdev is less than 1.e-10, then a homogeneous array of the first category fill value will be created. User beware!
autocorrelated_draw(pst, struct_dict, time_distance_col='distance', num_reals=100, verbose=True, enforce_bounds=False, draw_ineq=False, rng=None)
construct an autocorrelated observation noise ensemble from covariance matrices implied by geostatistical structure(s).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
pst
|
`pyemu.Pst`
|
a control file (or the name of control file). The
information in the |
required |
time_distance_col
|
str
|
the column in |
'distance'
|
struct_dict
|
`dict`
|
a dict of GeoStruct (or structure file), and list of observation names. |
required |
num_reals
|
`int`
|
number of realizations to draw. Default is 100 |
100
|
verbose
|
`bool`
|
flag to control output to stdout. Default is True. flag for stdout. |
True
|
enforce_bounds
|
`bool`
|
flag to enforce |
False
|
draw_ineq
|
`bool`
|
flag to generate noise realizations for inequality observations. If False, noise will not be added inequality observations in the ensemble. Default is False |
False
|
rng
|
`numpy.random.RandomState`
|
random number generator if not using default from pyemu.en |
None
|
Returns pyemu.ObservationEnsemble: the realized noise ensemble added to the observation values in the control file.
Note
The variance of each observation is used to scale the resulting geostatistical
covariance matrix (as defined by the weight or optional standard deviation.
Therefore, the sill of the geostatistical structures
in struct_dict should be 1.0
Example::
pst = pyemu.Pst("my.pst")
#assuming there is only one timeseries of observations
# and they are spaced one time unit apart
pst.observation_data.loc[:,"distance"] = np.arange(pst.nobs)
v = pyemu.geostats.ExpVario(a=10) #units of `a` are time units
gs = pyemu.geostats.Geostruct(variograms=v)
sd = {gs:["obs1","obs2",""obs3]}
oe = pyemu.helpers.autocorrelated_draws(pst,struct_dict=sd}
oe.to_csv("my_oe.csv")
build_jac_test_csv(pst, num_steps, par_names=None, forward=True)
build a dataframe of jactest inputs for use with pestpp-swp
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
pst
|
`pyemu.Pst`
|
existing control file |
required |
num_steps
|
`int`
|
number of perturbation steps for each parameter |
required |
par_names [`str`]
|
list of parameter names of pars to test. If None, all adjustable pars are used. Default is None |
required | |
forward
|
`bool`
|
flag to start with forward perturbations. Default is True |
True
|
Returns:
| Type | Description |
|---|---|
|
|
|
|
for the jactesting. |
calc_array_par_summary_stats(arr_par_file='mult2model_info.csv')
read and generate summary statistics for the resulting model input arrays from applying array par multipliers
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
arr_par_file
|
`str`
|
the array multiplier key file |
'mult2model_info.csv'
|
Returns:
| Type | Description |
|---|---|
|
pd.DataFrame: dataframe of summary stats for each model_file entry |
Note
This function uses an optional "zone_file" column in the arr_par_file. If multiple zones
files are used, then zone arrays are aggregated to a single array
"dif" values are original array values minus model input array values
The outputs from the function can be used to monitor model input array changes that occur during PE/UQ analyses, especially when the parameters are multiplier types and the dimensionality is very high.
Consider using PstFrom.add_observations() to setup obs for the csv file
that this function writes.
calc_metric_ensemble(ens, pst, metric='all', bygroups=True, subset_realizations=None, drop_zero_weight=True)
Calculates unweighted metrics to quantify fit to observations for ensemble members
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
ens
|
pandas DataFrame
|
DataFrame read from an observation |
required |
pst
|
pyemu.Pst object
|
needed to obtain observation values and weights |
required |
metric
|
list of str
|
metric to calculate (PBIAS, RMSE, MSE, NSE, MAE, NRMSE_SD, NRMSE_MEAN, NRMSE_IQ, NRMSE_MAXMIN) case insensitive Defaults to 'all' which calculates all available metrics |
'all'
|
bygroups
|
Bool
|
Flag to summarize by groups or not. Defaults to True. |
True
|
subset_realizations
|
iterable
|
Subset of realizations for which to report metric. Defaults to None which returns all realizations. |
None
|
drop_zero_weight
|
Bool
|
flag to exclude zero-weighted observations |
True
|
Returns: pandas.DataFrame: rows are realizations. Columns are groups. Content is requested metrics
calc_metric_res(res, metric='all', bygroups=True, drop_zero_weight=True)
Calculates unweighted metrics to quantify fit to observations for residuals
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
res
|
pandas DataFrame or filename
|
DataFrame read from a residuals file or filename |
required |
metric
|
list of str
|
metric to calculate (PBIAS, RMSE, MSE, NSE, MAE, NRMSE_SD, NRMSE_MEAN, NRMSE_IQ, NRMSE_MAXMIN) case insensitive Defaults to 'all' which calculates all available metrics |
'all'
|
bygroups
|
Bool
|
Flag to summarize by groups or not. Defaults to True. |
True
|
drop_zero_weight
|
Bool
|
flag to exclude zero-weighted observations |
True
|
Returns:
| Type | Description |
|---|---|
|
pandas.DataFrame: single row. Columns are groups. Content is requested metrics |
calc_observation_ensemble_quantiles(ens, pst, quantiles, subset_obsnames=None, subset_obsgroups=None)
Given an observation ensemble, and requested quantiles, this function calculates the requested quantile point-by-point in the ensemble. This resulting set of values does not, however, correspond to a single realization in the ensemble. So, this function finds the minimum weighted squared distance to the quantile and labels it in the ensemble. Also indicates which realizations correspond to the selected quantiles.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
ens
|
pandas DataFrame
|
DataFrame read from an observation |
required |
quantiles
|
iterable
|
quantiles ranging from 0-1.0 for which results requested |
required |
subset_obsnames
|
iterable
|
list of observation names to include in calculations |
None
|
subset_obsgroups
|
iterable
|
list of observation groups to include in calculations |
None
|
Returns:
| Type | Description |
|---|---|
|
tuple containing |
|
|
|
|
calc_phi(pst_name)
runtime function to calculate phi components from current outfiles
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
pst_name
|
control file name |
required |
Returns:
| Name | Type | Description |
|---|---|---|
DataFrame |
phi components |
calc_rmse_ensemble(ens, pst, bygroups=True, subset_realizations=None)
DEPRECATED -->please see pyemu.utils.metrics.calc_metric_ensemble() Calculates RMSE (without weights) to quantify fit to observations for ensemble members
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
ens
|
pandas DataFrame
|
DataFrame read from an observation |
required |
bygroups
|
Bool
|
Flag to summarize by groups or not. Defaults to True. |
True
|
subset_realizations
|
iterable
|
Subset of realizations for which to report RMSE. Defaults to None which returns all realizations. |
None
|
Returns:
| Type | Description |
|---|---|
|
pandas.DataFrame: rows are realizations. Columns are groups. Content is RMSE |
dataframe_to_smp(dataframe, smp_filename, name_col='name', datetime_col='datetime', value_col='value', datetime_format='dd/mm/yyyy', value_format='{0:15.6E}', max_name_len=12)
write a dataframe as an smp file
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
dataframe
|
`pandas.DataFrame`
|
the dataframe to write to an SMP file. This dataframe should be in "long" form - columns for site name, datetime, and value. |
required |
smp_filename
|
`str`
|
smp file to write |
required |
name_col
|
`str`,optional
|
the name of the dataframe column that contains the site name. Default is "name" |
'name'
|
datetime_col
|
`str`
|
the column in the dataframe that the datetime values. Default is "datetime". |
'datetime'
|
value_col
|
`str`
|
the column in the dataframe that is the values |
'value'
|
datetime_format
|
`str`
|
The format to write the datetimes in the smp file. Can be either 'dd/mm/yyyy' or 'mm/dd/yyy'. Default is 'dd/mm/yyyy'. |
'dd/mm/yyyy'
|
value_format
|
`str`
|
a python float-compatible format. Default is "{0:15.6E}". |
'{0:15.6E}'
|
Example::
pyemu.smp_utils.dataframe_to_smp(df,"my.smp")
draw_by_group(pst, num_reals=100, sigma_range=6, use_specsim=False, struct_dict=None, delr=None, delc=None, scale_offset=True, echo=True, logger=False, rng=None)
Draw a parameter ensemble from the distribution implied by the initial parameter values in the control file and a prior parameter covariance matrix derived from grouped geostructures. Previously in pst_from.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
pst
|
`pyemu.Pst`
|
a control file instance |
required |
num_reals
|
`int`
|
the number of realizations to draw |
100
|
sigma_range
|
`int`
|
number of standard deviations represented by parameter bounds. Default is 6 (99% confidence). 4 would be approximately 95% confidence bounds |
6
|
use_specsim
|
`bool`
|
flag to use spectral simulation for grid-scale pars (highly recommended). Default is False |
False
|
struct_dict
|
`dict`
|
a dict with keys of GeoStruct (or structure file).
Dictionary values can depend on the values of |
None
|
delr
|
`list`
|
required for specsim ( |
None
|
delc
|
`list`
|
required for specsim ( |
None
|
scale_offset
|
`bool`
|
flag to apply scale and offset to parameter bounds before calculating prior variance. Dfault is True. If you are using non-default scale and/or offset and you get an exception during draw, try changing this value to False. |
True
|
echo
|
`bool`
|
Verbosity flag passed to new Logger instance if
|
True
|
logger
|
`pyemu.Logger`
|
Object for logging process |
False
|
rng
|
`numpy.random.RandomState`
|
random number generator if not using default from pyemu.en |
None
|
Returns:
| Type | Description |
|---|---|
|
|
Note
This method draws by parameter group
If you are using grid-style parameters, please use spectral simulation (use_specsim=True)
fac2real(pp_file=None, factors_file='factors.dat', out_file=None, upper_lim=1e+30, lower_lim=-1e+30, fill_value=1e+30, try_ppu=True, shape=None, mpts=None, transform='none', fac_ftype=0)
A replication of the PEST fac2real utility.
For creating an array from previously calculated kriging factors (weights). Will attempt to use pypestutils implementation if available and try_ppu is True, otherwise falls back to built-in pyemu implementation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
pp_file
|
str or DataFrame
|
PEST-type pilot points file or DataFrame with at least 'name' and 'parval1' columns. |
None
|
factors_file
|
path - like
|
PEST-style factors file |
'factors.dat'
|
out_file
|
path - like
|
filename of array to write. If None, array is returned, else value of out_file is returned. Default is None. |
None
|
upper_lim
|
`float`
|
maximum interpolated value in the array. Values greater than
|
1e+30
|
lower_lim
|
`float`
|
minimum interpolated value in the array. Values less than
|
-1e+30
|
fill_value
|
`float`
|
the value to assign array nodes that are not interpolated |
1e+30
|
try_ppu
|
`bool`
|
try to use pypestutils function. Default is True. Note, this will likely fail is fac file was not set up through ppu methods, as currently defaulting to binary option for speed. |
True
|
PPU Specific Args
shape (tuple of int or str): ppu specific argument giving shape of the
output array. Required for ppu options. NOT USED IF try_ppu
IS False.
mpts: (int) ppu specific argument for number of target points for interp.
If not passed this will be implied from 'shape'. NOT USED IF try_ppu
IS False.
transform: (str) ppu specific argument for the data transform for interp.
'none' and 'log' are supported. Default is 'none'. NOT USED IF try_ppu
IS False.
fac_ftype: (int) ppu specific argument for the factors file type.
Default is 0 (binary). NOT USED IF try_ppu IS False.
Returns:
| Type | Description |
|---|---|
|
|
|
|
|
Example::
pyemu.utils.geostats.fac2real("hkpp.dat",out_file="hk_layer_1.ref")
first_order_pearson_tikhonov(pst, cov, reset=True, abs_drop_tol=0.001)
setup preferred-difference regularization from a covariance matrix.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
pst
|
`pyemu.Pst`
|
the PEST control file |
required |
cov
|
`pyemu.Cov`
|
a covariance matrix instance with
some or all of the parameters listed in |
required |
reset
|
`bool`
|
a flag to remove any existing prior information equations in the control file. Default is True |
True
|
abs_drop_tol
|
`float`
|
tolerance to control how many pi equations are written. If the absolute value of the Pearson CC is less than abs_drop_tol, the prior information equation will not be included in the control file. |
0.001
|
Note
The weights on the prior information equations are the Pearson correlation coefficients implied by covariance matrix.
Operates in place
Example::
pst = pyemu.Pst("my.pst")
cov = pyemu.Cov.from_ascii("my.cov")
pyemu.helpers.first_order_pearson_tikhonov(pst,cov)
pst.write("my_reg.pst")
geostatistical_draws(pst, struct_dict, num_reals=100, sigma_range=4, verbose=True, scale_offset=True, subset=None, rng=None)
construct a parameter ensemble from a prior covariance matrix implied by geostatistical structure(s) and parameter bounds.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
pst
|
`pyemu.Pst`
|
a control file (or the name of control file). The
parameter bounds in |
required |
struct_dict
|
`dict`
|
a dict of GeoStruct (or structure file), and list of
pilot point template files pairs. If the values in the dict are
|
required |
num_reals
|
`int`
|
number of realizations to draw. Default is 100 |
100
|
sigma_range
|
`float`
|
a float representing the number of standard deviations implied by parameter bounds. Default is 4.0, which implies 95% confidence parameter bounds. |
4
|
verbose
|
`bool`
|
flag to control output to stdout. Default is True. flag for stdout. |
True
|
scale_offset
|
`bool`,optional
|
flag to apply scale and offset to parameter bounds
when calculating variances - this is passed through to |
True
|
subset
|
`array-like`
|
list, array, set or pandas index defining subset of parameters for draw. |
None
|
rng
|
`numpy.random.RandomState`
|
random number generator if not using default from pyemu.en |
None
|
Returns pyemu.ParameterEnsemble: the realized parameter ensemble.
Note
Parameters are realized by parameter group.
The variance of each parameter is used to scale the resulting geostatistical
covariance matrix Therefore, the sill of the geostatistical structures
in struct_dict should be 1.0
Example::
pst = pyemu.Pst("my.pst")
sd = {"struct.dat":["hkpp.dat.tpl","vka.dat.tpl"]}
pe = pyemu.helpers.geostatistical_draws(pst,struct_dict=sd}
pe.to_csv("my_pe.csv")
geostatistical_prior_builder(pst, struct_dict, sigma_range=4, verbose=False, scale_offset=False)
construct a full prior covariance matrix using geostastical structures and parameter bounds information.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
pst
|
`pyemu.Pst`
|
a control file instance (or the name of control file) |
required |
struct_dict
|
`dict`
|
a dict of GeoStruct (or structure file), and list of
pilot point template files pairs. If the values in the dict are
|
required |
sigma_range
|
`float`
|
a float representing the number of standard deviations implied by parameter bounds. Default is 4.0, which implies 95% confidence parameter bounds. |
4
|
verbose
|
`bool`
|
flag to control output to stdout. Default is True. flag for stdout. |
False
|
scale_offset
|
`bool`
|
a flag to apply scale and offset to parameter upper and lower bounds before applying log transform. Passed to pyemu.Cov.from_parameter_data(). Default is False |
False
|
Returns:
| Type | Description |
|---|---|
|
pyemu.Cov: a covariance matrix that includes all adjustable parameters in the control |
|
|
file. |
Note
The covariance of parameters associated with geostatistical structures is defined as a mixture of GeoStruct and bounds. That is, the GeoStruct is used to construct a pyemu.Cov, then the rows and columns of the pyemu.Cov block are scaled by the uncertainty implied by the bounds and sigma_range. Most users will want to sill of the geostruct to sum to 1.0 so that the resulting covariance matrices have variance proportional to the parameter bounds. Sounds complicated...
If the number of parameters exceeds about 20,000 this function may use all available memory
then crash your computer. In these high-dimensional cases, you probably dont need the prior
covariance matrix itself, but rather an ensemble of paraaeter realizations. In this case,
please use the geostatistical_draws() function.
Example::
pst = pyemu.Pst("my.pst")
sd = {"struct.dat":["hkpp.dat.tpl","vka.dat.tpl"]}
cov = pyemu.helpers.geostatistical_prior_builder(pst,struct_dict=sd}
cov.to_binary("prior.jcb")
get_filepath(folder, filename)
Return a path to a file within a folder, without repeating the folder in the output path, if the input filename (path) already contains the folder.
get_maha_obs_summary(sim_en, l1_crit_val=6.34, l2_crit_val=9.2, rng=None)
calculate the 1-D and 2-D mahalanobis distance between simulated ensemble and observed values. Used for detecting prior-data conflict
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
sim_en
|
`pyemu.ObservationEnsemble`
|
a simulated outputs ensemble |
required |
l1_crit_val
|
`float`
|
the chi squared critical value for the 1-D mahalanobis distance. Default is 6.4 (p=0.01,df=1) |
6.34
|
l2_crit_val
|
`float`
|
the chi squared critical value for the 2-D mahalanobis distance. Default is 9.2 (p=0.01,df=2) |
9.2
|
rng
|
RandomState
|
random number generator if not using default from pyemu.en |
None
|
Returns:
tuple containing
- **pandas.DataFrame**: 1-D subspace squared mahalanobis distances
that exceed the `l1_crit_val` threshold
- **pandas.DataFrame**: 2-D subspace squared mahalanobis distances
that exceed the `l2_crit_val` threshold
Note
Noise realizations are added to sim_en to account for measurement
noise.
get_relative_filepath(folder, filename)
Like :func:~pyemu.utils.pst_from.get_filepath, except
return path for filename relative to folder.
get_zoned_ppoints_for_vertexgrid(spacing, zone_array, mg, zone_number=None, add_buffer=True)
Generate equally spaced pilot points for active area of DISV type MODFLOW6 grid.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
spacing
|
`float`
|
spacing in model length units between pilot points. |
required |
zone_array
|
`numpy.ndarray`
|
the modflow 6 idomain integer array. This is used to set pilot points only in active areas and to assign zone numbers. |
required |
mg
|
`flopy.discretization.vertexgrid.VertexGrid`
|
a VertexGrid flopy discretization derived type. |
required |
zone_number
|
`int`
|
zone number |
None
|
add_buffer
|
`boolean`
|
specifies whether pilot points ar eplaced within a buffer zone of size |
True
|
Returns:
| Type | Description |
|---|---|
|
|
Example::
get_zoned_ppoints_for_vertexgrid(spacing=100, ib=idomain, mg, zone_number=1, add_buffer=False)
gpr_forward_run()
the function to evaluate a set of inputs thru the GPR emulators. This function gets added programmatically to the forward run process
gslib_2_dataframe(filename, attr_name=None, x_idx=0, y_idx=1)
function to read a GSLIB point data file into a pandas.DataFrame
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
filename
|
`str`
|
GSLIB file |
required |
attr_name
|
`str`
|
the column name in the dataframe for the attribute.
If None, GSLIB file can have only 3 columns. |
None
|
x_idx
|
`int`
|
the index of the x-coordinate information in the GSLIB file. Default is 0 (first column) |
0
|
y_idx
|
`int`
|
the index of the y-coordinate information in the GSLIB file. Default is 1 (second column) |
1
|
Returns:
| Type | Description |
|---|---|
|
|
Note
assigns generic point names ("pt0, pt1, etc)
Example::
df = pyemu.utiils.geostats.gslib_2_dataframe("prop.gslib",attr_name="hk")
jco_from_pestpp_runstorage(rnj_filename, pst_filename)
read pars and obs from a pest++ serialized run storage file (e.g., .rnj) and return jacobian matrix instance
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
rnj_filename
|
`str`
|
the name of the run storage file |
required |
pst_filename
|
`str`
|
the name of the pst file |
required |
Note
This can then be passed to Jco.to_binary or Jco.to_coo, etc., to write jco file in a subsequent step to avoid memory resource issues associated with very large problems.
Returns:
| Type | Description |
|---|---|
|
|
|
|
pest control file information. |
kl_apply(par_file, basis_file, par_to_file_dict, arr_shape)
Apply a KL parameterization transform from basis factors to model input arrays.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
par_file
|
`str`
|
the csv file to get factor values from. Must contain the following columns: "name", "new_val", "org_val" |
required |
basis_file
|
`str`
|
the PEST-style binary file that contains the reduced basis |
required |
par_to_file_dict
|
`dict`
|
a mapping from KL parameter prefixes to array file names. |
required |
arr_shape
|
tuple
|
a length 2 tuple of number of rows and columns the resulting arrays should have. |
required |
Note
|
This is the companion function to kl_setup. This function should be called during the forward run |
required |
kl_setup(num_eig, sr, struct, prefixes, factors_file='kl_factors.dat', islog=True, basis_file=None, tpl_dir='.')
setup a karhuenen-Loeve based parameterization for a given geostatistical structure.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
num_eig
|
`int`
|
the number of basis vectors to retain in the reduced basis |
required |
sr
|
`flopy.reference.SpatialReference`
|
a spatial reference instance |
required |
struct
|
`str`
|
a PEST-style structure file. Can also be a
|
required |
prefixes
|
[`str`]
|
a list of parameter prefixes to generate KL parameterization for. |
required |
factors_file
|
`str`
|
name of the PEST-style interpolation factors file to write (can be processed with FAC2REAL). Default is "kl_factors.dat". |
'kl_factors.dat'
|
islog
|
`bool`
|
flag to indicate if the parameters are log transformed. Default is True |
True
|
basis_file
|
`str`
|
the name of the PEST-style binary (e.g. jco) file to write the reduced basis vectors to. Default is None (not saved). |
None
|
tpl_dir
|
`str`
|
the directory to write the resulting template files to. Default is "." (current directory). |
'.'
|
Returns:
| Type | Description |
|---|---|
|
pandas.DataFrame: a dataframe of parameter information. |
Note
This is the companion function to helpers.apply_kl()
Example::
m = flopy.modflow.Modflow.load("mymodel.nam")
prefixes = ["hk","vka","ss"]
df = pyemu.helpers.kl_setup(10,m.sr,"struct.dat",prefixes)
last_kstp_from_kper(hds, kper)
function to find the last time step (kstp) for a give stress period (kper) in a modflow head save file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
hds
|
`flopy.utils.HeadFile`
|
head save file |
required |
kper
|
`int`
|
the zero-index stress period number |
required |
Returns:
| Type | Description |
|---|---|
|
int: the zero-based last time step during stress period |
|
|
kper in the head save file |
load_sfr_out(sfr_out_file, selection=None)
load an ASCII SFR output file into a dictionary of kper: dataframes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
sfr_out_file
|
`str`
|
SFR ASCII output file |
required |
selection
|
`pandas.DataFrame`
|
a dataframe of |
None
|
Returns:
| Type | Description |
|---|---|
|
dict: dictionary of {kper: |
Note
Aggregates flow to aquifer for segments and returns and flow out at downstream end of segment.
load_sgems_exp_var(filename)
read an SGEM experimental variogram into a sequence of pandas.DataFrames
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
filename
|
`str`
|
an SGEMS experimental variogram XML file |
required |
Returns:
| Type | Description |
|---|---|
|
[ |
|
|
division in the experimental variogram |
maha_based_pdc(sim_en)
prototype for detecting prior-data conflict following Alfonso and Oliver 2019
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
sim_en
|
`pyemu.ObservationEnsemble`
|
a simulated outputs ensemble |
required |
Returns:
tuple containing
- **pandas.DataFrame**: 1-D subspace squared mahalanobis distances
that exceed the `l1_crit_val` threshold
- **pandas.DataFrame**: 2-D subspace squared mahalanobis distances
that exceed the `l2_crit_val` threshold
Note
Noise realizations are added to sim_en to account for measurement
noise.
modflow_hob_to_instruction_file(hob_file, ins_file=None)
write an instruction file for a modflow head observation file
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
hob_file
|
`str`
|
the path and name of the existing modflow hob file |
required |
ins_file
|
`str`
|
the name of the instruction file to write.
If |
None
|
Returns:
| Type | Description |
|---|---|
|
pandas.DataFrame: a dataFrame with control file observation information |
modflow_hydmod_to_instruction_file(hydmod_file, ins_file=None)
write an instruction file for a modflow hydmod file
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
hydmod_file
|
`str`
|
the path and name of the existing modflow hob file |
required |
ins_file
|
`str`
|
the name of the instruction file to write.
If |
None
|
Returns:
| Type | Description |
|---|---|
|
pandas.DataFrame: a dataFrame with control file observation information |
Note
calls pyemu.gw_utils.modflow_read_hydmod_file()
modflow_pval_to_template_file(pval_file, tpl_file=None)
write a template file for a modflow parameter value file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
pval_file
|
`str`
|
the path and name of the existing modflow pval file |
required |
tpl_file
|
`str`
|
template file to write. If None, use
|
None
|
Note: Uses names in the first column in the pval file as par names.
Returns:
| Type | Description |
|---|---|
|
pandas.DataFrame: a dataFrame with control file parameter information |
modflow_read_hydmod_file(hydmod_file, hydmod_outfile=None)
read a binary hydmod file and return a dataframe of the results
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
hydmod_file
|
`str`
|
The path and name of the existing modflow hydmod binary file |
required |
hydmod_outfile
|
`str`
|
output file to write. If |
None
|
Returns:
| Type | Description |
|---|---|
|
pandas.DataFrame: a dataFrame with hymod_file values |
modflow_sfr_gag_to_instruction_file(gage_output_file, ins_file=None, parse_filename=False)
writes an instruction file for an SFR gage output file to read Flow only at all times
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
gage_output_file
|
`str`
|
the gage output filename (ASCII). |
required |
ins_file
|
`str`
|
the name of the instruction file to
create. If None, the name is |
None
|
parse_filename
|
`bool`
|
if True, get the gage_num parameter by parsing the gage output file filename if False, get the gage number from the file itself |
False
|
Returns:
| Type | Description |
|---|---|
|
tuple containing |
|
|
|
|
|
|
Note
Sets up observations for gage outputs only for the Flow column.
If parse_namefile is true, only text up to first '.' is used as the gage_num
org_fac2real(factors_file, pp_file=None, pp_data=None, fill_value=1e+30)
original pyemu implementation of fac2real Args: factors_file (str): filename of PEST-style factors file pp_file (str): filename of PEST-style pilot points file pp_data (pd.DataFrame): dataframe of pilot point data with at least 'name' and 'parval1' columns.
parse_dir_for_io_files(d, prepend_path=False)
find template/input file pairs and instruction file/output file pairs by extension.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
d
|
`str`
|
directory to search for interface files |
required |
prepend_path
|
`bool`
|
flag to prepend |
False
|
Returns:
| Type | Description |
|---|---|
|
tuple containing |
|
|
|
|
|
|
|
|
Note
the return values from this function can be passed straight to
pyemu.Pst.from_io_files() classmethod constructor.
Assumes the template file names are
Example::
files = pyemu.helpers.parse_dir_for_io_files("template",prepend_path=True)
pst = pyemu.Pst.from_io_files(*files,pst_path=".")
parse_rmr_file(rmr_file)
parse a run management record file into a data frame of tokens
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
rmr_file
|
`str`
|
an rmr file name |
required |
Returns:
| Type | Description |
|---|---|
|
pd.DataFrame: a dataframe of timestamped information |
Note
only supports rmr files generated by pest++ version >= 5.1.21
parse_tpl_file(tpl_file)
parse a PEST-style template file to get the parameter names
Args:
tpl_file (str): path and name of a template file
Returns:
| Type | Description |
|---|---|
|
[ |
Example::
par_names = pyemu.pst_utils.parse_tpl_file("my.tpl")
pilot_points_from_shapefile(shapename)
read pilot points from shapefile into a dataframe
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
shapename
|
`str`
|
the shapefile name to read. |
required |
Notes
requires pyshp
pilot_points_to_tpl(pp_file, tpl_file=None, name_prefix=None)
write a template file for a pilot points file
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
pp_file
|
( |
required | |
tpl_file
|
`str`
|
template file name to write. If None,
|
None
|
name_prefix
|
`str`
|
name to prepend to parameter names for each
pilot point. For example, if |
None
|
Returns:
| Type | Description |
|---|---|
|
|
|
|
(name,x,y,zone,parval1) with the parameter information |
|
|
(parnme,tpl_str) |
Example::
pyemu.pp_utils.pilot_points_to_tpl("my_pps.dat",name_prefix="my_pps")
pp_file_to_dataframe(pp_filename)
read a space-delim pilot point file to a pandas Dataframe
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
pp_filename
|
`str`
|
path and name of an existing pilot point file |
required |
Returns:
| Type | Description |
|---|---|
|
|
Example::
df = pyemu.pp_utils.pp_file_to_dataframe("my_pp.dat")
pp_tpl_to_dataframe(tpl_filename)
read a pilot points template file to a pandas dataframe
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
tpl_filename
|
`str`
|
path and name of an existing pilot points template file |
required |
Returns:
| Type | Description |
|---|---|
|
|
Notes
Use for processing pilot points since the point point file itself may have generic "names".
Example::
df = pyemu.pp_utils.pp_tpl_file_to_dataframe("my_pp.dat.tpl")
ppu_fac2real(source_vals, factor_filename, mpts, out_shape, ppulib=None, transform='none', fac_ftype='binary', krigtype='ordinary', noint=None)
Calculate array from pre-defined kriging factors using pypestutils
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
source_vals
|
ndarray
|
array of values for interpolating (e.g pilot point values): |
required |
factor_filename
|
str
|
filename of pre-built kriging factors |
required |
mpts
|
int
|
number of points to interpolate to. |
required |
out_shape
|
tuple
|
shape of output array. Used to reshape PPU results |
required |
ppulib
|
PestUtilsLib
|
Pre-instantiated pypestutils PestUtilsLib object. If None, a new instance will be created and cleaned up within this call. |
None
|
transform
|
str
|
Interpolation of transformed values. Options are "none" and "log". |
'none'
|
fac_ftype
|
str or int
|
factor_filename file type. Options are '0:binary' and '1:text'. Application to binary factor files is more efficient. |
'binary'
|
krigtype
|
int or str
|
kriging type. Options are 'ordinary' or 1 for ordinary kriging, 'simple' or 0 for simple kriging. |
'ordinary'
|
noint
|
float
|
Fallback interpolation value for locations where no interpolation is possible. If None, defaults to 1.0 for 'none' transform and 0.0 for 'log' transform. |
None
|
Returns:
prep_for_gpr(pst_fname, input_fnames, output_fnames, gpr_t_d='gpr_template', t_d='template', gp_kernel=None, nverf=0, plot_fits=False, apply_standard_scalar=False, include_emulated_std_obs=False)
helper function to setup a gaussian-process-regression (GPR) emulator for outputs of interest. This is primarily targeted at low-dimensional settings like those encountered in PESTPP-MOU
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
pst_fname
|
str
|
existing pest control filename |
required |
input_fnames
|
str | list[str]
|
usually a list of decision variable population files |
required |
output_fnames
|
str | list[str]
|
usually a list of observation population files that
corresponds to the simulation results associated with |
required |
gpr_t_d
|
str
|
the template file dir to create that will hold the GPR emulators |
'gpr_template'
|
t_d
|
str
|
the template dir containing the PESTPP-MOU outputs that the GPR emulators are trained on |
'template'
|
gp_kernel
|
sklearn GaussianProcess kernel
|
the kernel to use. if None, a standard RBF kernel is created and used |
None
|
nverf
|
int
|
the number of input-output pairs to hold back for a simple verification test |
0
|
plot_fits
|
bool
|
flag to plot the fit GPRs |
False
|
apply_standard_scalar
|
bool
|
flag to apply sklearn.preprocessing.StandardScaler transform before training/executing the emulator. Default is False |
False
|
include_emulated_std_obs
|
bool
|
flag to include the estimated standard deviation in the predicted response of each GPR emulator. If True, additional obserations are added to the GPR pest interface , one for each nominated observation quantity. Can be very useful for designing in-filling strategies |
False
|
Returns:
| Type | Description |
|---|---|
|
None |
Note
requires scikit-learn
prep_pp_hyperpars(file_tag, pp_filename, pp_info, out_filename, grid_dict, geostruct, arr_shape, pp_options, zone_array=None, ws='.')
Make preparation for hyperparameter ppu setup.
Writes files that to be used at apply time incl: ".gridinfo.dat" : target model cell location definition ".corrlen.dat", ".bearing.dat", ".aniso.dat", ".zone.dat" : arrays of hyperparamters to be used by calc_kriging_factors_2d() at apply time these arrays can also be paramterised (e.g. within a PstFrom add_pars setup).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
file_tag
|
str
|
tag for hyperpar files (will be cleaned of illegal chars) |
required |
pp_filename
|
str
|
pilot point file (filled by PEST) |
required |
pp_info
|
DataFrame
|
DataFrame of pilotpoint data -- written to pp_filename temporarily for testing apply_ppu_hyperpars func |
required |
out_filename
|
Path - like
|
Location of result of applying kriging interpolation after hyperpar factor calcs |
required |
grid_dict
|
dict
|
Dictionary of {node: (x,y)} for interp target. Note, sorted node keys defines the order for output arrays and of zone_array (if passed). |
required |
geostruct
|
GeoStruct
|
Geostructure used for filling initial hyperpar arrays. Note, Geostruct should only contain one variogram |
required |
arr_shape
|
(list - like, int)
|
shape of array in output file |
required |
pp_options
|
dict
|
Dictionary of pilot-point definition options to propagate to facotr calculation. Incl. "search_radius", "search_dist", "maxpts_interp", "minpts_interp", "fill_value" |
required |
zone_array
|
array - like
|
array of integer zonal values for target array locations. Default will be all target locations (in grid_dict) assigned to same zone. |
None
|
ws
|
Path - like
|
Location of saving files. |
'.'
|
Returns:
pst_from_io_files(tpl_files, in_files, ins_files, out_files, pst_filename=None, pst_path=None)
create a Pst instance from model interface files.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
tpl_files
|
[`str`]
|
list of template file names |
required |
in_files
|
[`str`]
|
list of model input file names (pairs with template files) |
required |
ins_files
|
[`str`]
|
list of instruction file names |
required |
out_files
|
[`str`]
|
list of model output file names (pairs with instruction files) |
required |
pst_filename
|
`str`
|
name of control file to write. If None, no file is written. Default is None |
None
|
pst_path
|
`str`
|
the path to append to the template_file and in_file in the control file. If
not None, then any existing path in front of the template or in file is split off
and pst_path is prepended. If python is being run in a directory other than where the control
file will reside, it is useful to pass |
None
|
Returns:
| Type | Description |
|---|---|
|
|
|
|
found in |
Note
calls pyemu.helpers.pst_from_io_files()
Assigns generic values for parameter info. Tries to use INSCHEK to set somewhat meaningful observation values
all file paths are relatively to where python is running.
Example::
tpl_files = ["my.tpl"]
in_files = ["my.in"]
ins_files = ["my.ins"]
out_files = ["my.out"]
pst = pyemu.Pst.from_io_files(tpl_files,in_files,ins_files,out_files)
pst.control_data.noptmax = 0
pst.write("my.pst)
pst_from_parnames_obsnames(parnames, obsnames, tplfilename='model.input.tpl', insfilename='model.output.ins', out_dir='.')
Creates a Pst object from a list of parameter names and a list of observation names.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
parnames
|
`str`
|
list of parameter names |
required |
obsnames
|
`str`
|
list of observation names |
required |
tplfilename
|
`str`
|
template filename. Default is "model.input.tpl" |
'model.input.tpl'
|
insfilename
|
`str`
|
instruction filename. Default is "model.output.ins" |
'model.output.ins'
|
out_dir
|
str
|
Directory where template and instruction files should be saved. Default is the current working directory (".") |
'.'
|
Returns:
| Type | Description |
|---|---|
|
|
Example::
parnames = ["p1","p2"]
obsnames = ["o1","o2"]
pst = pyemu.helpers.pst_from_parnames_obsnames(parname,obsnames)
read_pestpp_runstorage(filename, irun=0, with_metadata=False)
read pars and obs from a specific run in a pest++ serialized run storage file (e.g. .rns/.rnj) into dataframes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
filename
|
`str`
|
the name of the run storage file |
required |
irun
|
`int`
|
the run id to process. If 'all', then all runs are read. Default is 0 |
0
|
with_metadata
|
`bool`
|
flag to return run stats and info txt as well |
False
|
Returns:
| Type | Description |
|---|---|
|
tuple containing |
|
|
|
|
|
|
Note:
This function can save you heaps of misery of your pest++ run
died before writing output files...
read_sgems_variogram_xml(xml_file, return_type=GeoStruct)
function to read an SGEMS-type variogram XML file into
a GeoStruct
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
xml_file
|
`str`
|
SGEMS variogram XML file |
required |
return_type
|
`object`
|
the instance type to return. Default is |
GeoStruct
|
Returns:
| Name | Type | Description |
|---|---|---|
gs |
|
Example::
gs = pyemu.utils.geostats.read_sgems_variogram_xml("sgems.xml")
read_struct_file(struct_file, return_type=GeoStruct)
read an existing PEST-type structure file into a GeoStruct instance
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
struct_file
|
`str`
|
existing pest-type structure file |
required |
return_type
|
`object`
|
the instance type to return. Default is GeoStruct |
GeoStruct
|
Returns:
| Type | Description |
|---|---|
|
[ |
|
|
only one |
Example::
gs = pyemu.utils.geostats.reads_struct_file("struct.dat")
run(cmd_str, cwd='.', verbose=False, use_sp=False, **kwargs)
main run function so both run_sp and run_ossystem can coexist
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
cmd_str
|
`str`
|
the str to execute with |
required |
cwd
|
`str`
|
the directory to execute the command in. Default is ".". |
'.'
|
verbose
|
`bool`
|
flag to echo to stdout the |
False
|
Notes:
by default calls run_ossystem which is the OG function from pyemu that uses os.system()
if use_sp is True, then run_sp is called which uses subprocess.Popen instead of os.system
Example:: pyemu.os_utils.run("pestpp-ies my.pst",cwd="template")
run_ossystem(cmd_str, cwd='.', verbose=False)
an OS agnostic function to execute a command line
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
cmd_str
|
`str`
|
the str to execute with |
required |
cwd
|
`str`
|
the directory to execute the command in. Default is ".". |
'.'
|
verbose
|
`bool`
|
flag to echo to stdout the |
False
|
Notes
uses platform to detect OS and adds .exe suffix or ./ prefix as appropriate
if os.system returns non-zero, an exception is raised
Example::
pyemu.os_utils.run("pestpp-ies my.pst",cwd="template")
run_sp(cmd_str, cwd='.', verbose=True, logfile=False, **kwargs)
an OS agnostic function to execute a command line with subprocess
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
cmd_str
|
`str`
|
the str to execute with |
required |
cwd
|
`str`
|
the directory to execute the command in. Default is ".". |
'.'
|
verbose
|
`bool`
|
flag to echo to stdout the |
True
|
shell
|
`bool`
|
flag to use shell=True in the |
required |
Notes
uses sp Popen to execute the command line. By default does not run in shell mode (ie. does not look for the exe in env variables)
series_to_insfile(out_file, ins_file=None)
convert a Pandas Series to an ins file Parameters
out_file : str name of the output file to convert to ins file ins_file : str name of the ins file to create. if None, then out_file+".ins" is used Returns
None
setup_fake_forward_run(pst, new_pst_name, org_cwd='.', bak_suffix='._bak', new_cwd='.')
setup a fake forward run for a pst.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
pst
|
`pyemu.Pst`
|
existing control file |
required |
new_pst_name
|
`str`
|
new control file to write |
required |
org_cwd
|
`str`
|
existing working dir. Default is "." |
'.'
|
bak_suffix
|
`str`
|
suffix to add to existing model output files when making backup copies. |
'._bak'
|
new_cwd
|
`str`
|
new working dir. Default is ".". |
'.'
|
Note
The fake forward run simply copies existing backup versions of model output files to the outfiles pest(pp) is looking for. This is really a development option for debugging PEST++ issues.
setup_gage_obs(gage_file, ins_file=None, start_datetime=None, times=None)
setup a forward run post processor routine for the modflow gage file
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
gage_file
|
`str`
|
the gage output file (ASCII) |
required |
ins_file
|
`str`
|
the name of the instruction file to create. If None, the name
is |
None
|
start_datetime
|
`str`
|
a |
None
|
times
|
[`float`]
|
a container of times to make observations for. If None, all times are used. Default is None. |
None
|
Returns:
| Type | Description |
|---|---|
|
tuple containing |
|
|
|
|
|
|
Note
Setups up observations for gage outputs (all columns).
This is the companion function of gw_utils.apply_gage_obs()
setup_hds_obs(hds_file, kperk_pairs=None, skip=None, prefix='hds', text='head', precision='single', include_path=False)
a function to setup using all values from a layer-stress period pair for observations.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
hds_file
|
`str`
|
path and name of an existing MODFLOW head-save file. If the hds_file endswith 'ucn', then the file is treated as a UcnFile type. |
required |
kperk_pairs
|
[(int, int)]
|
a list of len two tuples which are pairs of kper (zero-based stress period index) and k (zero-based layer index) to setup observations for. If None, then all layers and stress period records found in the file will be used. Caution: a shit-ton of observations may be produced! |
None
|
skip
|
variable
|
a value or function used to determine which values to skip when setting up observations. If np.scalar(skip) is True, then values equal to skip will not be used. If skip can also be a np.ndarry with dimensions equal to the model. Observations are set up only for cells with Non-zero values in the array. If not np.ndarray or np.scalar(skip), then skip will be treated as a lambda function that returns np.nan if the value should be skipped. |
None
|
prefix
|
`str`
|
the prefix to use for the observation names. default is "hds". |
'hds'
|
text
|
`str`
|
the text tag the flopy HeadFile instance. Default is "head" |
'head'
|
precision
|
`str`
|
the precision string for the flopy HeadFile instance. Default is "single" |
'single'
|
include_path
|
`bool`
|
flag to setup the binary file processing in directory where the hds_file |
False
|
Returns:
| Type | Description |
|---|---|
|
tuple containing |
|
|
|
|
Note
Writes an instruction file and a setup csv used construct a control file.
This is the companion function to gw_utils.apply_hds_obs().
setup_hds_timeseries(bin_file, kij_dict, prefix=None, include_path=False, model=None, postprocess_inact=None, text=None, fill=None, precision='single')
a function to setup a forward process to extract time-series style values from a binary modflow binary file (or equivalent format - hds, ucn, sub, cbb, etc).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
bin_file
|
`str`
|
path and name of existing modflow binary file - headsave, cell budget and MT3D UCN supported. |
required |
kij_dict
|
`dict`
|
dictionary of site_name: [k,i,j] pairs. For example: |
required |
prefix
|
`str`
|
string to prepend to site_name when forming observation names. Default is None |
None
|
include_path
|
`bool`
|
flag to setup the binary file processing in directory where the hds_file is located (if different from where python is running). This is useful for setting up the process in separate directory for where python is running. |
False
|
model
|
`flopy.mbase`
|
a |
None
|
postprocess_inact
|
`float`
|
Inactive value in heads/ucn file e.g. mt.btn.cinit. If |
None
|
text
|
`str`
|
the text record entry in the binary file (e.g. "constant_head"). Used to indicate that the binary file is a MODFLOW cell-by-cell budget file. If None, headsave or MT3D unformatted concentration file is assumed. Default is None |
None
|
fill
|
`float`
|
fill value for NaNs in the extracted timeseries dataframe. If
|
None
|
precision
|
`str`
|
the precision of the binary file. Can be "single" or "double". Default is "single". |
'single'
|
Returns:
| Type | Description |
|---|---|
|
tuple containing |
|
|
|
|
Note
This function writes hds_timeseries.config that must be in the same
dir where apply_hds_timeseries() is called during the forward run
Assumes model time units are days
This is the companion function of gw_utils.apply_hds_timeseries().
setup_mflist_budget_obs(list_filename, flx_filename='flux.dat', vol_filename='vol.dat', start_datetime="1-1'1970", prefix='', save_setup_file=False, specify_times=None)
setup observations of budget volume and flux from modflow list file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
list_filename
|
`str`
|
path and name of the existing modflow list file |
required |
flx_filename
|
`str`
|
output filename that will contain the budget flux observations. Default is "flux.dat" |
'flux.dat'
|
vol_filename
|
`str`
|
output filename that will contain the budget volume observations. Default is "vol.dat" |
'vol.dat'
|
start_datetime
|
`str`
|
a string that can be parsed into a pandas.TimeStamp. This is used to give budget observations meaningful names. Default is "1-1-1970". |
"1-1'1970"
|
prefix
|
`str`
|
a prefix to add to the water budget observations. Useful if processing more than one list file as part of the forward run process. Default is ''. |
''
|
save_setup_file
|
`bool`
|
a flag to save "setup"+ |
False
|
specify_times
|
`np.ndarray`-like
|
An array of times to
extract from the budget dataframes returned by the flopy
MfListBudget(list_filename).get_dataframe() method. This can be
useful to ensure consistent observation times for PEST.
Array needs to be alignable with index of dataframe
return by flopy method, care should be take to ensure that
this is the case. If passed will be written to
"budget_times.config" file as strings to be read by the companion
|
None
|
Returns:
| Type | Description |
|---|---|
|
pandas.DataFrame: a dataframe with information for constructing a control file. |
Note
This method writes instruction files and also a setup.csv to use when constructing a pest
control file. The instruction files are named
It is recommended to use the default values for flux_file and vol_file.
This is the companion function of gw_utils.apply_mflist_budget_obs().
setup_mtlist_budget_obs(list_filename, gw_filename='mtlist_gw.dat', sw_filename='mtlist_sw.dat', start_datetime='1-1-1970', gw_prefix='gw', sw_prefix='sw', save_setup_file=False)
setup observations of gw (and optionally sw) mass budgets from mt3dusgs list file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
list_filename
|
`str`
|
path and name of existing modflow list file |
required |
gw_filename
|
`str`
|
output filename that will contain the gw budget observations. Default is "mtlist_gw.dat" |
'mtlist_gw.dat'
|
sw_filename
|
`str`
|
output filename that will contain the sw budget observations. Default is "mtlist_sw.dat" |
'mtlist_sw.dat'
|
start_datetime
|
`str`
|
an str that can be parsed into a |
'1-1-1970'
|
gw_prefix
|
`str`
|
a prefix to add to the GW budget observations. Useful if processing more than one list file as part of the forward run process. Default is 'gw'. |
'gw'
|
sw_prefix
|
`str`
|
a prefix to add to the SW budget observations. Useful if processing more than one list file as part of the forward run process. Default is 'sw'. |
'sw'
|
save_setup_file
|
`bool`
|
a flag to save "setup"+ |
False
|
Returns:
| Type | Description |
|---|---|
|
tuple containing |
|
|
|
|
|
|
Note
writes an instruction file and also a setup.csv to use when constructing a pest control file
The instruction files are named out_filename +".ins"
It is recommended to use the default value for gw_filename or sw_filename.
This is the companion function of gw_utils.apply_mtlist_budget_obs().
setup_pilotpoints_grid(ml=None, sr=None, ibound=None, prefix_dict=None, every_n_cell=4, ninst=1, use_ibound_zones=False, pp_dir='.', tpl_dir='.', shapename='pp.shp', pp_filename_dict={})
setup a regularly-spaced (gridded) pilot point parameterization
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
ml
|
`flopy.mbase`
|
a flopy mbase derived type. If None, |
None
|
sr
|
`flopy.utils.reference.SpatialReference`
|
a spatial reference use to
locate the model grid in space. If None, |
None
|
ibound
|
`numpy.ndarray`
|
the modflow ibound integer array. THis is used to
set pilot points only in active areas. If None and ml is None, then pilot points
are set in all rows and columns according to |
None
|
prefix_dict
|
`dict`
|
a dictionary of layer index, pilot point parameter prefix(es) pairs.
For example : |
None
|
ninst
|
`int`
|
Number of instances of pilot_points to set up. e.g. number of layers. If ml is None and prefix_dict is None, this is used to set up default prefix_dict. |
1
|
use_ibound_zones
|
`bool`
|
a flag to use the greater-than-zero values in the ibound as pilot point zones. If False ,ibound values greater than zero are treated as a single zone. Default is False. |
False
|
pp_dir
|
`str`
|
directory to write pilot point files to. Default is '.' |
'.'
|
tpl_dir
|
`str`
|
directory to write pilot point template file to. Default is '.' |
'.'
|
shapename
|
`str`
|
name of shapefile to write that contains pilot point information. Default is "pp.shp" |
'pp.shp'
|
pp_filename_dict
|
`dict`
|
optional dict of prefix-pp filename pairs. prefix values must
match the values in |
{}
|
Returns:
| Type | Description |
|---|---|
|
|
|
|
written to |
Example::
m = flopy.modflow.Modflow.load("my.nam")
df = pyemu.pp_utils.setup_pilotpoints_grid(ml=m)
setup_sfr_obs(sfr_out_file, seg_group_dict=None, ins_file=None, model=None, include_path=False)
setup observations using the sfr ASCII output file. Setups the ability to aggregate flows for groups of segments. Applies only flow to aquier and flow out.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
sft_out_file
|
`str`
|
the name and path to an existing SFR output file |
required |
seg_group_dict
|
`dict`
|
a dictionary of SFR segments to aggregate together for a single obs. the key value in the dict is the base observation name. If None, all segments are used as individual observations. Default is None |
None
|
model
|
`flopy.mbase`
|
a flopy model. If passed, the observation names will have the datetime of the observation appended to them. If None, the observation names will have the stress period appended to them. Default is None. |
None
|
include_path
|
`bool`
|
flag to prepend sfr_out_file path to sfr_obs.config. Useful for setting up process in separate directory for where python is running. |
False
|
Returns:
| Type | Description |
|---|---|
|
pandas.DataFrame: dataframe of observation name, simulated value and group. |
Note
This is the companion function of gw_utils.apply_sfr_obs().
This function writes "sfr_obs.config" which must be kept in the dir where "gw_utils.apply_sfr_obs()" is being called during the forward run
setup_sfr_reach_obs(sfr_out_file, seg_reach=None, ins_file=None, model=None, include_path=False)
setup observations using the sfr ASCII output file. Setups sfr point observations using segment and reach numbers.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
sft_out_file
|
`str`
|
the path and name of an existing SFR output file |
required |
seg_reach
|
varies
|
a dict, or list of SFR [segment,reach] pairs identifying
locations of interest. If |
None
|
model
|
`flopy.mbase`
|
a flopy model. If passed, the observation names will have the datetime of the observation appended to them. If None, the observation names will have the stress period appended to them. Default is None. |
None
|
include_path
|
`bool`
|
a flag to prepend sfr_out_file path to sfr_obs.config. Useful for setting up process in separate directory for where python is running. |
False
|
Returns:
| Type | Description |
|---|---|
|
|
Note
This is the companion function of gw_utils.apply_sfr_reach_obs().
This function writes "sfr_reach_obs.config" which must be kept in the dir where "apply_sfr_reach_obs()" is being called during the forward run
setup_sfr_reach_parameters(nam_file, model_ws='.', par_cols=['strhc1'])
Setup multiplier parameters for reach data, when reachinput option is specified in sfr.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
nam_file
|
`str`
|
MODFLOw name file. DIS, BAS, and SFR must be
available as pathed in the nam_file. Optionally, |
required |
model_ws
|
`str`
|
model workspace for flopy to load the MODFLOW model from |
'.'
|
par_cols
|
[`str`]
|
a list of segment data entries to parameterize |
['strhc1']
|
tie_hcond
|
`bool`
|
flag to use same mult par for hcond1 and hcond2 for a
given segment. Default is |
required |
include_temporal_pars
|
[`str`]
|
list of spatially-global multipliers to set up for each stress period. Default is None |
required |
Returns:
| Type | Description |
|---|---|
|
pandas.DataFrame: a dataframe with useful parameter setup information |
Note
Similar to gw_utils.setup_sfr_seg_parameters(), method will apply params to sfr reachdata
Can load the dis, bas, and sfr files with flopy using model_ws. Or can pass a model object (SFR loading can be slow)
This is the companion function of gw_utils.apply_sfr_reach_parameters()
Skips values = 0.0 since multipliers don't work for these
setup_sfr_seg_parameters(nam_file, model_ws='.', par_cols=None, tie_hcond=True, include_temporal_pars=None)
Setup multiplier parameters for SFR segment data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
nam_file
|
`str`
|
MODFLOw name file. DIS, BAS, and SFR must be
available as pathed in the nam_file. Optionally, |
required |
model_ws
|
`str`
|
model workspace for flopy to load the MODFLOW model from |
'.'
|
par_cols
|
[`str`]
|
a list of segment data entries to parameterize |
None
|
tie_hcond
|
`bool`
|
flag to use same mult par for hcond1 and hcond2 for a
given segment. Default is |
True
|
include_temporal_pars
|
[`str`]
|
list of spatially-global multipliers to set up for each stress period. Default is None |
None
|
Returns:
| Type | Description |
|---|---|
|
pandas.DataFrame: a dataframe with useful parameter setup information |
Note
This function handles the standard input case, not all the cryptic SFR options. Loads the dis, bas, and sfr files with flopy using model_ws.
This is the companion function to gw_utils.apply_sfr_seg_parameters() .
The number (and numbering) of segment data entries must consistent across
all stress periods.
Writes nam_file +"backup.sfr" as the backup of the original sfr file
Skips values = 0.0 since multipliers don't work for these
setup_sft_obs(sft_file, ins_file=None, start_datetime=None, times=None, ncomp=1)
writes a post-processor and instruction file for a mt3d-usgs sft output file
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
sft_file
|
`str`
|
path and name of an existing sft output file (ASCII) |
required |
ins_file
|
`str`
|
the name of the instruction file to create.
If None, the name is |
None
|
start_datetime
|
`str`
|
a pandas.to_datetime() compatible str. If not None,
then the resulting observation names have the datetime
suffix. If None, the suffix is the output totim. Default
is |
None
|
times
|
[`float`]
|
a list of times to make observations for. If None, all times found in the file are used. Default is None. |
None
|
ncomp
|
`int`
|
number of components in transport model. Default is 1. |
1
|
Returns:
| Type | Description |
|---|---|
|
pandas.DataFrame: a dataframe with observation names and values for the sft simulated |
|
|
concentrations. |
Note
This is the companion function to gw_utils.apply_sft_obs().
setup_temporal_diff_obs(*args, **kwargs)
a helper function to setup difference-in-time observations based on an existing set of observations in an instruction file using the observation grouping in the control file
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
pst
|
`pyemu.Pst`
|
existing control file |
required |
ins_file
|
`str`
|
an existing instruction file |
required |
out_file
|
`str`
|
an existing model output file that corresponds to
the instruction file. If None, |
required |
include_zero_weight
|
`bool`
|
flag to include zero-weighted observations in the difference observation process. Default is False so that only non-zero weighted observations are used. |
required |
include_path
|
`bool`
|
flag to setup the binary file processing in directory where the hds_file is located (if different from where python is running). This is useful for setting up the process in separate directory for where python is running. |
required |
sort_by_name
|
`bool`,optional
|
flag to sort observation names in each group prior to setting up the differencing. The order of the observations matters for the differencing. If False, then the control file order is used. If observation names have a datetime suffix, make sure the format is year-month-day to use this sorting. Default is True |
required |
long_names
|
`bool`
|
flag to use long, descriptive names by concatenating the two observation names that are being differenced. This will produce names that are too long for traditional PEST(_HP). Default is True. |
required |
prefix
|
`str`
|
prefix to prepend to observation names and group names. Default is "dif". |
required |
Returns:
| Type | Description |
|---|---|
|
tuple containing |
|
|
|
|
Note
This is the companion function of helpers.apply_temporal_diff_obs().
setup_threshold_pars(orgarr_file, cat_dict, testing_workspace='.', inact_arr=None)
setup a thresholding 2-category binary array process.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
orgarr_file
|
`str`
|
the input array that will ultimately be created at runtime |
required |
cat_dict
|
`str`
|
dict of info for the two categories. Keys are (unused) category names. values are a len 2 iterable of requested proportion and fill value. |
required |
testing_workspace
|
`str`
|
directory where the apply process can be tested. |
'.'
|
inact_arr
|
`np.ndarray`
|
an array that indicates inactive nodes (inact_arr=0) |
None
|
Returns:
| Name | Type | Description |
|---|---|---|
thresharr_file |
`str`
|
thresholding array file (to be parameterized) |
csv_file |
`str`
|
the csv file that has the inputs needed for the apply process |
Note
all required files are based on the orgarr_file with suffixes appended to them
This process was inspired by Todaro and others, 2023, "Experimental sandbox tracer
tests to characterize a two-facies aquifer via an ensemble smoother"
simple_ins_from_obs(obsnames, insfilename='model.output.ins', out_dir='.')
write a simple instruction file that reads the values named in obsnames in order, one per line from a model output file
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
obsnames
|
`str`
|
list of observation names to put in the new instruction file |
required |
insfilename
|
`str`
|
the name of the instruction file to create. Default is "model.output.ins" |
'model.output.ins'
|
out_dir
|
`str`
|
Directory where the instruction file should be saved. Default is the current working directory (".") |
'.'
|
Note
writes a file insfilename with each observation read off
of a single line
simple_tpl_from_pars(parnames, tplfilename='model.input.tpl', out_dir='.')
Make a simple template file from a list of parameter names.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
parnames
|
[`str`]
|
list of parameter names to put in the new template file |
required |
tplfilename
|
`str`
|
Name of the template file to create. Default is "model.input.tpl" |
'model.input.tpl'
|
out_dir
|
`str`
|
Directory where the template file should be saved. Default is the current working directory (".") |
'.'
|
Note
Writes a file tplfilename with each parameter name in parnames on a line
smp_to_dataframe(smp_filename, datetime_format=None)
load an smp file into a pandas dataframe
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
smp_filename
|
`str`
|
path and name of existing smp filename to load |
required |
datetime_format
|
`str`
|
The format of the datetime strings in the smp file. Can be either "%m/%d/%Y %H:%M:%S" or "%d/%m/%Y %H:%M:%S" If None, then we will try to deduce the format for you, which always dangerous. |
None
|
Returns:
| Type | Description |
|---|---|
|
|
|
|
site names. Missing values are set to NaN. |
Example::
df = smp_to_dataframe("my.smp")
smp_to_ins(smp_filename, ins_filename=None, use_generic_names=False, gwutils_compliant=False, datetime_format=None, prefix='')
create an instruction file for an smp file
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
smp_filename
|
`str`
|
path and name of an existing smp file |
required |
ins_filename
|
`str`
|
the name of the instruction
file to create. If None, |
None
|
use_generic_names
|
`bool`
|
flag to force observations names
to use a generic |
False
|
gwutils_compliant
|
`bool`
|
flag to use instruction set that is compliant with the PEST gw utils (fixed format instructions). If false, use free format (with whitespace) instruction set. Default is False |
False
|
datetime_format
|
`str`
|
string to pass to datetime.strptime in
the |
None
|
prefix
|
`str`
|
a prefix to add to the front of the derived observation names. Default is '' |
''
|
Returns:
| Type | Description |
|---|---|
|
|
|
|
information with the observation names and |
|
|
instruction lines as additional columns. |
Example::
df = pyemu.smp_utils.smp_to_ins("my.smp")
start_workers(worker_dir, exe_rel_path, pst_rel_path, num_workers=None, worker_root='..', port=None, rel_path=None, local=True, cleanup=True, master_dir=None, verbose=False, silent_master=False, reuse_master=False, restart=False, ppw_function=None, ppw_kwargs={})
start a group of pest(++) workers on the local machine
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
worker_dir
|
`str`
|
the path to a complete set of input files need by PEST(++). This directory will be copied to make worker (and optionally the master) directories |
required |
exe_rel_path
|
`str`
|
the relative path to and name of the pest(++) executable from within
the |
required |
pst_rel_path
|
`str`
|
the relative path to and name of the pest control file from within
|
required |
num_workers
|
`int`
|
number of workers to start. defaults to number of cores |
None
|
worker_root
|
`str`
|
the root directory to make the new worker directories in. Default is ".." (up one directory from where python is running). |
'..'
|
rel_path
|
`str`
|
the relative path to where pest(++) should be run from within the worker_dir, defaults to the uppermost level of the worker dir. This option is usually not needed unless you are one of those crazy people who spreads files across countless subdirectories. |
None
|
local
|
`bool`
|
flag for using "localhost" instead of actual hostname/IP address on
worker command line. Default is True. |
True
|
cleanup
|
`bool`
|
flag to remove worker directories once processes exit. Default is True. Set to False for debugging issues |
True
|
master_dir
|
`str`
|
name of directory for master instance. If |
None
|
verbose
|
`bool`
|
flag to echo useful information to stdout. Default is False |
False
|
silent_master
|
`bool`
|
flag to pipe master output to devnull and instead print a simple message to stdout every few seconds. This is only for pestpp Travis testing so that log file sizes dont explode. Default is False |
False
|
reuse_master
|
`bool`
|
flag to use an existing |
False
|
restart
|
`bool`
|
flag to add a restart flag to the master start. If |
False
|
ppw_function
|
function
|
a function pointer that uses PyPestWorker to execute model runs. |
None
|
ppw_kwargs
|
dict
|
keyword arguments to pass to |
{}
|
Notes
If all workers (and optionally master) exit gracefully, then the worker
dirs will be removed unless cleanup is False
Example::
# start 10 workers using the directory "template" as the base case and
# also start a master instance in a directory "master".
pyemu.helpers.start_workers("template","pestpp-ies","pest.pst",10,master_dir="master",
worker_root=".")
try_process_output_file(ins_file, output_file=None, logger=None)
attempt to process a model output file using a PEST-style instruction file
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
ins_file
|
`str`
|
path and name of an instruction file |
required |
output_file
|
`str`,optional
|
path and name of existing model
output file to process. If |
None
|
Returns:
| Type | Description |
|---|---|
|
|
|
|
extracted from |
Note
If an exception is raised when processing the output file, the exception
is echoed to the screen and None is returned.
Example::
df = pyemu.pst_utils.try_process_output_file("my.ins","my.output")
write_array_tpl(name, tpl_filename, suffix, par_type, data_array=None, zone_array=None, gpname=None, fill_value=1.0, get_xy=None, input_filename=None, par_style='m', headerlines=None)
write a template file for a 2D array.
Args:
name (str): the base parameter name
tpl_filename (str): the template file to write - include path
suffix (str): suffix to append to par names
par_type (str): type of parameter
data_array (numpy.ndarray): original data array
zone_array (numpy.ndarray): an array used to skip inactive cells. Values less than 1 are
not parameterized and are assigned a value of fill_value. Default is None.
gpname (str): pargp filed in dataframe
fill_value:
get_xy:
input_filename:
par_style (str): either 'd','a', or 'm'
Returns:
| Name | Type | Description |
|---|---|---|
df |
`pandas.DataFrame`
|
a dataframe with parameter information |
Note
This function is called by PstFrom programmatically
write_hfb_template(m)
write a template file for an hfb (yuck!)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
m
|
`flopy.modflow.Modflow`
|
a model instance with an HFB package |
required |
Returns: tuple containing
- **str**: name of the template file that was created
- **pandas.DataFrame**: a dataframe with use control file info for the
HFB parameters
write_hfb_zone_multipliers_template(m)
write a template file for an hfb using multipliers per zone (double yuck!)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
m
|
`flopy.modflow.Modflow`
|
a model instance with an HFB package |
required |
Returns:
| Type | Description |
|---|---|
|
tuple containing |
|
|
|
|
write_list_tpl(filenames, dfs, name, tpl_filename, index_cols, par_type, use_cols=None, use_rows=None, suffix='', zone_array=None, gpname=None, get_xy=None, ij_in_idx=None, xy_in_idx=None, zero_based=True, input_filename=None, par_style='m', headerlines=None, fill_value=1.0, logger=None)
Write template files for a list style input.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
filenames
|
`str` of `container` of `str`
|
original input filenames |
required |
dfs
|
`pandas.DataFrame` or `container` of pandas.DataFrames
|
pandas representations of input file. |
required |
name
|
`str` or container of str
|
parameter name prefixes. If more that one column to be parameterised, must be a container of strings providing the prefix for the parameters in the different columns. |
required |
tpl_filename
|
`str`
|
Path (from current execution directory) for desired template file |
required |
index_cols
|
`list`
|
column names to use as indices in tabular input dataframe |
required |
par_type
|
`str`
|
'constant','zone', or 'grid' used in parname
generation. If |
required |
use_cols
|
`list`
|
Columns in tabular input file to paramerterise. If None, pars are set up for all columns apart from index cols. |
None
|
use_rows
|
`list` of `int` or `tuple`
|
Setup parameters for only
specific rows in list-style model input file.
If list of |
None
|
suffix
|
`str`
|
Optional par name suffix |
''
|
zone_array
|
`np.ndarray`
|
Array defining zone divisions.
If not None and |
None
|
get_xy
|
`pyemu.PstFrom` method
|
Can be specified to get real-world xy
from |
None
|
ij_in_idx
|
`list` or `array`
|
defining which |
None
|
xy_in_idx
|
`list` or `array`
|
defining which |
None
|
zero_based
|
`boolean`
|
IMPORTANT - pass as False if |
True
|
input_filename
|
`str`
|
Path to input file (paired with tpl file) |
None
|
par_style
|
`str`
|
either 'd','a', or 'm' |
'm'
|
headerlines
|
[`str`]
|
optional header lines in the original model file, used for direct style parameters |
None
|
Returns:
pandas.DataFrame: dataframe with info for the new parameters
Note
This function is called by PstFrom programmatically
write_pp_file(filename, pp_df)
write a pilot points dataframe to a pilot points file
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
filename
|
`str`
|
pilot points file to write |
required |
pp_df
|
`pandas.DataFrame`
|
a dataframe that has at least columns "x","y","zone", and "value" |
required |
write_pp_shapfile(pp_df, shapename=None)
write pilot points dataframe to a shapefile
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
pp_df
|
`pandas.DataFrame`
|
pilot point dataframe (must include "x" and "y"
columns). If |
required |
shapename
|
`str`
|
the shapefile name to write. If |
None
|
Notes
requires pyshp
zero_order_tikhonov(pst, parbounds=True, par_groups=None, reset=True)
setup preferred-value regularization in a pest control file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
pst
|
`pyemu.Pst`
|
the control file instance |
required |
parbounds
|
`bool`
|
flag to weight the new prior information equations according to parameter bound width - approx the KL transform. Default is True |
True
|
par_groups
|
`list`
|
a list of parameter groups to build PI equations for. If None, all adjustable parameters are used. Default is None |
None
|
reset
|
`bool`
|
a flag to remove any existing prior information equations in the control file. Default is True |
True
|
Note
Operates in place.
Example::
pst = pyemu.Pst("my.pst")
pyemu.helpers.zero_order_tikhonov(pst)
pst.write("my_reg.pst")