pstfromflopy

PstFromFlopyModel

Bases: object

a monster helper class to set up a complex PEST interface around an existing MODFLOW-2005-family model.

Parameters:

Name Type Description Default
model `flopy.mbase`

a loaded flopy model instance. If model is a str, it is treated as a MODFLOW nam file (requires org_model_ws)

required
new_model_ws `str`

a directory where the new version of MODFLOW input files and PEST(++) files will be written

required
org_model_ws `str`

directory of the existing MODFLOW model files. Required if the model argument is a str. Default is None

None
pp_props [[`str`,[`int`]]]

pilot point multiplier parameters for grid-based properties. A nested list of grid-scale model properties to parameterize using name, iterable pairs. For 3D properties, the iterable is zero-based layer indices. For example, ["lpf.hk",[0,1,2,]] would set up pilot point multiplier parameters for layer property file horizontal hydraulic conductivity for model layers 1, 2, and 3. For time-varying properties (e.g. recharge), the iterable is zero-based stress period indices. For example, ["rch.rech",[0,4,10,15]] would set up pilot point multiplier parameters for recharge for stress periods 1, 5, 11, and 16.

[]
const_props [[`str`,[`int`]]]

constant (uniform) multiplier parameters for grid-based properties. A nested list of grid-scale model properties to parameterize using name, iterable pairs. For 3D properties, the iterable is zero-based layer indices. For example, ["lpf.hk",[0,1,2,]] would set up constant (uniform) multiplier parameters for layer property file horizontal hydraulic conductivity for model layers 1, 2, and 3. For time-varying properties (e.g. recharge), the iterable is zero-based stress period indices. For example, ["rch.rech",[0,4,10,15]] would set up constant (uniform) multiplier parameters for recharge for stress periods 1, 5, 11, and 16.

[]
temporal_list_props [[`str`,[`int`]]]

list-type input stress-period level multiplier parameters. A nested list of list-type input elements to parameterize using name, iterable pairs. The iterable is zero-based stress-period indices. For example, to set up multipliers for WEL flux and for RIV conductance, temporal_list_props = [["wel.flux",[0,1,2]],["riv.cond",None]] would set up multiplier parameters for well flux for stress periods 1, 2, and 3 and would set up a single river conductance multiplier parameter that is applied to all stress periods.

[]
spatial_list_props [[`str`,[`int`]]]

list-type input for spatial multiplier parameters. A nested list of list-type elements to parameterize using names (e.g. [["riv.cond",0],["wel.flux",1]]) to set up cell-based parameters for each list-type element listed. These multiplier parameters are applied across all stress periods. For this to work, there must be the same number of entries for all stress periods. If more than one list element of the same type is in a single cell, only one parameter is used to multiply all lists in the same cell.

[]
grid_props [[`str`,[`int`]]]

grid-based (every active model cell) multiplier parameters. A nested list of grid-scale model properties to parameterize using name, iterable pairs. For 3D properties, the iterable is zero-based layer indices (e.g., ["lpf.hk",[0,1,2,]] would set up a multiplier parameter for layer property file horizontal hydraulic conductivity for model layers 1, 2, and 3 in every active model cell). For time-varying properties (e.g. recharge), the iterable is zero-based stress period indices. For example, ["rch.rech",[0,4,10,15]] would set up grid-based multiplier parameters in every active model cell for recharge for stress periods 1, 5, 11, and 16.

[]
sfr_pars `bool`

set up parameters for the streamflow routing (SFR) MODFLOW package. If a list is passed, it defines the parameters to set up.

False
grid_geostruct `pyemu.geostats.GeoStruct`

the geostatistical structure to build the prior parameter covariance matrix elements for grid-based parameters. If None, a generic GeoStruct is created using an "a" parameter that is 10 times the max cell size. Default is None

None
pp_space `int`

number of grid cells between pilot points. If None, use the default in pyemu.pp_utils.setup_pilot_points_grid. Default is None

None
zone_props [[`str`,[`int`]]]

zone-based multiplier parameters. A nested list of zone-based model properties to parameterize using name, iterable pairs. For 3D properties, the iterable is zero-based layer indices (e.g., ["lpf.hk",[0,1,2,]] would set up a multiplier parameter for layer property file horizontal hydraulic conductivity for model layers 1, 2, and 3 for unique zone values in the ibound array). For time-varying properties (e.g. recharge), the iterable is zero-based stress period indices. For example, ["rch.rech",[0,4,10,15]] would set up zone-based multiplier parameters for recharge for stress periods 1, 5, 11, and 16.

[]
pp_geostruct `pyemu.geostats.GeoStruct`

the geostatistical structure to use for building the prior parameter covariance matrix for pilot point parameters. If None, a generic GeoStruct is created using pp_space and grid-spacing information. Default is None

None
par_bounds_dict `dict`

a dictionary of model property/boundary condition name, lower-upper bound pairs. For example, par_bounds_dict = {"hk":[0.01,100.0],"flux":[0.5,2.0]} would set the bounds for horizontal hydraulic conductivity to 0.01 and 100.0 and set the bounds for flux parameters to 0.5 and 2.0. For parameters not found in par_bounds_dict, pyemu.helpers.wildass_guess_par_bounds_dict is used to set somewhat meaningful bounds. Default is None

None
temporal_list_geostruct `pyemu.geostats.GeoStruct`

the geostatistical structure to build the prior parameter covariance matrix for time-varying list-type multiplier parameters. This GeoStruct expresses the time correlation, so the 'a' parameter is the length of time over which boundary condition multiplier parameters are correlated. If None, then a generic GeoStruct is created that uses an 'a' parameter of 3 stress periods. Default is None

None
spatial_list_geostruct `pyemu.geostats.GeoStruct`

the geostatistical structure to build the prior parameter covariance matrix for spatially-varying list-type multiplier parameters. If None, a generic GeoStruct is created using an "a" parameter that is 10 times the max cell size. Default is None.

None
remove_existing `bool`

a flag to remove an existing new_model_ws directory. If False and new_model_ws exists, an exception is raised. If True and new_model_ws exists, the directory is destroyed - user beware! Default is False.

False
k_zone_dict `dict`

a dictionary of zero-based layer index, zone array pairs, e.g. {lay: np.2darray}. Used to override using ibound zones for zone-based parameterization. If None, use ibound values greater than zero as zones. Alternatively, a dictionary of dictionaries can be passed to allow different zones to be defined for different parameters, e.g. {"upw.hk": {lay: np.2darray}, "extra.rc11": {lay: np.2darray}} or {"hk": {lay: np.2darray}, "rc11": {lay: np.2darray}}

None
use_pp_zones `bool`

a flag to use ibound zones (or k_zone_dict, see above) as pilot point zones. If False, ibound values greater than zero are treated as a single zone for pilot points. Default is False

False
obssim_smp_pairs [[`str`,`str`]]

a list of observed-simulated PEST-type SMP file pairs to get observations from and include in the control file. Default is []

required
external_tpl_in_pairs [[`str`,`str`]]

a list of existing template file, model input file pairs to parse parameters from and include in the control file. Default is []

required
external_ins_out_pairs [[`str`,`str`]]

a list of existing instruction file, model output file pairs to parse observations from and include in the control file. Default is []

required
extra_pre_cmds [`str`]

a list of preprocessing commands to add to the forward_run.py script. Commands are executed with os.system() within forward_run.py. Default is None.

None
redirect_forward_output `bool`

flag for whether to redirect forward model output to text files (True) or allow model output to be directed to the screen (False). Default is True

True
extra_post_cmds [`str`]

a list of post-processing commands to add to the forward_run.py script. Commands are executed with os.system() within forward_run.py. Default is None.

None
tmp_files [`str`]

a list of temporary files that should be removed at the start of the forward run script. Default is [].

None
model_exe_name `str`

binary name to run modflow. If None, a default from flopy is used, which is dangerous because of the non-standard binary names (e.g. MODFLOW-NWT_x64, MODFLOWNWT, mfnwt, etc). Default is None.

None
build_prior `bool`

flag to build prior covariance matrix. Default is True

True
sfr_obs `bool`

flag to include observations of flow and aquifer exchange from the sfr ASCII output file

False
hfb_pars `bool`

add HFB parameters. Uses pyemu.gw_utils.write_hfb_template(). The resulting HFB parameters have parval1 equal to the values in the original file and use the spatial_list_geostruct to build geostatistical covariates between parameters

False
kl_props [[`str`,[`int`]]]

Karhunen-Loeve (KL) based multiplier parameters. A nested list of KL-based model properties to parameterize using name, iterable pairs. For 3D properties, the iterable is zero-based layer indices (e.g., ["lpf.hk",[0,1,2,]] would set up a multiplier parameter for layer property file horizontal hydraulic conductivity for model layers 1, 2, and 3 for unique zone values in the ibound array). For time-varying properties (e.g. recharge), the iterable is zero-based stress period indices. For example, ["rch.rech",[0,4,10,15]] would set up KL-based multiplier parameters for recharge for stress periods 1, 5, 11, and 16.

None
kl_num_eig `int`

the number of KL-based eigenvector multiplier parameters to use for each KL parameter set. default is 100

100
kl_geostruct `pyemu.geostats.GeoStruct`

the geostatistical structure to build the prior parameter covariance matrix elements for KL-based parameters. If None, a generic GeoStruct is created using an "a" parameter that is 10 times the max cell size. Default is None

None
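
The name, iterable pairs accepted by pp_props, const_props, grid_props, zone_props, kl_props, and temporal_list_props all use zero-based indices, which are easy to misread against MODFLOW's one-based layer and stress period numbers. The sketch below is plain Python and not part of pyemu; `describe_prop_entry` is a hypothetical helper that only spells out which package, attribute, and one-based layers or stress periods an entry refers to.

```python
def describe_prop_entry(entry):
    """Expand a ["pak.attr", [idx, ...]] entry into a readable summary.

    Hypothetical helper for illustration only; not part of pyemu.
    """
    name, indices = entry
    package, attribute = name.split(".", 1)
    # Indices are zero-based in the argument; MODFLOW numbering is one-based.
    # None is passed through unchanged (for temporal_list_props it means a
    # single parameter applied to all stress periods).
    one_based = None if indices is None else [i + 1 for i in indices]
    return {"package": package, "attribute": attribute, "one_based": one_based}


pp_props = [["lpf.hk", [0, 1, 2]], ["rch.rech", [0, 4, 10, 15]]]
for entry in pp_props:
    print(describe_prop_entry(entry))
```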

Note:

Sets up multiplier parameters for an existing MODFLOW model.

Does all kinds of coolness like building a
meaningful prior, assigning somewhat meaningful parameter groups and
bounds, and writing a forward_run.py script with all the calls needed to
implement multiplier parameters, run MODFLOW and post-process.

While this class does work, the newer `PstFrom` class is a more pythonic
implementation.
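
A minimal sketch of how the constructor arguments described above might be combined. The class itself is passed in as `pst_from_flopy_cls` rather than imported, because the import path depends on the installed pyemu version and is not stated here. The function name `build_interface`, the file and directory names, and the executable name "mfnwt" are placeholders, not values taken from any real setup.

```python
def build_interface(pst_from_flopy_cls, nam_file, org_model_ws, new_model_ws):
    """Construct the helper with a small multiplier parameterization.

    pst_from_flopy_cls is expected to be pyemu's PstFromFlopyModel class.
    All argument values below are illustrative placeholders.
    """
    kwargs = dict(
        # Needed because the model is given as a nam file name (a str).
        org_model_ws=org_model_ws,
        # Pilot point multipliers for HK in layers 1, 2, and 3.
        pp_props=[["lpf.hk", [0, 1, 2]]],
        # Uniform recharge multipliers for stress periods 1, 5, 11, and 16.
        const_props=[["rch.rech", [0, 4, 10, 15]]],
        # Per-period well flux multipliers; one river conductance multiplier.
        temporal_list_props=[["wel.flux", [0, 1, 2]], ["riv.cond", None]],
        par_bounds_dict={"hk": [0.01, 100.0], "flux": [0.5, 2.0]},
        # Keep the safe default: raise if new_model_ws already exists.
        remove_existing=False,
        # Name the binary explicitly instead of relying on the flopy default.
        model_exe_name="mfnwt",
    )
    return pst_from_flopy_cls(nam_file, new_model_ws, **kwargs)
```

With pyemu and flopy installed, a call would look like `build_interface(PstFromFlopyModel, "model.nam", "original_ws", "template_ws")`, where all three strings are hypothetical.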

build_prior(fmt='ascii', filename=None, droptol=None, chunk=None, sigma_range=6)

build and optionally save the prior parameter covariance matrix.

Parameters:

Name Type Description Default
fmt `str`

the format to save the cov matrix. Options are "ascii","binary","uncfile", "coo". Default is "ascii". If "none" (lower case string, not None), then no file is created.

'ascii'
filename `str`

the filename to save the prior cov matrix to. If None, the name is formed using model nam_file name. Default is None.

None
droptol `float`

tolerance for dropping near-zero values when writing compressed binary. Default is None.

None
chunk `int`

chunk size to write in a single pass - for binary only. Default is None (no chunking).

None
sigma_range `float`

number of standard deviations represented by the parameter bounds. Default is 6.

6

Returns:

Type Description

pyemu.Cov: the full prior parameter covariance matrix, generated by processing parameters by groups
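
To make sigma_range concrete, the sketch below computes the standard deviation implied by a pair of bounds, assuming the bounds span sigma_range standard deviations and that log-transformed parameters are handled in log10 space. It is plain Python, not pyemu code; `implied_std` is a hypothetical helper, and treating "hk" as log-transformed and "flux" as untransformed is an assumption made only for illustration.

```python
import math


def implied_std(lower, upper, sigma_range=6.0, log_transform=True):
    """Standard deviation implied by bounds that span sigma_range sigmas.

    Hypothetical helper; not part of pyemu.
    """
    if log_transform:
        # Work in log10 space for log-transformed parameters.
        lower, upper = math.log10(lower), math.log10(upper)
    return (upper - lower) / sigma_range


par_bounds_dict = {"hk": [0.01, 100.0], "flux": [0.5, 2.0]}
hk_std = implied_std(*par_bounds_dict["hk"])  # log10 units
flux_std = implied_std(*par_bounds_dict["flux"], log_transform=False)
print(hk_std, flux_std)
```

A smaller sigma_range implies a larger standard deviation for the same bounds, and so a wider prior.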

build_pst(filename=None)

build the pest control file using the parameters and observations.

Parameters:

Name Type Description Default
filename `str`

the filename to save the control file to. If None, the name is formed from the model namfile name. Default is None. The control file is saved in the PstFromFlopy.m.model_ws directory.

None

Note:

calls pyemu.Pst.from_io_files

calls PESTCHEK

draw(num_reals=100, sigma_range=6, use_specsim=False, scale_offset=True)

draw from the geostatistically-implied parameter covariance matrix

Parameters:

Name Type Description Default
num_reals `int`

number of realizations to generate. Default is 100

100
sigma_range `float`

number of standard deviations represented by the parameter bounds. Default is 6.

6
use_specsim `bool`

flag to use spectral simulation for grid-based parameters. Requires a regular grid but is wicked fast. Default is False

False
scale_offset `bool`

flag to apply scale and offset to parameter bounds when calculating variances - this is passed through to pyemu.Cov.from_parameter_data. Default is True.

True
Note

operates on parameters by groups to avoid having to construct a very large covariance matrix for problems with more than 30K parameters.

uses helpers.geostatistical_draws()

Returns:

Type Description

pyemu.ParameterEnsemble: The realized parameter ensemble

write_forward_run()

write the forward run script forward_run.py

Note

This method can be called repeatedly, especially after any changes to the pre- and/or post-processing routines.

apply_list_pars()

a function to apply boundary condition multiplier parameters.

Note

Used to implement the parameterization constructed by PstFromFlopyModel during a forward run

Requires either "temporal_list_pars.csv" or "spatial_list_pars.csv"

Should be added to the forward_run.py script (called programmatically by the PstFrom forward run script)
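
As a conceptual illustration of what a stress-period multiplier does to list-type input, the sketch below scales each stress period's values by that period's factor. It is not pyemu's implementation, which reads "temporal_list_pars.csv" or "spatial_list_pars.csv". The function name, the dictionary layout, and the choice to leave periods without a multiplier unchanged are all assumptions.

```python
def apply_temporal_multipliers(list_data, multipliers):
    """Scale each stress period's values by that period's multiplier.

    list_data: {stress_period: [value, ...]} with zero-based keys.
    multipliers: {stress_period: factor}; periods without an entry are
    left unchanged (factor 1.0), which is an assumption of this sketch.
    """
    return {
        kper: [value * multipliers.get(kper, 1.0) for value in values]
        for kper, values in list_data.items()
    }


# Hypothetical well fluxes for three stress periods, two wells each.
well_flux = {0: [-100.0, -50.0], 1: [-120.0, -60.0], 2: [-80.0, -40.0]}
scaled = apply_temporal_multipliers(well_flux, {0: 1.5, 1: 0.5})
print(scaled)
```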

apply_temporal_diff_obs(config_file)

process an instruction-output file pair and formulate difference observations.

Parameters:

Name Type Description Default
config_file `str`

configuration file written by pyemu.helpers.setup_temporal_diff_obs.

required

Returns:

Type Description

diff_df (pandas.DataFrame) : processed difference observations

Note

Writes config_file.replace(".config",".processed") output file that can be read with the instruction file that is created by pyemu.helpers.setup_temporal_diff_obs().

This is the companion function of helpers.setup_temporal_diff_obs().

setup_temporal_diff_obs(pst, ins_file, out_file=None, include_zero_weight=False, include_path=False, sort_by_name=True, long_names=True, prefix='dif')

a helper function to set up difference-in-time observations based on an existing set of observations in an instruction file using the observation grouping in the control file

Parameters:

Name Type Description Default
pst `pyemu.Pst`

existing control file

required
ins_file `str`

an existing instruction file

required
out_file `str`

an existing model output file that corresponds to the instruction file. If None, ins_file.replace(".ins","") is used

None
include_zero_weight `bool`

flag to include zero-weighted observations in the difference observation process. Default is False so that only non-zero weighted observations are used.

False
include_path `bool`

flag to set up the binary file processing in the directory where the hds_file is located (if different from where python is running). This is useful for setting up the process in a directory separate from where python is running.

False
sort_by_name `bool`,optional

flag to sort observation names in each group prior to setting up the differencing. The order of the observations matters for the differencing. If False, then the control file order is used. If observation names have a datetime suffix, make sure the format is year-month-day to use this sorting. Default is True

True
long_names `bool`

flag to use long, descriptive names by concatenating the two observation names that are being differenced. This will produce names that are too long for traditional PEST(_HP). Default is True.

True
prefix `str`

prefix to prepend to observation names and group names. Default is "dif".

'dif'

Returns:

Type Description

tuple containing

  • str: the forward run command to execute the binary file process during model runs.
  • pandas.DataFrame: a dataframe of observation information for use in the pest control file
Note

This is the companion function of helpers.apply_temporal_diff_obs().
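
The sort_by_name advice about year-month-day suffixes can be checked with plain string sorting: only that format sorts lexically into chronological order. The observation names below are made up, and pairing consecutive sorted names in `consecutive_pairs` is an assumption about how the differencing is formed, not a statement of pyemu's implementation.

```python
# Hypothetical observation names with year-month-day date suffixes.
iso_names = ["hd_2021-01-15", "hd_2020-12-01", "hd_2020-02-10"]
# The same dates written day-month-year.
dmy_names = ["hd_15-01-2021", "hd_01-12-2020", "hd_10-02-2020"]

# Lexical order equals chronological order only for year-month-day.
print(sorted(iso_names))
print(sorted(dmy_names))


def consecutive_pairs(names):
    """Pair each sorted name with the next one (assumed differencing order)."""
    ordered = sorted(names)
    return list(zip(ordered[:-1], ordered[1:]))


for first, second in consecutive_pairs(iso_names):
    print(first, second)
```

With the day-month-year names, the December 2020 observation sorts ahead of the February 2020 one, so the resulting differences would no longer follow time order.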

write_const_tpl(name, tpl_file, suffix, zn_array=None, shape=None, longnames=False)

write a constant (uniform) template file for a 2-D array

Parameters:

Name Type Description Default
name `str`

the base parameter name

required
tpl_file `str`

the template file to write

required
zn_array `numpy.ndarray`

an array used to skip inactive cells, and optionally get shape info.

None
shape `tuple`

a length-two tuple of nrow and ncol. Either zn_array or shape must be passed

None
longnames `bool`

flag to use longer names that exceed 12 chars in length. Default is False.

False

Returns:

Type Description

pandas.DataFrame: a dataframe with parameter information

Note

This function is used during the PstFrom setup process

write_grid_tpl(name, tpl_file, suffix, zn_array=None, shape=None, spatial_reference=None, longnames=False)

write a grid-based template file for a 2-D array

Parameters:

Name Type Description Default
name `str`

the base parameter name

required
tpl_file `str`

the template file to write - include path

required
zn_array `numpy.ndarray`

zone array to identify inactive cells. Default is None

None
shape `tuple`

a length-two tuple of nrow and ncol. Either zn_array or shape must be passed.

None
spatial_reference `flopy.utils.SpatialReference`

a spatial reference instance. If longnames is True, then spatial_reference is used to add spatial info to the parameter names.

None
longnames `bool`

flag to use longer names that exceed 12 chars in length. Default is False.

False

Returns:

Type Description

pandas.DataFrame: a dataframe with parameter information

Note

This function is used during the PstFrom setup process

Example::

pyemu.helpers.write_grid_tpl("hk_layer1","hk_Layer_1.ref.tpl","gr",
                             zn_array=ib_layer_1,shape=(500,500))

write_zone_tpl(name, tpl_file, suffix='', zn_array=None, shape=None, longnames=False, fill_value='1.0')

write a zone-based template file for a 2-D array

Parameters:

Name Type Description Default
name `str`

the base parameter name

required
tpl_file `str`

the template file to write

required
suffix `str`

suffix to add to parameter names. Only used if longnames=True

''
zn_array `numpy.ndarray`

an array used to skip inactive cells, and optionally get shape info. zn_array values less than 1 are given fill_value

None
shape `tuple`

a length-two tuple of nrow and ncol. Either zn_array or shape must be passed

None
longnames `bool`

flag to use longer names that exceed 12 chars in length. Default is False.

False
fill_value `str`

value to fill locations where zn_array is zero or less. Default is "1.0".

'1.0'

Returns:

Type Description

pandas.DataFrame: a dataframe with parameter information

Note

This function is used during the PstFrom setup process
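
To show the idea behind a zone-based template, the sketch below builds the lines of a PEST template file from a small zone array: one parameter marker per zone value, with fill_value written where the zone value is less than 1. It is an illustration only. The function `zone_template_lines`, the parameter naming pattern, and the spacing are assumptions and do not reproduce the exact file that write_zone_tpl writes.

```python
def zone_template_lines(name, zn_array, fill_value="1.0"):
    """Return the lines of a PEST template file for a 2-D zone array.

    Illustrative only: parameter names and column spacing are assumptions.
    """
    # A PEST template file starts with "ptf" and the marker character.
    lines = ["ptf ~"]
    for row in zn_array:
        cells = []
        for zone in row:
            if zone < 1:
                # Cells with zone values below 1 get the fixed fill value.
                cells.append(fill_value)
            else:
                # Cells sharing a zone value share one parameter.
                cells.append("~  {0}_zn{1}  ~".format(name, zone))
        lines.append(" ".join(cells))
    return lines


zn_array = [[1, 1, 0], [2, 2, 0]]
for line in zone_template_lines("hk", zn_array):
    print(line)
```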