pst_from
PstFrom
Bases: object
construct high-dimensional PEST(++) interfaces with all the bells and whistles
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| original_d | `str` or Path | the path to a complete set of model input and output files | required |
| new_d | `str` or Path | the path to where the model files and PEST interface files will be copied/built | required |
| longnames | `bool` | flag to use longer-than-PEST-likes parameter and observation names. Default is True | True |
| remove_existing | `bool` | flag to destroy any existing files and folders in | False |
| spatial_reference | varies | an object that facilitates geo-locating model cells based on index. Default is None | None |
| zero_based | `bool` | flag if the model uses zero-based indices. Default is True | True |
| start_datetime | `str` or Timestamp | a string that can be cast to a datetime instance that represents the starting datetime of the model | None |
| tpl_subfolder | `str` | option to write template files to a subfolder within | None |
| chunk_len | `int` | the size of each "chunk" of files to spawn a | 50 |
| echo | `bool` | flag to echo logger messages to the screen. Default is True | True |
| pp_solve_num_threads | `int` | number of threads to use for the pyemu very-slow kriging solve for pilot-point type parameters. Default is 10. | 10 |
Note
This is the way...
Example::
    pf = PstFrom("path_to_model_files","new_dir_with_pest_stuff",start_datetime="1-1-2020")
    # par_type is a required argument of add_parameters
    pf.add_parameters("hk.dat",par_type="grid")
    pf.add_observations("heads.csv")
    pf.build_pst("pest.pst")
    pe = pf.draw(100)
    pe.to_csv("prior.csv")
parfile_relations
property
build up a container of parameter file information. Called programmatically...
add_observations(filename, insfile=None, index_cols=None, use_cols=None, use_rows=None, prefix='', ofile_skip=None, ofile_sep=None, rebuild_pst=False, obsgp=None, zone_array=None, includes_header=True)
Add values in output files as observations to PstFrom object
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| filename | `str` | model output file name(s) to set up as observations. By default filename should give relative location from top level of pest template directory ( | required |
| insfile | `str` | desired instructions file filename | None |
| index_cols | `list`-like or `int` | columns that serve as indices for obs | None |
| use_cols | `list`-like or `int` | columns to set up as obs. If None, and | None |
| use_rows | `list`-like or `int` | select only specific rows of file for obs | None |
| prefix | `str` | prefix for observation names (obsnme) | '' |
| ofile_skip | `int` | number of lines to skip in model output file | None |
| ofile_sep | `str` | delimiter in output file. If | None |
| rebuild_pst | `bool` | (Re)Construct PstFrom.pst object after adding new obs | False |
| obsgp | `str` or `list`-like | observation group name(s). If type | None |
| zone_array | `np.ndarray` | array defining spatial limits or zones for array-style observations. Default is None | None |
| includes_header | `bool` | flag indicating that the list-style file includes a header row. Default is True. | True |
Returns:
| Type | Description |
|---|---|
|
|
Note
This is the main entry point for adding observations to the pest interface.
If index_cols and use_cols are both None, then it is assumed that
array-style observations are being requested. In this case,
filename must be a single filename.
zone_array is only used for array-style observations. Zone values
less than or equal to zero are skipped (using the "dum" option).
Example::
    # set up observations for the 2nd through 5th columns of the csv file
    # using the first column as the index
    df = pf.add_observations("heads.csv",index_cols=0,use_cols=[1,2,3,4],
                             ofile_sep=",")
    # add array-style observations, skipping model cells with an ibound
    # value less than or equal to zero
    df = pf.add_observations("conce_array.dat",index_cols=None,use_cols=None,
                             zone_array=ibound)
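The zone rule described in the note can be sketched in plain Python. This is only an illustration of "zone values less than or equal to zero are skipped"; the helper `array_obs_names` and the naming pattern it produces are hypothetical and are not taken from pyemu.

```python
def array_obs_names(zone_array, prefix="obs"):
    """Return one name per cell; None marks a cell that is skipped.

    The name pattern is hypothetical and chosen only for illustration.
    """
    names = []
    for i, row in enumerate(zone_array):
        name_row = []
        for j, zone in enumerate(row):
            if zone <= 0:
                # zone values <= 0 are skipped, as the note describes
                name_row.append(None)
            else:
                name_row.append(f"{prefix}_i:{i}_j:{j}_zone:{zone}")
        names.append(name_row)
    return names


# a small ibound-like array given as nested lists
ibound = [[1, 1, 0],
          [2, -1, 1]]
names = array_obs_names(ibound, prefix="conc")
active = [n for row in names for n in row if n is not None]
print(len(active))  # 4 of the 6 cells become observations
```

Cells that are skipped keep their position in the grid, so the instruction file can still step over them when reading the output array.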
add_observations_from_ins(ins_file, out_file=None, pst_path=None, inschek=True)
add new observations to a control file from an existing instruction file
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| ins_file | `str` | instruction file with exclusively new observation names. N.B. if | required |
| out_file | `str` | model output file. If None, then ins_file.replace(".ins","") is used. Default is None. If | None |
| pst_path | `str` | the path to prepend to the instruction file and out file in the control file. If not None, then any existing path in front of the template or ins file is split off and pst_path is prepended. If python is being run in a directory other than where the control file will reside, it is useful to pass | None |
| inschek | `bool` | flag to try to process the existing output file using the | True |
Returns:
| Type | Description |
|---|---|
|
|
Note
populates the new observation information with default values
Example::
    import os
    import pyemu

    pf = pyemu.PstFrom("temp","template")
    pf.add_observations_from_ins(os.path.join("template","new_obs.dat.ins"),
                                 pst_path=".")
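The two path rules in the parameter descriptions (the default `out_file` and the effect of `pst_path`) can be sketched with the standard library. The helpers `default_out_file` and `with_pst_path` are hypothetical names used for illustration; they mirror the documented behaviour and are not pyemu functions.

```python
import os


def default_out_file(ins_file):
    # documented default: ins_file.replace(".ins", "")
    return ins_file.replace(".ins", "")


def with_pst_path(filename, pst_path):
    # documented rule: split off any existing leading path,
    # then prepend pst_path (nothing changes when pst_path is None)
    if pst_path is None:
        return filename
    return os.path.join(pst_path, os.path.split(filename)[-1])


ins_file = os.path.join("template", "new_obs.dat.ins")
out_file = default_out_file(ins_file)
print(with_pst_path(ins_file, "."), with_pst_path(out_file, "."))
```

With `pst_path="."`, both entries end up relative to the directory that holds the control file, whatever directory Python was run from.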
add_parameters(filenames, par_type, zone_array=None, dist_type='gaussian', sigma_range=4.0, upper_bound=None, lower_bound=None, transform=None, par_name_base='p', index_cols=None, use_cols=None, use_rows=None, pargp=None, pp_space=None, use_pp_zones=None, num_eig_kl=100, spatial_reference=None, geostruct=None, datetime=None, mfile_fmt='free', mfile_skip=None, mfile_sep=None, ult_ubound=None, ult_lbound=None, rebuild_pst=False, alt_inst_str='inst', comment_char=None, par_style='multiplier', initial_value=None, pp_options=None, apply_order=999, apply_function=None)
Add list or array style model input files to PstFrom object. This method is the main entry point for adding parameters to the pest interface
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| filenames | `str` | Model input filenames to parameterize. By default filename should give relative location from top level of pest template directory ( | required |
| par_type | `str` | One of | required |
| zone_array | `np.ndarray` | array defining spatial limits or zones for parameterization. | None |
| dist_type | | not yet implemented # TODO | 'gaussian' |
| sigma_range | | not yet implemented # TODO | 4.0 |
| upper_bound | `float` | PEST parameter upper bound. If | None |
| lower_bound | `float` | PEST parameter lower bound. If | None |
| transform | `str` | PEST parameter transformation. Must be either "log", "none" or "fixed". The "tied" transform must be used after calling | None |
| par_name_base | `str` or `list`-like | basename for parameters that are set up. If parameter file is tabular list-style file ( | 'p' |
| index_cols | `list`-like | if not None, will attempt to parameterize expecting a tabular-style model input file. | None |
| use_cols | `list`-like or `int` | for tabular-style model input file, defines the columns to be parameterised | None |
| use_rows | `list` or `tuple` | Setup parameters for only specific rows in list-style model input file. Action is dependent on the dimensions of use_rows. If ndim(use_rows) < 2: use_rows is assumed to represent the row number, index slicer (equiv df.iloc), for all passed files (after headers stripped). So use_rows=[0,3,5] will parameterise the 1st, 4th and 6th rows of each passed list-like file. If ndim(use_rows) = 2: use_rows represent the index value to parameterise according to index_cols. e.g. [(3,5,6)] or [[3,5,6]] would attempt to set parameters where the model file values for 3 | None |
| pargp | `str` | Parameter group to assign pars to. This is PEST's pargp but is also used to gather correlated parameters set up using multiple | None |
| pp_space | `float`, `int`, `str` or `pd.DataFrame` | Spatial pilot point information. DEPRECATED: use pp_options['pp_space'] instead. | None |
| use_pp_zones | `bool` | a flag to use the greater-than-zero values DEPRECATED: use pp_options['use_pp_zones'] instead. | None |
| num_eig_kl | | TODO - implement with KL pars | 100 |
| spatial_reference | `pyemu.helpers.SpatialReference` | If different spatial reference required for pilotpoint setup. If None spatial reference passed to | None |
| geostruct | `pyemu.geostats.GeoStruct()` | For specifying correlation geostruct for pilot-points and par covariance. | None |
| datetime | `str` | optional %Y%m%d string or datetime object for setting up temporally correlated pars. When datetime is passed, the correlation axis for pars will be set to timedelta. | None |
| mfile_fmt | `str` | format of model input file - this will be preserved | 'free' |
| mfile_skip | `int` or `str` | header in model input file to skip when reading and reapply when writing. Can optionally be | None |
| mfile_sep | `str` | separator/delimiter in model input file. If None, separator will be interpreted from file name extension. | None |
| ult_ubound | `float` | Ultimate upper bound for model input parameter once all mults are applied - ensure physical model par vals. If not passed, it is set to 1.0e+30 | None |
| ult_lbound | `float` | Ultimate lower bound for model input parameter once all mults are applied. If not passed, it is set to 1.0e-30 for log transform and -1.0e+30 for non-log transform | None |
| rebuild_pst | `bool` | (Re)Construct PstFrom.pst object after adding new parameters | False |
| alt_inst_str | `str` | Alternative to default | 'inst' |
| comment_char | `str` | option to skip comment lines in model file. This is not additive with | None |
| par_style | `str` | either "m"/"mult"/"multiplier", "a"/"add"/"addend", or "d"/"direct", where the first two set up multiplier and addend parameters that are processed against the existing model input array and the last sets up a template file to write the model input file directly. Default is "multiplier". | 'multiplier' |
| initial_value | `float` | the value to set for the | None |
| pp_options | `dict` | Various options to control pilot point options. Can include: If If If If If | None |
| apply_order | `int` | the optional order to process this set of parameters at runtime. Default is 999. | 999 |
| apply_function | `str` | a python function to call during the apply process at runtime. Default is None. | None |
Returns:
pandas.DataFrame: dataframe with info for new parameters
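The two interpretations of `use_rows` described in the table can be sketched in plain Python. This is a minimal illustration under stated assumptions: `select_rows` is a hypothetical helper, rows are plain tuples rather than a DataFrame, and matching is by exact equality on the index columns.

```python
def select_rows(records, index_cols, use_rows=None):
    """Return the positions of the rows that would be parameterised.

    records: rows of a list-style file (headers already stripped), as tuples.
    """
    if use_rows is None:
        return list(range(len(records)))
    # ndim < 2: a flat sequence of ints is treated as row positions (like df.iloc)
    if all(isinstance(r, int) for r in use_rows):
        return [r for r in use_rows if 0 <= r < len(records)]
    # ndim == 2: each entry is an index value matched against index_cols
    wanted = {tuple(r) for r in use_rows}
    return [pos for pos, rec in enumerate(records)
            if tuple(rec[c] for c in index_cols) in wanted]


# hypothetical well list: layer, row, column, flux
records = [(1, 1, 1, -100.0),
           (3, 5, 6, -250.0),
           (2, 4, 4, -75.0),
           (3, 5, 6, -10.0)]
print(select_rows(records, [0, 1, 2], use_rows=[0, 3]))        # [0, 3]
print(select_rows(records, [0, 1, 2], use_rows=[(3, 5, 6)]))   # [1, 3]
```

Note that a flat `(3, 5, 6)` selects by position, while `[(3, 5, 6)]` selects by index value, so the extra level of nesting matters.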
Example::
    # set up grid-scale direct parameters for an array of numbers
    df = pf.add_parameters("hk.dat",par_type="grid",par_style="direct")
    # set up pilot point multiplier parameters for an array of numbers
    # with a pilot point being set in every 5th active model cell;
    # pp_space is passed through pp_options because the pp_space argument
    # is deprecated, and ibound is a numpy array of zone values
    df = pf.add_parameters("recharge.dat",par_type="pilotpoint",
                           pp_options={"pp_space":5},zone_array=ibound)
    # set up a single multiplier parameter for the 4th column
    # of a column format (list/tabular type) file
    df = pf.add_parameters("wel_list_1.dat",par_type="constant",
                           index_cols=[0,1,2],use_cols=[3])
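To make the role of `ult_lbound` and `ult_ubound` concrete, the sketch below applies multipliers to original values and then limits the result. It assumes the ultimate bounds act as a simple clip after multiplication, which is an interpretation of "once all mults are applied" and not a confirmed description of pyemu internals; `apply_multipliers` is a hypothetical helper.

```python
def apply_multipliers(original, multipliers, ult_lbound=-1.0e30, ult_ubound=1.0e30):
    """Multiply each original value, then clip to the ultimate bounds."""
    result = []
    for value, mult in zip(original, multipliers):
        new_value = value * mult
        # assumption: the ultimate bounds are enforced by clipping
        result.append(min(max(new_value, ult_lbound), ult_ubound))
    return result


hk = [1.0, 10.0, 100.0]     # original model input values
mults = [0.5, 2.0, 50.0]    # multiplier parameter values
print(apply_multipliers(hk, mults, ult_lbound=1.0, ult_ubound=500.0))
# [1.0, 20.0, 500.0]
```

The multiplier bounds (`lower_bound`, `upper_bound`) constrain the parameter itself, whereas the ultimate bounds constrain the value the model finally reads.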
add_py_function(file_name, call_str=None, is_pre_cmd=True, function_name=None)
add a python function to the forward run script
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| file_name | `str` or `callable` | a python source file or function/callable | required |
| call_str | `str` | the call string for python function in | None |
| is_pre_cmd | `bool` or `None` | flag to include | True |
| function_name | `str` | DEPRECATED, used | None |
Returns: None
Note
call_str is expected to reference a standalone function
that contains all the imports it needs; otherwise these imports
should have been added to the forward run script through the
PstFrom.extra_py_imports list.
This function adds the call_str call to the forward
run script (either as a pre or post command, or as a function not
directly called by main). It is up to users
to make sure call_str is a valid python function call
that includes the parentheses and requisite arguments.
This function expects "def " + function_name to be flush left
at the outermost indentation level.
Example::
    pf = PstFrom("path_to_model_files","new_dir_with_pest_stuff")
    # add the function "mult_well_function" from the script file "preprocess.py" as a
    # command to run before the model is run
    pf.add_py_function("preprocess.py",
                       "mult_well_function(arg1='userarg')",
                       is_pre_cmd=True)
    # add the post processor function "make_it_good" from the script file "post_processors.py"
    pf.add_py_function("post_processors.py","make_it_good()",is_pre_cmd=False)
    # add the function "another_func" from the script file "utils.py" as a
    # function not called by main
    pf.add_py_function("utils.py","another_func()",is_pre_cmd=None)
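The flush-left requirement in the note is easier to see with a sketch of how a function could be lifted out of a script. The code below is an assumption about the general approach, not pyemu's implementation; `function_name_from_call` and `extract_function` are hypothetical helpers.

```python
def function_name_from_call(call_str):
    # "mult_well_function(arg1='userarg')" -> "mult_well_function"
    return call_str.split("(")[0].strip()


def extract_function(source, function_name):
    """Return the source of a top-level function found in a script."""
    lines = source.splitlines()
    start = None
    for i, line in enumerate(lines):
        # the definition must start in column 0 ("def " flush left);
        # the trailing "(" avoids matching names that share a prefix
        if line.startswith("def " + function_name + "("):
            start = i
            break
    if start is None:
        raise ValueError(f"function '{function_name}' not found flush left")
    body = [lines[start]]
    for line in lines[start + 1:]:
        if line and not line[0].isspace():
            break  # next top-level statement ends the function
        body.append(line)
    return "\n".join(body).rstrip()


script = (
    "import os\n\n"
    "def mult_well_function(arg1):\n"
    "    print(arg1)\n\n"
    "    return arg1\n\n"
    "def other():\n"
    "    pass\n"
)
name = function_name_from_call("mult_well_function(arg1='userarg')")
extracted = extract_function(script, name)
print(extracted)
```

A function nested inside a class or another function would not be found by a scan like this, which is one reason to keep such functions at the outermost indentation level.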
build_prior(fmt='ascii', filename=None, droptol=None, chunk=None, sigma_range=6)
Build the prior parameter covariance matrix
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| fmt | `str` | the file format to save to. Default is "ascii", can be "binary", "coo", or "none" | 'ascii' |
| filename | `str` | the filename to save the cov to | None |
| droptol | `float` | absolute value of prior cov entries that are smaller than | None |
| chunk | `int` | number of entries to write to binary/coo at once. Default is None (write all elements at once) | None |
| sigma_range | `int` | number of standard deviations represented by parameter bounds. Default is 6 (99% confidence). 4 would be approximately 95% confidence bounds | 6 |
Returns:
| Type | Description |
|---|---|
|
|
Note
This method processes parameters by group names.
For really large numbers of parameters (>30K), this method
will cause memory errors. Luckily, in most cases, users
only want this matrix to generate a prior parameter ensemble,
and PstFrom.draw() is a better choice for that.
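The meaning of `sigma_range` can be shown with a short calculation: if the bounds span `sigma_range` standard deviations, the prior standard deviation is the bound width divided by `sigma_range`. The sketch assumes that log-transformed parameters are handled in log10 space; `prior_std` is a hypothetical helper, not a pyemu function.

```python
import math


def prior_std(lower, upper, sigma_range=6, log_transform=False):
    """Standard deviation implied by bounds that span sigma_range deviations."""
    if log_transform:
        # assumption: log-transformed parameters are treated in log10 space
        lower, upper = math.log10(lower), math.log10(upper)
    return (upper - lower) / sigma_range


# bounds of 0.1 and 10.0 span 2 log10 units, so the std is 2 / 6
print(prior_std(0.1, 10.0, sigma_range=6, log_transform=True))
# the same width with sigma_range=4 implies a larger std
print(prior_std(0.1, 10.0, sigma_range=4, log_transform=True))
```

A smaller `sigma_range` therefore yields a wider prior for the same bounds.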
build_pst(filename=None, update=False, version=1)
Build control file from i/o files in PstFrom object.
Warning: This builds a pest control file from scratch, overwriting
anything already in self.pst object and anything already written to filename
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| filename | `str` | the filename to save the control file to. If None, the name is formed from the | None |
| update | `bool` or `str` | flag to add to existing Pst object and rewrite. If string {'pars', 'obs'} just update respective components of Pst. Default is False - build from PstFrom components. | False |
| version | `int` | control file version to write. Default is 1. If None, the pst is not written to file at the build_pst() call, which is handy when the control file is huge and the pst object will be modified again before running. | 1 |
Note:
The new pest control file is assigned an NOPTMAX value of 0
draw(num_reals=100, sigma_range=6, use_specsim=False, scale_offset=True, rng=None)
Draw a parameter ensemble from the distribution implied by the initial parameter values in the control file and the prior parameter covariance matrix.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| num_reals | `int` | the number of realizations to draw | 100 |
| sigma_range | `int` | number of standard deviations represented by parameter bounds. Default is 6 (99% confidence). 4 would be approximately 95% confidence bounds | 6 |
| use_specsim | `bool` | flag to use spectral simulation for grid-scale pars (highly recommended). Default is False | False |
| scale_offset | `bool` | flag to apply scale and offset to parameter bounds before calculating prior variance. Default is True. If you are using non-default scale and/or offset and you get an exception during draw, try changing this value to False. | True |
| rng | `numpy.random.RandomState` | random number generator if not using default from pyemu.en | None |
Returns:
| Type | Description |
|---|---|
|
|
Note
This method draws by parameter group
If you are using grid-style parameters, please use spectral simulation (use_specsim=True)
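As a rough picture of what drawing an ensemble means, the sketch below samples each parameter from a Gaussian centred on its initial value, with a standard deviation derived from its bounds. It is deliberately simplified: parameters are treated as uncorrelated (a diagonal covariance), no transforms are applied, and `draw_uncorrelated` is a hypothetical helper, so it does not reproduce the group-wise, correlated draws described above.

```python
import random


def draw_uncorrelated(initial, lower, upper, num_reals=100, sigma_range=6, seed=None):
    """Draw realizations assuming independent Gaussian parameters."""
    rng = random.Random(seed)
    # bounds are assumed to span sigma_range standard deviations
    stds = [(ub - lb) / sigma_range for lb, ub in zip(lower, upper)]
    return [[rng.gauss(mu, sd) for mu, sd in zip(initial, stds)]
            for _ in range(num_reals)]


reals = draw_uncorrelated(initial=[1.0, 5.0], lower=[0.0, 0.0], upper=[2.0, 10.0],
                          num_reals=5, seed=42)
for real in reals:
    print(real)
```

Each inner list is one realization, which corresponds to one row of the ensemble.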
initialize_spatial_reference()
process the spatial reference argument. Called programmatically
parse_kij_args(args, kwargs)
parse args into kij indices. Called programmatically
write_forward_run()
write the forward run script. Called by build_pst()
get_filepath(folder, filename)
Return a path to a file within a folder, without repeating the folder in the output path, if the input filename (path) already contains the folder.
get_relative_filepath(folder, filename)
Like `pyemu.utils.pst_from.get_filepath`, except
return path for filename relative to folder.
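The behaviour described for these two helpers can be approximated with `pathlib`. The functions below are hypothetical stand-ins written from the descriptions only; in particular, returning the filename unchanged when it does not sit under the folder is an assumption.

```python
from pathlib import Path


def join_without_repeat(folder, filename):
    """Path to filename inside folder, without repeating folder."""
    folder, filename = Path(folder), Path(filename)
    try:
        filename.relative_to(folder)  # raises ValueError if not under folder
        return filename               # folder is already part of the path
    except ValueError:
        return folder / filename


def relative_to_folder(folder, filename):
    """Path to filename expressed relative to folder."""
    folder, filename = Path(folder), Path(filename)
    try:
        return filename.relative_to(folder)
    except ValueError:
        return filename  # assumption: already relative to folder


print(join_without_repeat("template", "hk.dat"))
print(join_without_repeat("template", Path("template") / "hk.dat"))
print(relative_to_folder("template", Path("template") / "hk.dat"))
```

The first two calls print the same path, and the third prints just `hk.dat`.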
write_array_tpl(name, tpl_filename, suffix, par_type, data_array=None, zone_array=None, gpname=None, fill_value=1.0, get_xy=None, input_filename=None, par_style='m', headerlines=None)
write a template file for a 2D array.
Args:
name (str): the base parameter name
tpl_filename (str): the template file to write - include path
suffix (str): suffix to append to par names
par_type (str): type of parameter
data_array (numpy.ndarray): original data array
zone_array (numpy.ndarray): an array used to skip inactive cells. Values less than 1 are
not parameterized and are assigned a value of fill_value. Default is None.
gpname (str): pargp field in dataframe
fill_value:
get_xy:
input_filename:
par_style (str): either 'd','a', or 'm'
Returns:
| Name | Type | Description |
|---|---|---|
| df | `pandas.DataFrame` | a dataframe with parameter information |
Note
This function is called by PstFrom programmatically
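The zone rule for array templates (values less than 1 are not parameterized and receive `fill_value`) can be sketched as follows. The `ptf ~` header and `~`-delimited markers follow the PEST template file convention; the parameter naming pattern and the helper `array_template_lines` are hypothetical and do not reproduce the names pyemu generates.

```python
def array_template_lines(name, zone_array, fill_value=1.0):
    """Build the lines of a template file for a 2D array."""
    lines = ["ptf ~"]  # PEST template file header with "~" as the delimiter
    for i, row in enumerate(zone_array):
        cells = []
        for j, zone in enumerate(row):
            if zone < 1:
                # inactive cell: written as a fixed number, not a parameter
                cells.append(str(fill_value))
            else:
                # active cell: a parameter marker (naming is illustrative only)
                cells.append(f"~ {name}_i:{i}_j:{j} ~")
        lines.append(" ".join(cells))
    return lines


zones = [[1, 0],
         [0, 2]]
for line in array_template_lines("hk", zones):
    print(line)
```

For this 2 by 2 array, two cells become parameters and the other two are written as `1.0`.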
write_list_tpl(filenames, dfs, name, tpl_filename, index_cols, par_type, use_cols=None, use_rows=None, suffix='', zone_array=None, gpname=None, get_xy=None, ij_in_idx=None, xy_in_idx=None, zero_based=True, input_filename=None, par_style='m', headerlines=None, fill_value=1.0, logger=None)
Write template files for a list style input.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| filenames | `str` or `container` of `str` | original input filenames | required |
| dfs | `pandas.DataFrame` or `container` of pandas.DataFrames | pandas representations of input file. | required |
| name | `str` or container of str | parameter name prefixes. If more than one column is to be parameterised, must be a container of strings providing the prefix for the parameters in the different columns. | required |
| tpl_filename | `str` | Path (from current execution directory) for desired template file | required |
| index_cols | `list` | column names to use as indices in tabular input dataframe | required |
| par_type | `str` | 'constant','zone', or 'grid' used in parname generation. If | required |
| use_cols | `list` | Columns in tabular input file to parameterise. If None, pars are set up for all columns apart from index cols. | None |
| use_rows | `list` of `int` or `tuple` | Setup parameters for only specific rows in list-style model input file. If list of | None |
| suffix | `str` | Optional par name suffix | '' |
| zone_array | `np.ndarray` | Array defining zone divisions. If not None and | None |
| get_xy | `pyemu.PstFrom` method | Can be specified to get real-world xy from | None |
| ij_in_idx | `list` or `array` | defining which | None |
| xy_in_idx | `list` or `array` | defining which | None |
| zero_based | `boolean` | IMPORTANT - pass as False if | True |
| input_filename | `str` | Path to input file (paired with tpl file) | None |
| par_style | `str` | either 'd','a', or 'm' | 'm' |
| headerlines | [`str`] | optional header lines in the original model file, used for direct style parameters | None |
Returns:
pandas.DataFrame: dataframe with info for the new parameters
Note
This function is called by PstFrom programmatically