pls
Partial Least Squares (PLS) regression emulator.
PLS finds a small set of latent factors that maximise covariance between an
input matrix X (parameters) and an output matrix Y (observations). It is a
natural fit for surrogate problems where the parameter dimension d is
larger than the training set size n, and where the outputs are correlated
multivariate quantities.
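The covariance-maximising idea can be sketched with plain NumPy. This is an illustration of the textbook construction, not the emulator's implementation: the toy sizes (n = 20, d = 50, 3 outputs) and all variable names are assumptions. The first pair of PLS weight vectors is taken as the leading singular vectors of the centred cross-product matrix, which works even when d exceeds n.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy problem: n = 20 training runs, d = 50 parameters, m = 3 outputs.
n, d, m = 20, 50, 3
X = rng.normal(size=(n, d))
Y = X @ rng.normal(size=(d, m)) + 0.1 * rng.normal(size=(n, m))

# Centre both blocks, since PLS works on covariances.
Xc = X - X.mean(axis=0)
Yc = Y - Y.mean(axis=0)

# Leading singular vectors of the cross-product matrix give the first
# X-weight vector w and Y-weight vector c.
U, s, Vt = np.linalg.svd(Xc.T @ Yc, full_matrices=False)
w, c = U[:, 0], Vt[0]

t = Xc @ w  # first X-scores (latent factor values per training run)
u = Yc @ c  # first Y-scores

# Unnormalised covariance between the two score vectors; it equals s[0],
# the largest value attainable with unit-length weight vectors.
first_cov = float(t @ u)
print(first_cov, s[0])
```

Any other pair of unit-length weight vectors yields a score covariance no larger in magnitude than `s[0]`.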
The class mirrors the public interface of the other pyemu emulators (DSI,
DSIAE, GPR): the same fit/predict methods, the same transformer
pipeline for optional input transforms, and the same
prepare_pestpp hook inherited from :class:Emulator, so a fitted
emulator can be used as a PEST++ forward run.
PLS
Bases: Emulator
Partial Least Squares regression emulator.
Parameters
pst : Pst, optional
PEST control-file object. Source for observation_data during
prepare_pestpp; also used to infer input_names /
output_names from data when those are not provided.
data : pandas.DataFrame
Joint training DataFrame containing both the input columns (named in
input_names) and the output columns (named in output_names).
Columns in data outside those two name lists are ignored. The
caller is responsible for subsetting to non-zero-weight observations
and for any other preprocessing; PLS uses exactly the columns it is given.
input_names : list of str, optional
Columns in data that are the regression inputs (parameters). May
be omitted — see output_names for inference rules.
output_names : list of str, optional
Columns in data that are the regression outputs (observations).
Resolution rules:
* Both lists passed — used as-is.
* Only one passed — the other is ``set(data.columns) - set(passed)``.
* Neither passed (requires ``pst``) — if ``data`` contains both pst
pars and pst obs, pars are inputs and obs are outputs; if ``data``
contains only pst obs, nonzero-weight obs are inputs and
zero-weight obs are outputs (the DSI-style "obs-as-pars" setup).
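The resolution rules above can be sketched as a small pure-Python function. The helper name `resolve_names` and the arguments `pst_pars` and `pst_obs_weights` are hypothetical stand-ins for information that would come from the Pst object; the real class may differ in details such as column ordering, which this sketch preserves from `columns`.

```python
def resolve_names(columns, input_names=None, output_names=None,
                  pst_pars=None, pst_obs_weights=None):
    """Hypothetical helper mirroring the documented resolution rules.

    columns         : column labels of the training data
    pst_pars        : parameter names from the control file, or None
    pst_obs_weights : dict of observation name -> weight, or None
    """
    columns = list(columns)

    # Both lists passed: used as-is.
    if input_names is not None and output_names is not None:
        return list(input_names), list(output_names)

    # Only one passed: the other is every remaining column.
    if input_names is not None:
        passed = set(input_names)
        return list(input_names), [c for c in columns if c not in passed]
    if output_names is not None:
        passed = set(output_names)
        return [c for c in columns if c not in passed], list(output_names)

    # Neither passed: control-file information is required.
    if pst_obs_weights is None:
        raise ValueError("pst is required when neither name list is passed")
    par_set = set(pst_pars or [])
    pars = [c for c in columns if c in par_set]
    obs = [c for c in columns if c in pst_obs_weights]
    if pars and obs:
        return pars, obs
    if obs:
        # Obs-as-pars setup: weighted obs are inputs, zero-weight obs are outputs.
        inputs = [c for c in obs if pst_obs_weights[c] != 0]
        outputs = [c for c in obs if pst_obs_weights[c] == 0]
        return inputs, outputs
    raise ValueError("data contains no control-file parameters or observations")


weights = {"h1": 1.0, "h2": 0.0}
print(resolve_names(["k1", "k2", "h1", "h2"], pst_pars=["k1", "k2"],
                    pst_obs_weights=weights))
print(resolve_names(["h1", "h2"], pst_pars=["k1", "k2"],
                    pst_obs_weights=weights))
```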
transforms : list of dict, optional
Feature transformations applied via the base-class transformer
pipeline. Same format as :class:DSIAE.
n_components : int, optional
Number of PLS latent factors. If None (default), the value is
chosen by k-fold cross-validation on the training data.
cv_folds : int, default 5
Number of folds used when n_components is selected by CV.
parameter_reducer : sklearn-style transformer, optional
Optional dimension-reducer fit on the input matrix before PLS (e.g.
sklearn.decomposition.PCA or
sklearn.random_projection.GaussianRandomProjection). Must
implement fit_transform and transform. If left as None
and the input dimension exceeds :data:HIGH_D_WARN_THRESHOLD, a
warning suggests supplying one, but PLS is still trained on the
full input.
verbose : bool, default False
Enable verbose logging.
encode(X)
Project new inputs into PLS latent space (X-scores).
fit()
Fit the PLS regression on the (optionally transformed) training data.
load(filename)
classmethod
Load a fitted emulator from a file.
Parameters
filename : str
Path to the saved emulator file.
Returns
Emulator
The loaded emulator instance.
predict(pvals)
Predict outputs from input parameter values.
Returns a Series for a single-row input (matching the DSIAE/DSI convention used by the PEST++ forward-run helper) and a DataFrame for multi-row input.
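The return-type convention can be sketched with pandas. `ToyEmulator` and `shape_prediction` are hypothetical stand-ins, and the two toy outputs are arbitrary; only the Series-for-one-row, DataFrame-for-many behaviour is taken from the description above.

```python
import pandas as pd


def shape_prediction(pred: pd.DataFrame):
    """Hypothetical helper: collapse a one-row prediction frame to a Series."""
    if len(pred) == 1:
        return pred.iloc[0]
    return pred


class ToyEmulator:
    """Stand-in for a fitted emulator with two made-up outputs."""

    def predict(self, pvals):
        # Accept one realisation (Series) or many (DataFrame).
        frame = pvals.to_frame().T if isinstance(pvals, pd.Series) else pvals
        out = pd.DataFrame(
            {"obs_sum": frame.sum(axis=1), "obs_max": frame.max(axis=1)},
            index=frame.index,
        )
        return shape_prediction(out)


emu = ToyEmulator()
single = emu.predict(pd.Series({"k1": 1.0, "k2": 3.0}))
multi = emu.predict(pd.DataFrame({"k1": [1.0, 2.0], "k2": [3.0, 0.5]}))
print(type(single).__name__, type(multi).__name__)
```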
prepare_pestpp(t_d, pst=None, verbose=False, **kwargs)
Generic method to prepare a PEST++ interface for the emulator.
This method automates the creation of template files, instruction files, control files, and the forward run script needed to run the emulator within a PEST++ workflow (e.g. IES).
Parameters
t_d : str
Path to the template directory where files will be written.
pst : Pst, optional
A Pst object representing the original control file. Useful for scraping constraint weights, observation lists, etc. Subclasses may use this to determine specific parameters or observations.
verbose : bool
Enable verbose logging.
Returns
Pst
The generated Pst object for the emulator.
save(filename)
Save the fitted emulator to a file.
Parameters
filename : str
Path to save the emulator.
pls_file_forward_run(emu_file='pls.pickle', input_file='pls_pars.csv', output_file='pls_sim_vals.csv')
File-based forward-run helper for a fitted PLS emulator.
Loads the pickled emulator, reads parameter values from input_file,
calls emu.predict, and writes the resulting observation values to
output_file. Used when prepare_pestpp is called with use_runstor=False.
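The read, predict, write cycle described above can be sketched with the standard library. Everything here is illustrative: `ToyEmulator` and `file_forward_run` are hypothetical, the CSV layout (one `name,value` row per entry) is an assumption, and the unpickling step is omitted so the sketch takes an already-loaded emulator.

```python
import csv
import os
import tempfile


class ToyEmulator:
    """Hypothetical stand-in for a fitted emulator with two linear outputs."""

    def predict(self, pvals):
        return {"obs_a": pvals["k1"] + pvals["k2"], "obs_b": 2.0 * pvals["k1"]}


def file_forward_run(emu, input_file, output_file):
    # Read one realisation of parameter values (assumed layout: name,value rows).
    with open(input_file, newline="") as f:
        pvals = {name: float(value) for name, value in csv.reader(f)}

    sim = emu.predict(pvals)

    # Write the simulated observation values in the same name,value layout.
    with open(output_file, "w", newline="") as f:
        writer = csv.writer(f)
        for name, value in sim.items():
            writer.writerow([name, value])
    return sim


with tempfile.TemporaryDirectory() as tmp:
    in_path = os.path.join(tmp, "pls_pars.csv")
    out_path = os.path.join(tmp, "pls_sim_vals.csv")
    with open(in_path, "w", newline="") as f:
        csv.writer(f).writerows([["k1", 1.5], ["k2", 0.5]])

    file_forward_run(ToyEmulator(), in_path, out_path)

    with open(out_path, newline="") as f:
        written = {name: float(value) for name, value in csv.reader(f)}

print(written)
```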
pls_runstore_forward_run(ws='.', pst_name='pls', emu_file='emulator.pkl')
Runstor-based forward-run helper for a fitted PLS emulator.
PESTPP-IES in panther / external run-manager mode (/e) reads and writes
realisations through a binary RunStor ({pst_name}.rns) rather than via
CSV files. The plain file-based helper never touches the .rns file, so the
obs columns stay zero-filled; this function fixes that failure mode.
It mirrors :func:pyemu.utils.helpers.dsi_runstore_forward_run.
NOTE: this function's source is embedded verbatim in the generated forward_run.py (via inspect.getsource), so it must stay ASCII-only. On Windows the script is read back using the locale encoding, and non-ASCII bytes break the UTF-8 source parse.
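A guard for the ASCII-only constraint might look like the sketch below. The helper `check_ascii_source` is hypothetical and is not part of pyemu; the second half only illustrates, under the assumption of a cp1252 locale, how UTF-8 bytes read back with a different codec no longer match the original text.

```python
def check_ascii_source(source: str) -> str:
    """Hypothetical guard: return source unchanged if it is ASCII-only."""
    for lineno, line in enumerate(source.splitlines(), start=1):
        if not line.isascii():
            raise ValueError(f"non-ASCII character on line {lineno}: {line!r}")
    return source


good = "def f(x):\n    return x + 1  # plain ASCII comment\n"
bad = "def f(x):\n    return x + 1  # en dash \u2013 here\n"

check_ascii_source(good)  # passes silently

try:
    check_ascii_source(bad)
    rejected = False
except ValueError:
    rejected = True

# UTF-8 bytes decoded with a Windows locale codec (cp1252 assumed here)
# come back as different characters.
round_tripped = bad.encode("utf-8").decode("cp1252")
print(rejected, round_tripped != bad)
```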