
pls

Partial Least Squares (PLS) regression emulator.

PLS finds a small set of latent factors that maximise covariance between an input matrix X (parameters) and an output matrix Y (observations). It is a natural fit for surrogate problems where the parameter dimension d is larger than the training set size n, and where the outputs are correlated multivariate quantities.
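As a rough illustration of what "maximise covariance" means, the sketch below computes the first PLS direction for a single output column using only the standard library. It relies on the standard result that, for centred data and one output, the unit weight vector maximising the covariance between the scores Xw and y is proportional to X^T y. The function names and the toy data are made up for this example and are not part of pyemu; the emulator itself handles multiple outputs and several components.

```python
import math


def center_columns(rows):
    """Subtract the column mean from every entry of a row-major matrix."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def covariance(a, b):
    """Sample covariance of two sequences that are already centred."""
    return sum(p * q for p, q in zip(a, b)) / (len(a) - 1)


def first_pls_direction(X, y):
    """Return (w, scores, yc) for the first PLS component of a single output.

    w is the unit vector proportional to X^T y (on centred data),
    scores are the projections of the centred rows of X onto w,
    and yc is the centred output.
    """
    Xc = center_columns(X)
    y_mean = sum(y) / len(y)
    yc = [v - y_mean for v in y]
    n_cols = len(Xc[0])
    # X^T y: one entry per input column
    xty = [sum(row[j] * yi for row, yi in zip(Xc, yc)) for j in range(n_cols)]
    norm = math.sqrt(sum(v * v for v in xty))
    w = [v / norm for v in xty]
    scores = [sum(a * b for a, b in zip(row, w)) for row in Xc]
    return w, scores, yc


# Toy data with more inputs (d = 5) than training rows (n = 3)
X = [
    [1.0, 2.0, 0.5, 3.0, 1.5],
    [2.0, 1.0, 1.5, 2.5, 0.5],
    [3.0, 4.0, 2.5, 1.0, 2.0],
]
y = [1.0, 2.0, 4.0]

w, scores, yc = first_pls_direction(X, y)
print("weights:", [round(v, 3) for v in w])
print("cov(scores, y):", round(covariance(scores, yc), 3))
```

The scores here play the role of a single latent factor: one number per training row that summarises all five inputs in the direction most aligned with the output.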

The class mirrors the surface area of the other pyemu emulators (DSI, DSIAE, GPR): the same fit/predict shape, the same transformer pipeline plumbing for optional input transforms, and the same prepare_pestpp hook inherited from Emulator, so a fitted emulator can be used as a PEST++ forward run.

PLS

Bases: Emulator

Partial Least Squares regression emulator.

Parameters

pst : Pst, optional
    PEST control-file object. Source for observation_data during prepare_pestpp; also used to infer input_names / output_names from data when those are not provided.
data : pandas.DataFrame
    Joint training DataFrame containing both the input columns (named in input_names) and the output columns (named in output_names). Columns in data outside those two name lists are ignored. The caller is responsible for any non-zero-weight subsetting or other preprocessing; PLS uses exactly the columns it is given.
input_names : list of str, optional
    Columns in data that are the regression inputs (parameters). May be omitted; see output_names for the inference rules.
output_names : list of str, optional
    Columns in data that are the regression outputs (observations). Resolution rules:

* Both lists passed — used as-is.
* Only one passed — the other is ``set(data.columns) - set(passed)``.
* Neither passed (requires ``pst``) — if ``data`` contains both pst
  pars and pst obs, pars are inputs and obs are outputs; if ``data``
  contains only pst obs, nonzero-weight obs are inputs and
  zero-weight obs are outputs (the DSI-style "obs-as-pars" setup).
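The resolution rules above can be sketched as a small standalone function. This is an illustration under stated assumptions, not the emulator's actual code: resolve_names is a hypothetical helper, the control file is stood in for by a plain list of parameter names (par_names) and a dict of observation weights (obs_weights), column order is preserved where the rule is written as a set difference, and raising an error when data holds no pst observations is a choice made for this sketch.

```python
def resolve_names(columns, input_names=None, output_names=None,
                  par_names=None, obs_weights=None):
    """Return (input_names, output_names) following the documented rules.

    columns     : column names of the training data
    par_names   : stand-in for the parameter names of a control file
    obs_weights : stand-in mapping of observation name -> weight
    """
    columns = list(columns)

    # Both lists passed: used as-is.
    if input_names is not None and output_names is not None:
        return list(input_names), list(output_names)

    # Only one passed: the other is every remaining column.
    if input_names is not None:
        taken = set(input_names)
        return list(input_names), [c for c in columns if c not in taken]
    if output_names is not None:
        taken = set(output_names)
        return [c for c in columns if c not in taken], list(output_names)

    # Neither passed: control-file information is required.
    if par_names is None or obs_weights is None:
        raise ValueError("control-file information is required "
                         "when no names are passed")

    known_pars = set(par_names)
    pars = [c for c in columns if c in known_pars]
    obs = [c for c in columns if c in obs_weights]

    if pars and obs:
        # Parameters are inputs, observations are outputs.
        return pars, obs
    if obs:
        # Obs-only data: nonzero-weight obs are inputs, zero-weight obs outputs.
        inputs = [c for c in obs if obs_weights[c] != 0]
        outputs = [c for c in obs if obs_weights[c] == 0]
        return inputs, outputs
    raise ValueError("data contains no control-file observations")


# Obs-only data: the DSI-style "obs-as-pars" setup
print(resolve_names(["h1", "h2", "h3"],
                    par_names=["k1"],
                    obs_weights={"h1": 1.0, "h2": 0.0, "h3": 2.0}))
```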

transforms : list of dict, optional
    Feature transformations applied via the base-class transformer pipeline. Same format as DSIAE.
n_components : int, optional
    Number of PLS latent factors. If None (default), the value is chosen by k-fold cross-validation on the training data.
cv_folds : int, default 5
    Number of folds used when n_components is selected by CV.
parameter_reducer : sklearn-style transformer, optional
    Optional dimension reducer fit on the input matrix before PLS (e.g. sklearn.decomposition.PCA or sklearn.random_projection.GaussianRandomProjection). Must implement fit_transform and transform. If left as None and the input dimension exceeds HIGH_D_WARN_THRESHOLD, a warning is emitted suggesting one, but PLS is still trained on the full input.
verbose : bool, default False
    Enable verbose logging.

encode(X)

Project new inputs into PLS latent space (X-scores).

fit()

Fit the PLS regression on the (optionally transformed) training data.

load(filename) classmethod

Load a fitted emulator from a file.

Parameters

filename : str
    Path to the saved emulator file.

Returns

Emulator
    The loaded emulator instance.

predict(pvals)

Predict outputs from input parameter values.

Returns a Series for a single-row input (matching the DSIAE/DSI convention used by the PEST++ forward-run helper) and a DataFrame for multi-row input.

prepare_pestpp(t_d, pst=None, verbose=False, **kwargs)

Generic method to prepare a PEST++ interface for the emulator.

This method automates the creation of template files, instruction files, control files, and the forward run script needed to run the emulator within a PEST++ workflow (e.g. IES).

Parameters

t_d : str
    Path to the template directory where files will be written.
pst : Pst, optional
    A Pst object representing the original control file. Useful for scraping constraint weights, observation lists, etc. Subclasses may use this to determine specific parameters or observations.
verbose : bool
    Enable verbose logging.

Returns

Pst
    The generated Pst object for the emulator.

save(filename)

Save the fitted emulator to a file.

Parameters

filename : str
    Path to save the emulator.

pls_file_forward_run(emu_file='pls.pickle', input_file='pls_pars.csv', output_file='pls_sim_vals.csv')

File-based forward-run helper for a fitted PLS emulator.

Loads the pickled emulator, reads parameter values from input_file, calls emu.predict, and writes the resulting observation values to output_file. Used when prepare_pestpp is called with use_runstor=False.

pls_runstore_forward_run(ws='.', pst_name='pls', emu_file='emulator.pkl')

Runstor-based forward-run helper for a fitted PLS emulator.

PESTPP-IES in panther / external run-manager mode (/e) reads and writes realisations through a binary RunStor ({pst_name}.rns) rather than via CSV files. The plain file-based helper never sees the rns file, so the obs columns stay zero-filled; that is the failure mode this function fixes. It mirrors pyemu.utils.helpers.dsi_runstore_forward_run.

NOTE: this function's source is embedded verbatim in the generated forward_run.py (via inspect.getsource), so it must stay ASCII-only -- on Windows the script is read back with whatever the locale encoding is, and non-ASCII bytes break the UTF-8 source parse.
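A check along the following lines could guard the ASCII-only requirement. It is a sketch only: find_non_ascii is a hypothetical helper, not part of pyemu, and it works on a plain source string so that it runs without the library (in practice the string could come from inspect.getsource applied to the function). The second half shows why a stray character matters, assuming a Windows-1252 locale: the bytes written for "é" under that encoding are not valid UTF-8.

```python
def find_non_ascii(source):
    """Return (line_number, column, character) for each non-ASCII character."""
    hits = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                hits.append((line_no, col, ch))
    return hits


# Two stand-in sources; the escape keeps this example itself ASCII-only.
clean_source = "def run():\n    return 'realisation'\n"
dirty_source = "def run():\n    return 'r\u00e9alisation'\n"

print(find_non_ascii(clean_source))
print(find_non_ascii(dirty_source))

# Bytes written under a Windows-1252 locale are not valid UTF-8,
# so reading the file back as UTF-8 source fails.
raw = dirty_source.encode("cp1252")
try:
    raw.decode("utf-8")
    decodes_as_utf8 = True
except UnicodeDecodeError:
    decodes_as_utf8 = False
print("decodes as UTF-8:", decodes_as_utf8)
```

Reporting line and column rather than a bare True/False makes it easier to locate an offending character, such as a typographic dash pasted into a comment.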