emulators

`AutobotsAssemble`

Class for transforming features in a DataFrame using a pipeline approach.

`apply(transform_type, columns=None, **kwargs)`

Apply a transformation to specified columns.

`inverse(df=None)`

Apply inverse transformations in reverse order.

`inverse_on_external_df(df, columns=None)`

Apply inverse transformations to an external DataFrame.

Parameters

df : pandas.DataFrame The DataFrame to inverse transform.

columns : list, optional Specific columns to inverse transform. If None, all columns are processed.

Returns

pandas.DataFrame The inverse-transformed DataFrame.

`transform(df)`

Transform an external DataFrame using the pipeline.

Parameters

df : pandas.DataFrame The DataFrame to transform.

Returns

pandas.DataFrame The transformed DataFrame.

`BaseTransformer`

Base class for all transformers providing a consistent interface.

`fit(X)`

Learn parameters from data if needed.

`fit_transform(X)`

Fit and transform in one step.

`inverse_transform(X)`

Inverse transform X back to original space.

`transform(X)`

Apply transformation to X.

`DSI`

Bases: Emulator

Data Space Inversion (DSI) emulator class. Based on DSI as described in Sun & Durlofsky (2017) and Sun et al. (2017).

`__init__(pst=None, data=None, transforms=None, energy_threshold=1.0, rowwise_groups=None, rowwise_fit_groups=None, feature_range=(-1, 1), svd_solver='full', n_components=None, n_iter=4, random_state=None, verbose=False)`

Initialize the DSI emulator.

If rowwise_groups is provided, training data are row-wise scaled per-group before SVD. Predictions are returned in scaled space and then inverse-scaled using per-row parameters derived from truth values found in pst.observation_data.

Parameters

pst : Pst, optional A Pst object. If provided, the emulator will be initialized with the information from the Pst object.

data : DataFrame or ObservationEnsemble, optional An ensemble of simulated observations. If provided, the emulator will be initialized with the information from the ensemble.

transforms : list of dict, optional List of transformation specifications. Each dict should have:

- 'type': str. Type of transformation (e.g., 'log10', 'normal_score').
- 'columns': list of str, optional. Columns to apply the transformation to. If not supplied, the transformation is applied to all columns.
- Additional kwargs for the transformation (e.g., 'quadratic_extrapolation' for the normal score transform).

Example: `transforms = [{'type': 'log10', 'columns': ['obs1', 'obs2']}, {'type': 'normal_score', 'quadratic_extrapolation': True}]`

Default is None, which means no transformations will be applied.

energy_threshold : float, optional The energy threshold for the SVD. Default is 1.0 (no truncation). Ignored when svd_solver='randomized' (truncation is fixed by n_components there).

rowwise_groups : dict, optional Dictionary mapping groups to column lists for row-wise scaling.

rowwise_fit_groups : dict, optional Dictionary mapping groups to column lists for fitting row-wise scalers.

feature_range : tuple, optional Feature range for row-wise scaling. Default is (-1, 1).

svd_solver : {'full', 'randomized'}, optional Which SVD driver to use in compute_projection_matrix:

- 'full' (default): np.linalg.svd via LAPACK gesdd; computes all min(n_real, n_obs) singular triplets, then optionally energy-truncates.
- 'randomized': sklearn.utils.extmath.randomized_svd; computes only the top n_components triplets directly. Much cheaper for tall/wide ensembles when only a few components are needed. Requires scikit-learn.

n_components : int, optional Number of components to retain when svd_solver='randomized'. Required in that case; ignored otherwise.

n_iter : int, optional Power-iteration count passed to randomized_svd. Default 4 (sklearn default). Higher values improve accuracy at the cost of more passes over the data.

random_state : int or None, optional Seed for randomized_svd's random projection. Default None.

verbose : bool, optional If True, enable verbose logging. Default is False.
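Each entry of `transforms` can be read as a triple of transformation type, target columns, and extra keyword arguments. The sketch below illustrates that reading only; `split_transform_spec` is a hypothetical helper introduced here, not part of the documented API, and the library's own parsing may differ.

```python
def split_transform_spec(spec):
    """Split one transform dict into (type, columns, extra kwargs).

    Hypothetical helper for illustration; not part of the documented API.
    """
    remaining = dict(spec)  # copy so the caller's dict is left untouched
    transform_type = remaining.pop("type")
    columns = remaining.pop("columns", None)  # None means "all columns"
    return transform_type, columns, remaining


# The example specification from the parameter description above.
transforms = [
    {"type": "log10", "columns": ["obs1", "obs2"]},
    {"type": "normal_score", "quadratic_extrapolation": True},
]

parsed = [split_transform_spec(spec) for spec in transforms]
for transform_type, columns, kwargs in parsed:
    print(transform_type, columns, kwargs)
```

Whatever keys remain after removing 'type' and 'columns' are the additional kwargs mentioned in the parameter description.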

`check_for_pdc()`

Check for prior-data conflict (PDC).

`compute_projection_matrix(energy_threshold=None)`

Compute the projection matrix using SVD.

Parameters

energy_threshold : float, optional Energy threshold for truncation. Default is None, which uses the threshold from initialization.

Returns

None
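As a rough illustration of energy-based truncation, the sketch below keeps the smallest number of singular triplets whose squared singular values reach `energy_threshold` of the total. This definition of "energy", the function name `truncated_svd_by_energy`, and the synthetic ensemble are all assumptions made for the example; the criterion used inside `compute_projection_matrix` may differ.

```python
import numpy as np


def truncated_svd_by_energy(deviations, energy_threshold=1.0):
    """Thin SVD truncated by cumulative squared-singular-value fraction.

    Assumed energy definition: cumsum(s**2) / sum(s**2).
    """
    U, s, Vt = np.linalg.svd(deviations, full_matrices=False)
    if energy_threshold >= 1.0:
        return U, s, Vt  # threshold of 1.0 means no truncation
    energy = np.cumsum(s**2) / np.sum(s**2)
    # Index of the first component at which the threshold is reached.
    k = int(np.searchsorted(energy, energy_threshold)) + 1
    return U[:, :k], s[:k], Vt[:k, :]


rng = np.random.default_rng(0)
# Synthetic ensemble: 50 realizations x 200 observations, rank 3 plus small noise.
latent = rng.normal(size=(50, 3))
loadings = rng.normal(size=(3, 200))
ensemble = latent @ loadings + 1e-3 * rng.normal(size=(50, 200))
deviations = ensemble - ensemble.mean(axis=0)  # center on the ensemble mean

U_full, s_full, Vt_full = truncated_svd_by_energy(deviations, 1.0)
U_k, s_k, Vt_k = truncated_svd_by_energy(deviations, 0.99)
print(len(s_full), len(s_k))
```

With a threshold of 1.0 all min(n_real, n_obs) components are returned, which matches the "no truncation" default described for `energy_threshold`.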

`fit()`

Fit the emulator to training data.

Parameters

self : DSI The DSI emulator instance.

Returns

self : DSI The fitted emulator.

`load(filename)` `classmethod`

Load a fitted emulator from a file.

Parameters

filename : str Path to the saved emulator file.

Returns

Emulator The loaded emulator instance.

`predict(pvals, pst=None)`

Generate predictions from the emulator.

Parameters

pvals : numpy.ndarray or pandas.Series Parameter values for prediction.

pst : Pst, optional If provided (or if self.observation_data exists), used to obtain truth values for inverse row-wise scaling (if enabled).

Returns

pandas.Series Predicted observation values.

`prepare_dsivc(decvar_names, t_d=None, pst=None, oe=None, track_stack=False, dsi_args=None, percentiles=[0.25, 0.75, 0.5], mou_population_size=None, ies_exe_path='pestpp-ies')`

Prepare Data Space Inversion Variable Control (DSIVC) control files.

Parameters

decvar_names : list or str Names of decision variables.

t_d : str, optional Template directory path.

pst : Pst, optional PST control file object.

oe : ObservationEnsemble, optional Observation ensemble.

track_stack : bool, optional Whether to track the stack. Default is False.

dsi_args : dict, optional Arguments for DSI.

percentiles : list, optional Percentiles to calculate. Default is [0.25, 0.75, 0.5].

mou_population_size : int, optional Population size for multi-objective optimization.

ies_exe_path : str, optional Path to the PEST++ IES executable. Default is "pestpp-ies".

Returns

Pst PEST++ control file object for DSIVC.

`prepare_pestpp(t_d, observation_data=None, use_runstor=False, pst=None, verbose=False)`

Prepare the PEST++ interface for DSI. Overrides the base method to handle DSI-specific arguments such as use_runstor.

`save(filename)`

Save the fitted emulator to a file.

Parameters

filename : str Path to save the emulator.

`Emulator`

Base class for emulators.

This class defines the common interface for all emulator implementations and provides shared functionality used by multiple emulator types.

`__init__(transforms=None, verbose=True)`

Initialize the Emulator base class.

Parameters

transforms : list of dict, optional List of transformation specifications. Each dict should have:

- 'type': str. Type of transformation (e.g., 'log10', 'normal_score').
- 'columns': list of str, optional. Columns to apply the transformation to. If not supplied, the transformation is applied to all columns.
- Additional kwargs for the transformation (e.g., 'quadratic_extrapolation' for the normal score transform).

Example: `transforms = [{'type': 'log10', 'columns': ['obs1', 'obs2']}, {'type': 'normal_score', 'quadratic_extrapolation': True}]`

Default is None, which means no transformations will be applied.

verbose : bool, optional If True, enable verbose logging. Default is True.

`fit(X, y=None)`

Fit the emulator to training data.

Parameters

X : pandas.DataFrame Input features for training.

y : pandas.DataFrame or None, optional Target values for training if separate from X.

Returns

self : Emulator Returns self for method chaining.

`load(filename)` `classmethod`

Load a fitted emulator from a file.

Parameters

filename : str Path to the saved emulator file.

Returns

Emulator The loaded emulator instance.

`predict(X)`

Generate predictions using the fitted emulator.

Parameters

X : pandas.DataFrame Input data to generate predictions for.

Returns

pandas.DataFrame or pandas.Series Predictions for the input data.

`prepare_pestpp(t_d, pst=None, verbose=False, **kwargs)`

Generic method to prepare a PEST++ interface for the emulator.

This method automates the creation of template files, instruction files, control files, and the forward run script needed to run the emulator within a PEST++ workflow (e.g. IES).

Parameters

t_d : str Path to the template directory where files will be written.

pst : Pst, optional A Pst object representing the original control file. Useful for scraping constraint weights, observation lists, etc. Subclasses may use this to determine specific parameters or observations.

verbose : bool Enable verbose logging.

Returns

Pst The generated Pst object for the emulator.

`save(filename)`

Save the fitted emulator to a file.

Parameters

filename : str Path to save the emulator.

`Log10Transformer`

Bases: BaseTransformer

Apply log10 transformation.

Parameters

columns : list, optional List of column names to be transformed. If None, all columns will be transformed.
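The effect of a column-restricted log10 transform can be sketched with plain pandas and NumPy. The functions `log10_columns` and `inverse_log10_columns` below are hypothetical stand-ins written for this example; they are not the `Log10Transformer` implementation, and they assume strictly positive values in the selected columns.

```python
import numpy as np
import pandas as pd


def log10_columns(df, columns=None):
    """Return a copy of df with log10 applied to `columns` (all if None)."""
    cols = list(df.columns) if columns is None else list(columns)
    out = df.copy()
    out[cols] = np.log10(out[cols])
    return out


def inverse_log10_columns(df, columns=None):
    """Undo log10_columns by raising 10 to the stored values."""
    cols = list(df.columns) if columns is None else list(columns)
    out = df.copy()
    out[cols] = 10.0 ** out[cols]
    return out


df = pd.DataFrame({"obs1": [1.0, 10.0, 100.0], "obs2": [5.0, 6.0, 7.0]})
logged = log10_columns(df, columns=["obs1"])     # obs2 is left unchanged
restored = inverse_log10_columns(logged, columns=["obs1"])
print(logged)
```

Passing `columns=None` applies the transform to every column, mirroring the parameter description above.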

`fit(X)`

Learn parameters from data if needed.

`fit_transform(X)`

Fit and transform in one step.

`NormalScoreTransformer`

Bases: BaseTransformer

A transformer for normal score transformation.

Parameters

tol : float, default=1e-7 Tolerance for convergence of the Monte-Carlo z-score generator. Only used when method='montecarlo'.

max_samples : int, default=1000000 Maximum number of Monte-Carlo replicates. Only used when method='montecarlo'.

quadratic_extrapolation : bool, default=False Whether to use quadratic extrapolation for values outside the fitted range.

columns : list, optional List of column names to be transformed. If None, all columns will be transformed.

method : {'blom', 'montecarlo'}, default='blom' How to estimate the expected order statistics E[Z_(i:n)] of N(0,1).

- 'blom' (default): closed-form Blom plotting positions Phi^-1((i - 3/8) / (n + 1/4)). Fast, deterministic. The systematic bias at the extreme tails is small (~0.01–0.015 in absolute z, growing slowly with n) and negligible for typical DSI use.
- 'montecarlo': the original iterative estimator. It repeatedly draws n standard normals, sorts them, and averages until the running mean stabilises to tol or max_samples is reached. Convergent to the true expectation but ~10^4–10^5x slower than 'blom'. Useful when extreme-tail accuracy matters or for cross-validation against the closed-form approximation.
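The Blom plotting positions quoted for `method='blom'` can be evaluated with the standard library alone. The sketch below implements only the closed-form formula Phi^-1((i - 3/8) / (n + 1/4)); the function name `blom_scores` is an assumption for illustration, and the transformer's handling of ties, interpolation, and extrapolation is not reproduced.

```python
from statistics import NormalDist


def blom_scores(n):
    """Approximate E[Z_(i:n)] for i = 1..n using Blom plotting positions."""
    std_normal = NormalDist()  # N(0, 1)
    return [
        std_normal.inv_cdf((i - 0.375) / (n + 0.25))
        for i in range(1, n + 1)
    ]


scores = blom_scores(5)
print(scores)
```

Because the positions for ranks i and n + 1 - i sum to one, the resulting scores are symmetric about zero, and for odd n the middle score is zero.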

`fit(X)`

Fit the transformer to the data.

`fit_transform(X)`

Fit and transform in one step.

`inverse_transform(X)`

Inverse transform data back to original space.

Parameters

X : pandas.DataFrame The DataFrame with transformed data to inverse transform.

Returns

pandas.DataFrame The inverse-transformed DataFrame.

`transform(X)`

Transform the data using normal score transformation.

Parameters

X : pandas.DataFrame The DataFrame to transform.

Returns

pandas.DataFrame The transformed DataFrame with normal scores.

`RowWiseMinMaxScaler`

Bases: BaseTransformer

Scale each row of a DataFrame to a specified range.

Parameters

feature_range : tuple (min, max), default=(-1, 1) The range to scale features into.

groups : dict or None, default=None Dict mapping group names to lists of column names to be scaled together (entire timeseries for that group). If None, all columns will be treated as a single group. Example: {'group1': ['col1', 'col2'], 'group2': ['col3', 'col4']}

fit_groups : dict or None, default=None Dict mapping group names to lists of column names (subset of groups) used to compute row-wise min and max. If None, defaults to using the same columns as in groups.
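Row-wise min-max scaling for the simplest case (all columns in a single group) can be sketched as follows. The functions `rowwise_minmax` and `rowwise_minmax_inverse` are hypothetical names for this example, the handling of constant rows is an assumption, and the `groups` and `fit_groups` options are not covered.

```python
import pandas as pd


def rowwise_minmax(df, feature_range=(-1.0, 1.0)):
    """Scale every row of df into feature_range; also return row min/max."""
    lo, hi = feature_range
    row_min = df.min(axis=1)
    row_max = df.max(axis=1)
    # Assumption: constant rows get span 1 to avoid division by zero.
    span = (row_max - row_min).replace(0, 1.0)
    unit = df.sub(row_min, axis=0).div(span, axis=0)  # each row now in [0, 1]
    return unit * (hi - lo) + lo, row_min, row_max


def rowwise_minmax_inverse(scaled, row_min, row_max, feature_range=(-1.0, 1.0)):
    """Map scaled rows back using the stored per-row min and max."""
    lo, hi = feature_range
    unit = (scaled - lo) / (hi - lo)
    return unit.mul(row_max - row_min, axis=0).add(row_min, axis=0)


df = pd.DataFrame({"t1": [1.0, 10.0], "t2": [2.0, 20.0], "t3": [3.0, 40.0]})
scaled, row_min, row_max = rowwise_minmax(df)
restored = rowwise_minmax_inverse(scaled, row_min, row_max)
print(scaled)
```

The inverse needs the per-row minimum and maximum, which is why inverse scaling of predictions depends on per-row parameters being available.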

`fit(X)`

Compute row-wise min and max for each group.

Parameters

X : pandas.DataFrame The DataFrame to fit the scaler on.

Returns

self : object Returns self.

`fit_transform(X)`

Fit and transform in one step.

`inverse_transform(X)`

Inverse transform data back to the original scale.

Parameters

X : pandas.DataFrame The DataFrame to inverse transform.

Returns

pandas.DataFrame The inverse-transformed DataFrame.

`transform(X)`

Scale each row of data to the specified range.

Parameters

X : pandas.DataFrame The DataFrame to transform.

Returns

pandas.DataFrame The transformed DataFrame.

`TransformerPipeline`

Apply a sequence of transformers in order.

`add(transformer, columns=None)`

Add a transformer to the pipeline, optionally for specific columns.

`fit(X)`

Fit all transformers in the pipeline.

`fit_transform(X)`

Fit all transformers and transform data in one operation.

`inverse_transform(X)`

Apply inverse transformations in reverse order.

Parameters

X : pandas.DataFrame The DataFrame to inverse transform.

Returns

pandas.DataFrame The inverse-transformed DataFrame.

`transform(X)`

Transform data using all transformers in the pipeline.

Parameters

X : pandas.DataFrame The DataFrame to transform.

Returns

pandas.DataFrame The transformed DataFrame.
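Why the inverse must run in reverse order can be seen with a toy pipeline. The classes below (`Log10Step`, `ShiftStep`, `MiniPipeline`) are hypothetical stand-ins operating on plain lists of floats; they only mimic the add / transform / inverse_transform pattern described above and are not the `TransformerPipeline` implementation.

```python
import math


class Log10Step:
    """Toy step: log10 forward, power of ten backward."""

    def transform(self, values):
        return [math.log10(v) for v in values]

    def inverse_transform(self, values):
        return [10.0 ** v for v in values]


class ShiftStep:
    """Toy step: add a constant forward, subtract it backward."""

    def __init__(self, offset):
        self.offset = offset

    def transform(self, values):
        return [v + self.offset for v in values]

    def inverse_transform(self, values):
        return [v - self.offset for v in values]


class MiniPipeline:
    """Apply steps in order; undo them in reverse order."""

    def __init__(self):
        self.steps = []

    def add(self, step):
        self.steps.append(step)
        return self  # allow chained add() calls

    def transform(self, values):
        for step in self.steps:
            values = step.transform(values)
        return values

    def inverse_transform(self, values):
        for step in reversed(self.steps):
            values = step.inverse_transform(values)
        return values


pipeline = MiniPipeline().add(Log10Step()).add(ShiftStep(5.0))
original = [1.0, 10.0, 100.0]
forward = pipeline.transform(original)          # log10 first, then shift
recovered = pipeline.inverse_transform(forward)  # unshift first, then 10**x
print(forward, recovered)
```

Undoing the steps in forward order instead would apply 10**x to the shifted values and would not recover the original data.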
