emulators

AutobotsAssemble

Class for transforming features in a DataFrame using a pipeline approach.

apply(transform_type, columns=None, **kwargs)

Apply a transformation to specified columns.

inverse(df=None)

Apply inverse transformations in reverse order.

inverse_on_external_df(df, columns=None)

Apply inverse transformations to an external DataFrame.

Parameters

df : pandas.DataFrame The DataFrame to inverse transform.

columns : list, optional Specific columns to inverse transform. If None, all columns are processed.

Returns

pandas.DataFrame The inverse-transformed DataFrame.

transform(df)

Transform an external DataFrame using the pipeline.

Parameters

df : pandas.DataFrame The DataFrame to transform.

Returns

pandas.DataFrame The transformed DataFrame.

BaseTransformer

Base class for all transformers providing a consistent interface.

fit(X)

Learn parameters from data if needed.

fit_transform(X)

Fit and transform in one step.

inverse_transform(X)

Inverse transform X back to original space.

transform(X)

Apply transformation to X.
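
As a rough illustration of the four-method interface above, a transformer might look like the following. `ShiftTransformer` is a hypothetical stand-in written in plain Python; it is not part of the library, and it works on lists rather than DataFrames to stay self-contained.

```python
class ShiftTransformer:
    """Hypothetical transformer: subtracts the mean learned during fit."""

    def __init__(self):
        self.mean_ = None  # learned in fit()

    def fit(self, X):
        # Learn the single parameter that transform() needs.
        self.mean_ = sum(X) / len(X)
        return self

    def transform(self, X):
        return [x - self.mean_ for x in X]

    def inverse_transform(self, X):
        # Undo transform() using the stored parameter.
        return [x + self.mean_ for x in X]

    def fit_transform(self, X):
        # Fit and transform in one step.
        return self.fit(X).transform(X)


data = [1.0, 2.0, 6.0]
transformer = ShiftTransformer()
centered = transformer.fit_transform(data)
restored = transformer.inverse_transform(centered)
```

Keeping the learned state on the instance is what lets `inverse_transform` recover the original values later.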

DSI

Bases: Emulator

Data Space Inversion (DSI) emulator class. Based on DSI as described in Sun & Durlofsky (2017) and Sun et al. (2017).

__init__(pst=None, data=None, transforms=None, energy_threshold=1.0, rowwise_groups=None, rowwise_fit_groups=None, feature_range=(-1, 1), verbose=False)

Initialize the DSI emulator.

If rowwise_groups is provided, training data are row-wise scaled per-group before SVD. Predictions are returned in scaled space and then inverse-scaled using per-row parameters derived from truth values found in pst.observation_data.

Parameters

pst : Pst, optional A Pst object. If provided, the emulator will be initialized with the information from the Pst object.

data : DataFrame or ObservationEnsemble, optional An ensemble of simulated observations. If provided, the emulator will be initialized with the information from the ensemble.

transforms : list of dict, optional List of transformation specifications. Each dict should have:

- 'type': str - Type of transformation (e.g., 'log10', 'normal_score').
- 'columns': list of str, optional - Columns to apply the transformation to. If not supplied, transformation is applied to all columns.
- Additional kwargs for the transformation (e.g., 'quadratic_extrapolation' for normal score transform).

Example:

    transforms = [
        {'type': 'log10', 'columns': ['obs1', 'obs2']},
        {'type': 'normal_score', 'quadratic_extrapolation': True}
    ]

Default is None, which means no transformations will be applied.

energy_threshold : float, optional The energy threshold for the SVD. Default is 1.0, no truncation.

rowwise_groups : dict, optional Dictionary mapping groups to column lists for row-wise scaling.

rowwise_fit_groups : dict, optional Dictionary mapping groups to column lists for fitting row-wise scalers.

feature_range : tuple, optional Feature range for row-wise scaling. Default is (-1, 1).

verbose : bool, optional If True, enable verbose logging. Default is False.

check_for_pdc()

Check for prior-data conflict (PDC).
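
The reference does not say how a conflict is detected. One common reading of prior-data conflict is that an observed value falls outside the range spanned by the prior ensemble of simulated values; the sketch below assumes that definition. The function `find_prior_data_conflicts` and its dict-based inputs are hypothetical and do not reflect the library's implementation.

```python
def find_prior_data_conflicts(ensemble, observed):
    """Return names whose observed value lies outside the ensemble range.

    ensemble : dict mapping observation name -> list of simulated values
    observed : dict mapping observation name -> observed value
    """
    conflicts = []
    for name, obs_value in observed.items():
        simulated = ensemble[name]
        # Assumed criterion: observed value not bracketed by the ensemble.
        if obs_value < min(simulated) or obs_value > max(simulated):
            conflicts.append(name)
    return conflicts


ensemble = {"head_1": [10.2, 11.0, 9.8], "flux_1": [0.5, 0.7, 0.6]}
observed = {"head_1": 10.5, "flux_1": 1.2}
conflicts = find_prior_data_conflicts(ensemble, observed)
```

Here `flux_1` would be flagged because 1.2 exceeds every simulated value, while `head_1` sits inside its ensemble range.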

compute_projection_matrix(energy_threshold=None)

Compute the projection matrix using SVD.

Parameters

energy_threshold : float, optional Energy threshold for truncation. Default is None, which uses the threshold from initialization.

Returns

None
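
The reference does not define "energy". A common convention, assumed here, is the cumulative fraction of the sum of squared singular values, so that a threshold of 1.0 keeps every component. The helper `n_components_for_energy` is a hypothetical illustration of how a truncation level could be chosen under that assumption; it is not the library's code.

```python
from itertools import accumulate


def n_components_for_energy(singular_values, energy_threshold=1.0):
    """Smallest number of leading components that reaches the threshold.

    Assumes singular_values are sorted in descending order, as returned
    by a standard SVD routine.
    """
    squared = [s * s for s in singular_values]
    cumulative = list(accumulate(squared))
    total = cumulative[-1]
    for k, running in enumerate(cumulative, start=1):
        if running / total >= energy_threshold:
            return k
    return len(singular_values)


singular_values = [3.0, 2.0, 1.0]
keep_all = n_components_for_energy(singular_values, 1.0)
keep_some = n_components_for_energy(singular_values, 0.9)
```

With squared values 9, 4 and 1, the first two components carry 13/14 of the total, so a threshold of 0.9 retains two components and a threshold of 1.0 retains all three.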

fit()

Fit the emulator to training data.

Parameters

self : DSI The DSI emulator instance.

Returns

self : DSI The fitted emulator.

load(filename) classmethod

Load a fitted emulator from a file.

Parameters

filename : str Path to the saved emulator file.

Returns

Emulator The loaded emulator instance.

predict(pvals, pst=None)

Generate predictions from the emulator.

Parameters

pvals : numpy.ndarray or pandas.Series Parameter values for prediction.

pst : Pst, optional If provided (or if self.observation_data exists), used to obtain truth values for inverse row-wise scaling (if enabled).

Returns

pandas.Series Predicted observation values.

prepare_dsivc(decvar_names, t_d=None, pst=None, oe=None, track_stack=False, dsi_args=None, percentiles=[0.25, 0.75, 0.5], mou_population_size=None, ies_exe_path='pestpp-ies')

Prepare Data Space Inversion Variable Control (DSIVC) control files.

Parameters

decvar_names : list or str Names of decision variables.

t_d : str, optional Template directory path.

pst : Pst, optional PST control file object.

oe : ObservationEnsemble, optional Observation ensemble.

track_stack : bool, optional Whether to track the stack. Default is False.

dsi_args : dict, optional Arguments for DSI.

percentiles : list, optional Percentiles to calculate. Default is [0.25, 0.75, 0.5].

mou_population_size : int, optional Population size for multi-objective optimization.

ies_exe_path : str, optional Path to the PEST++ IES executable. Default is "pestpp-ies".

Returns

Pst PEST++ control file object for DSIVC.

prepare_pestpp(t_d, observation_data=None, use_runstor=False, pst=None, verbose=False)

Prepare the PEST++ interface for DSI. Overrides the base method to handle DSI-specific arguments such as use_runstor.

save(filename)

Save the fitted emulator to a file.

Parameters

filename : str Path to save the emulator.

Emulator

Base class for emulators.

This class defines the common interface for all emulator implementations and provides shared functionality used by multiple emulator types.

__init__(transforms=None, verbose=True)

Initialize the Emulator base class.

Parameters

transforms : list of dict, optional List of transformation specifications. Each dict should have:

- 'type': str - Type of transformation (e.g., 'log10', 'normal_score').
- 'columns': list of str, optional - Columns to apply the transformation to. If not supplied, transformation is applied to all columns.
- Additional kwargs for the transformation (e.g., 'quadratic_extrapolation' for normal score transform).

Example:

    transforms = [
        {'type': 'log10', 'columns': ['obs1', 'obs2']},
        {'type': 'normal_score', 'quadratic_extrapolation': True}
    ]

Default is None, which means no transformations will be applied.

verbose : bool, optional If True, enable verbose logging. Default is True.

fit(X, y=None)

Fit the emulator to training data.

Parameters

X : pandas.DataFrame Input features for training.

y : pandas.DataFrame or None, optional Target values for training if separate from X.

Returns

self : Emulator Returns self for method chaining.
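
Returning self from fit is what allows a call such as `emulator.fit(X).predict(...)`. The sketch below shows the pattern with a hypothetical `MeanEmulator` that does not inherit from the library's Emulator class and uses dicts of lists in place of DataFrames.

```python
class MeanEmulator:
    """Hypothetical emulator that predicts each column's training mean."""

    def __init__(self):
        self.means_ = None  # set by fit()

    def fit(self, X, y=None):
        # X: dict mapping column name -> list of training values.
        self.means_ = {name: sum(vals) / len(vals) for name, vals in X.items()}
        return self  # returning self enables method chaining

    def predict(self, X):
        # One prediction per input row, for every fitted column.
        n_rows = len(next(iter(X.values())))
        return {name: [mean] * n_rows for name, mean in self.means_.items()}


training = {"obs1": [1.0, 3.0], "obs2": [10.0, 30.0]}
new_rows = {"obs1": [0.0], "obs2": [0.0]}
predictions = MeanEmulator().fit(training).predict(new_rows)
```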

load(filename) classmethod

Load a fitted emulator from a file.

Parameters

filename : str Path to the saved emulator file.

Returns

Emulator The loaded emulator instance.

predict(X)

Generate predictions using the fitted emulator.

Parameters

X : pandas.DataFrame Input data to generate predictions for.

Returns

pandas.DataFrame or pandas.Series Predictions for the input data.

prepare_pestpp(t_d, pst=None, verbose=False, **kwargs)

Generic method to prepare a PEST++ interface for the emulator.

This method automates the creation of template files, instruction files, control files, and the forward run script needed to run the emulator within a PEST++ workflow (e.g. IES).

Parameters

t_d : str Path to the template directory where files will be written.

pst : Pst, optional A Pst object representing the original control file. Useful for scraping constraint weights, observation lists, etc. Subclasses may use this to determine specific parameters or observations.

verbose : bool Enable verbose logging.

Returns

Pst The generated Pst object for the emulator.

save(filename)

Save the fitted emulator to a file.

Parameters

filename : str Path to save the emulator.

Log10Transformer

Bases: BaseTransformer

Apply log10 transformation.

Parameters

columns : list, optional List of column names to be transformed. If None, all columns will be transformed.

fit(X)

Learn parameters from data if needed.

fit_transform(X)

Fit and transform in one step.
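
The forward and inverse operations behind a log10 transform can be sketched in a few lines. The functions `log10_transform` and `log10_inverse` below are hypothetical plain-Python helpers, not the library's methods. They assume strictly positive inputs; how the library treats zero or negative values is not stated in this reference.

```python
import math


def log10_transform(values):
    # log10 is only defined for strictly positive values.
    return [math.log10(v) for v in values]


def log10_inverse(values):
    # Raise 10 to each transformed value to return to the original space.
    return [10.0 ** v for v in values]


original = [1.0, 10.0, 250.0]
transformed = log10_transform(original)
restored = log10_inverse(transformed)
```

The round trip recovers the original values up to floating-point rounding.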

NormalScoreTransformer

Bases: BaseTransformer

A transformer for normal score transformation.

Parameters

tol : float, default=1e-7 Tolerance for convergence in random generation.

max_samples : int, default=1000000 Maximum number of samples for random generation.

quadratic_extrapolation : bool, default=False Whether to use quadratic extrapolation for values outside the fitted range.

columns : list, optional List of column names to be transformed. If None, all columns will be transformed.

fit(X)

Fit the transformer to the data.

fit_transform(X)

Fit and transform in one step.

inverse_transform(X)

Inverse transform data back to original space.

Parameters

X : pandas.DataFrame The DataFrame with transformed data to inverse transform.

Returns

pandas.DataFrame The inverse-transformed DataFrame.

transform(X)

Transform the data using normal score transformation.

Parameters

X : pandas.DataFrame The DataFrame to transform.

Returns

pandas.DataFrame The transformed DataFrame with normal scores.
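
For readers unfamiliar with normal score transformation, the sketch below shows one simple rank-based variant for a single column. It is an assumption-laden illustration, not the library's algorithm: it uses fixed plotting-position quantiles rather than the random generation controlled by tol and max_samples, and it clamps values outside the fitted range instead of extrapolating. All function names are hypothetical.

```python
from statistics import NormalDist


def fit_normal_scores(values):
    """Return (sorted values, matching standard-normal scores)."""
    ordered = sorted(values)
    n = len(ordered)
    std_normal = NormalDist()
    # Quantiles (i + 0.5) / n stay strictly inside (0, 1), so scores are finite.
    scores = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return ordered, scores


def interpolate(x, xs, ys):
    """Piecewise-linear interpolation, clamped to the fitted range."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    for i in range(1, len(xs)):
        if x <= xs[i]:
            frac = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
            return ys[i - 1] + frac * (ys[i] - ys[i - 1])


sample = [1.0, 10.0, 100.0, 1000.0, 5.0]
ordered, scores = fit_normal_scores(sample)

score_of_median = interpolate(10.0, ordered, scores)   # forward transform
value_back = interpolate(score_of_median, scores, ordered)  # inverse transform
```

The forward transform looks a value up in the sorted sample and returns the matching normal score; the inverse swaps the two lookup tables.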

RowWiseMinMaxScaler

Bases: BaseTransformer

Scale each row of a DataFrame to a specified range.

Parameters

feature_range : tuple (min, max), default=(-1, 1) The range to scale features into.

groups : dict or None, default=None Dict mapping group names to lists of column names to be scaled together (entire timeseries for that group). If None, all columns will be treated as a single group.

Example:

    {'group1': ['col1', 'col2'], 'group2': ['col3', 'col4']}

fit_groups : dict or None, default=None Dict mapping group names to lists of column names (subset of groups) used to compute row-wise min and max. If None, defaults to using the same columns as in groups.

fit(X)

Compute row-wise min and max for each group.

Parameters

X : pandas.DataFrame The DataFrame to fit the scaler on.

Returns

self : object Returns self.

fit_transform(X)

Fit and transform in one step.

inverse_transform(X)

Inverse transform data back to the original scale.

Parameters

X : pandas.DataFrame The DataFrame to inverse transform.

Returns

pandas.DataFrame The inverse-transformed DataFrame.

transform(X)

Scale each row of data to the specified range.

Parameters

X : pandas.DataFrame The DataFrame to transform.

Returns

pandas.DataFrame The transformed DataFrame.
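
The arithmetic of row-wise min-max scaling can be shown for a single row and a single group. The sketch below is a hypothetical plain-Python illustration, not the library's implementation: `scale_row` and `unscale_row` are invented names, the row is a dict rather than a DataFrame row, and `fit_columns` plays the role described for fit_groups. It assumes the row is not constant, since a zero range would cause a division by zero.

```python
def scale_row(row, columns, feature_range=(-1.0, 1.0), fit_columns=None):
    """Scale `columns` of one row; return (scaled values, (row_min, row_max))."""
    lo, hi = feature_range
    fit_columns = fit_columns or columns
    # Row-wise min and max come from the fitting columns only.
    row_min = min(row[c] for c in fit_columns)
    row_max = max(row[c] for c in fit_columns)
    span = row_max - row_min  # assumed non-zero
    scaled = {c: lo + (row[c] - row_min) * (hi - lo) / span for c in columns}
    return scaled, (row_min, row_max)


def unscale_row(scaled, params, feature_range=(-1.0, 1.0)):
    """Invert scale_row using the stored per-row min and max."""
    lo, hi = feature_range
    row_min, row_max = params
    return {
        c: row_min + (v - lo) * (row_max - row_min) / (hi - lo)
        for c, v in scaled.items()
    }


row = {"t1": 0.0, "t2": 5.0, "t3": 10.0}
scaled, params = scale_row(row, ["t1", "t2", "t3"])
restored = unscale_row(scaled, params)
```

Because every row has its own minimum and maximum, the inverse needs those per-row parameters to be stored or supplied.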

TransformerPipeline

Apply a sequence of transformers in order.

add(transformer, columns=None)

Add a transformer to the pipeline, optionally for specific columns.

fit(X)

Fit all transformers in the pipeline.

fit_transform(X)

Fit all transformers and transform data in one operation.

inverse_transform(X)

Apply inverse transformations in reverse order.

Parameters

X : pandas.DataFrame The DataFrame to inverse transform.

Returns

pandas.DataFrame The inverse-transformed DataFrame.
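
The reverse ordering matters because each step's inverse expects the output of that step, not of an earlier one. The sketch below illustrates this with two hypothetical steps expressed as (forward, inverse) function pairs on a single number; it does not use the library's classes.

```python
import math

# Each step is a (forward, inverse) pair; both steps are illustrative only.
steps = [
    (lambda x: math.log10(x), lambda y: 10.0 ** y),        # log10 step
    (lambda x: (x - 1.0) / 2.0, lambda y: y * 2.0 + 1.0),  # affine step
]


def pipeline_transform(value, steps):
    # Apply forward functions in the order they were added.
    for forward, _ in steps:
        value = forward(value)
    return value


def pipeline_inverse(value, steps):
    # Undo the last step first, then work back to the first step.
    for _, inverse in reversed(steps):
        value = inverse(value)
    return value


x = 1000.0
y = pipeline_transform(x, steps)
x_back = pipeline_inverse(y, steps)
```

Applying the inverses in forward order instead would exponentiate before undoing the affine step and would not recover the input.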

transform(X)

Transform data using all transformers in the pipeline.

Parameters

X : pandas.DataFrame The DataFrame to transform.

Returns

pandas.DataFrame The transformed DataFrame.