lpfa

Learning-based pattern-data-driven forecast approach (LPFA) emulator implementation.

LPFA

Bases: Emulator

Class for the Learning-based pattern-data-driven forecast approach from Kim et al. (2025).

This emulator uses neural networks to learn the relationships between inputs and forecast outputs, with dimensionality reduction via PCA.
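The role of an energy threshold in the PCA step can be illustrated with a minimal sketch. It assumes that "energy" means the cumulative fraction of squared singular values retained, which this page does not state explicitly. The helper name `n_components_for_energy` and the example singular values are hypothetical and are not part of the LPFA API.

```python
def n_components_for_energy(singular_values, energy_threshold=1.0):
    """Return the smallest number of leading components whose cumulative
    energy (sum of squared singular values) reaches the threshold.

    Assumption: singular_values are sorted in descending order.
    """
    energies = [s * s for s in singular_values]
    total = sum(energies)
    if total == 0.0:
        return 0
    cumulative = 0.0
    for k, energy in enumerate(energies, start=1):
        cumulative += energy
        # Small tolerance guards against floating-point round-off near 1.0.
        if cumulative / total >= energy_threshold - 1e-12:
            return k
    return len(energies)


# Hypothetical singular values from a decomposition of the training data.
singular_values = [3.0, 2.0, 1.0, 0.5]
k_90 = n_components_for_energy(singular_values, 0.9)
k_all = n_components_for_energy(singular_values, 1.0)
print(k_90, k_all)
```

Under this reading, a threshold of 1.0 effectively keeps every component that carries energy, while lower values truncate the trailing components.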

Parameters

data : pandas.DataFrame The training data with input and forecast columns.

input_names : list List of column names to use as inputs.

groups : dict Dictionary mapping group names to lists of column names. Used for row-wise min-max scaling.

fit_groups : dict Dictionary mapping group names to lists of column names used to fit the scaling.

output_names : list, optional List of column names to forecast. If None, all columns not in input_names are used.

energy_threshold : float, optional Energy threshold for the PCA. Default is 1.0.

seed : int, optional Random seed for reproducibility. Default is None.

early_stop : bool, optional Whether to use early stopping during training. Default is True.

transforms : list of dict, optional List of transformation specifications. Default is None, which means no transformations will be applied. Each dict should have:

- 'type': str - Type of transformation (e.g., 'log10', 'normal_score').
- 'columns': list of str, optional - Columns to apply the transformation to. If not supplied, the transformation is applied to all columns.
- Additional kwargs for the transformation (e.g., 'quadratic_extrapolation' for the normal score transform).

Example:

transforms = [
    {'type': 'log10', 'columns': ['obs1', 'obs2']},
    {'type': 'normal_score', 'quadratic_extrapolation': True}
]

test_size : float, optional Fraction of data to use for testing. Default is 0.2.

verbose : bool, optional If True, enable verbose logging. Default is True.
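The relationship between `groups` and `fit_groups` can be sketched as follows. This is one plausible reading, not the library's implementation: for each row, the minimum and maximum are taken over the `fit_groups` columns and then applied to every column in the matching `groups` entry. The function `rowwise_minmax_scale` and the column names are hypothetical.

```python
def rowwise_minmax_scale(row, groups, fit_groups):
    """Scale one row (a dict of column name -> value), group by group.

    Assumption: the min and max come from this row's fit_groups columns
    and are applied to all columns listed under the same name in groups.
    """
    scaled = dict(row)
    for name, columns in groups.items():
        fit_values = [row[c] for c in fit_groups[name]]
        lo, hi = min(fit_values), max(fit_values)
        span = hi - lo
        for c in columns:
            # A zero span (constant fit columns) maps everything to 0.0.
            scaled[c] = (row[c] - lo) / span if span else 0.0
    return scaled


# Hypothetical row: two historical heads fit the scaling, one forecast head reuses it.
row = {"h_t1": 10.0, "h_t2": 12.0, "h_t3": 15.0}
groups = {"heads": ["h_t1", "h_t2", "h_t3"]}
fit_groups = {"heads": ["h_t1", "h_t2"]}
scaled = rowwise_minmax_scale(row, groups, fit_groups)
print(scaled)
```

In this sketch, columns outside the fitting set can fall outside the [0, 1] range, since they did not contribute to the row's minimum and maximum.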

__init__(data, input_names, groups, fit_groups, output_names=None, energy_threshold=1.0, seed=None, early_stop=True, transforms=None, test_size=0.2, verbose=True)

Initialize the Learning-based pattern-data-driven NN emulator.

Parameters

data : pandas.DataFrame The training data with input and forecast columns.

input_names : list List of column names to use as inputs.

groups : dict Dictionary mapping group names to lists of column names. Used for row-wise min-max scaling.

fit_groups : dict Dictionary mapping group names to lists of column names used to fit the scaling.

output_names : list, optional List of column names to forecast. If None, all columns in data will be used.

energy_threshold : float, optional Energy threshold for the PCA. Default is 1.0.

seed : int, optional Random seed for reproducibility. Default is None.

early_stop : bool, optional Whether to use early stopping during training. Default is True.

transforms : list of dict, optional List of transformation specifications. Default is None, which means no transformations will be applied. Each dict should have:

- 'type': str - Type of transformation (e.g., 'log10', 'normal_score').
- 'columns': list of str, optional - Columns to apply the transformation to. If not supplied, the transformation is applied to all columns.
- Additional kwargs for the transformation (e.g., 'quadratic_extrapolation' for the normal score transform).

Example:

transforms = [
    {'type': 'log10', 'columns': ['obs1', 'obs2']},
    {'type': 'normal_score', 'quadratic_extrapolation': True}
]

test_size : float, optional Fraction of data to use for testing. Default is 0.2.

verbose : bool, optional If True, enable verbose logging. Default is True.
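How a `transforms` specification might be interpreted can be shown with a small sketch. It handles only the 'log10' type and represents the data as a plain dict of lists rather than a DataFrame. The function `apply_transforms` is hypothetical and does not reflect how LPFA applies transformations internally.

```python
import math


def apply_transforms(table, transforms):
    """Apply a list of transform specs to a dict of column name -> values.

    Assumption: only 'log10' is supported in this sketch; a spec without
    'columns' is applied to every column, as the parameter text describes.
    """
    out = {col: list(vals) for col, vals in table.items()}
    for spec in transforms or []:
        columns = spec.get("columns", list(out))
        if spec["type"] == "log10":
            for col in columns:
                out[col] = [math.log10(v) for v in out[col]]
        else:
            raise ValueError(f"unsupported transform type in this sketch: {spec['type']}")
    return out


# Hypothetical observation columns; obs3 is left untouched by the spec.
table = {
    "obs1": [1.0, 10.0, 100.0],
    "obs2": [1000.0, 10.0, 1.0],
    "obs3": [5.0, 6.0, 7.0],
}
transforms = [{"type": "log10", "columns": ["obs1", "obs2"]}]
result = apply_transforms(table, transforms)
print(result)
```

Passing `None` returns an unmodified copy, which mirrors the documented default of applying no transformations.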

add_noise_model(params=None)

Add a noise model to capture residuals.

Parameters

params : dict, optional Dictionary of model parameters for the noise model. Default is None.

Returns

self : LPFA The emulator instance with noise model added.

create_model(params=None)

Create and store the main model.

Parameters

params : dict, optional Dictionary of model parameters. Default is None.

Returns

self : LPFA The emulator instance with model created.

fit(epochs=200)

Fit the model to the training data.

Parameters

epochs : int, optional Number of training epochs. Default is 200.

Returns

self : LPFA The fitted emulator.
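The interaction between `epochs` and the `early_stop` option can be illustrated with a generic patience-based rule. This sketch is an assumption about how early stopping commonly works; the criterion LPFA actually uses is not documented here. The names `epochs_run`, `patience`, and `min_delta` are hypothetical.

```python
def epochs_run(val_losses, patience=10, min_delta=0.0):
    """Return the number of epochs executed before training stops.

    Training halts once the validation loss has failed to improve by more
    than min_delta for `patience` consecutive epochs; otherwise it runs
    through every supplied epoch.
    """
    best = float("inf")
    stale = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            best = loss
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return len(val_losses)


# Hypothetical validation losses, one per epoch.
losses = [1.0, 0.8, 0.7, 0.71, 0.72, 0.73, 0.6]
stopped_early = epochs_run(losses, patience=3)
ran_all = epochs_run(losses, patience=10)
print(stopped_early, ran_all)
```

With a rule of this kind, the `epochs` argument acts as an upper bound on training length rather than a fixed count.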

load(filename) classmethod

Load a fitted emulator from a file.

Parameters

filename : str Path to the saved emulator file.

Returns

Emulator The loaded emulator instance.

predict(data)

Generate predictions for new data.

Parameters

data : pandas.DataFrame New data to generate predictions for.

Returns

pandas.DataFrame Predictions for the input data.

prepare_pestpp(t_d, pst=None, verbose=False, **kwargs)

Generic method to prepare a PEST++ interface for the emulator.

This method automates the creation of template files, instruction files, control files, and the forward run script needed to run the emulator within a PEST++ workflow (e.g. IES).

Parameters

t_d : str Path to the template directory where files will be written.

pst : Pst, optional A Pst object representing the original control file. Useful for scraping constraint weights, observation lists, etc. Subclasses may use this to determine specific parameters or observations.

verbose : bool, optional If True, enable verbose logging. Default is False.

Returns

Pst The generated Pst object for the emulator.

save(filename)

Save the fitted emulator to a file.

Parameters

filename : str Path to save the emulator.

LPFAModel

Scikit-learn MLPRegressor wrapper for the LPFA neural network model.

loss_curve_ property

Get the training loss curve.

fit(X, y)

Fit the model.

predict(X)

Make predictions.