transformers
Transformer classes for data transformations in emulators.
AutobotsAssemble
Class for transforming features in a DataFrame using a pipeline approach.
apply(transform_type, columns=None, **kwargs)
Apply a transformation to specified columns.
inverse(df=None)
Apply inverse transformations in reverse order.
inverse_on_external_df(df, columns=None)
Apply inverse transformations to an external DataFrame.
Parameters
df : pandas.DataFrame
The DataFrame to inverse transform.
columns : list, optional
Specific columns to inverse transform. If None, all columns are processed.
Returns
pandas.DataFrame
The inverse-transformed DataFrame.
transform(df)
Transform an external DataFrame using the pipeline.
Parameters
df : pandas.DataFrame
The DataFrame to transform.
Returns
pandas.DataFrame
The transformed DataFrame.
BaseTransformer
Base class for all transformers providing a consistent interface.
fit(X)
Learn parameters from data if needed.
fit_transform(X)
Fit and transform in one step.
inverse_transform(X)
Inverse transform X back to original space.
transform(X)
Apply transformation to X.
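The four-method interface above can be sketched as follows. This is a minimal illustration, not the library's code: SketchBaseTransformer and ShiftTransformer are hypothetical names, and the assumption that fit returns self (so that fit_transform can chain the two calls) is borrowed from the MinMaxScaler.fit description further down.

```python
import pandas as pd


class SketchBaseTransformer:
    """Hypothetical stand-in for BaseTransformer (illustration only)."""

    def fit(self, X):
        # Stateless transformers have nothing to learn; return self for chaining.
        return self

    def transform(self, X):
        raise NotImplementedError

    def inverse_transform(self, X):
        raise NotImplementedError

    def fit_transform(self, X):
        # Fit and transform in one step.
        return self.fit(X).transform(X)


class ShiftTransformer(SketchBaseTransformer):
    """Toy subclass: subtracts the column means learned in fit."""

    def fit(self, X):
        self.means_ = X.mean()  # one mean per column
        return self

    def transform(self, X):
        return X - self.means_

    def inverse_transform(self, X):
        return X + self.means_


df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [10.0, 20.0, 30.0]})
shift = ShiftTransformer()
centered = shift.fit_transform(df)
restored = shift.inverse_transform(centered)
```

Subclasses only override what they need; a transformer with no learned state can inherit fit unchanged.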
GenericTransformer
Bases: BaseTransformer
Wrapper for generic sklearn-compatible transformers.
Parameters
transformer_class : class
The class of the transformer to be used (e.g. sklearn.preprocessing.QuantileTransformer).
kwargs : dict
Arguments to be passed to the transformer constructor.
fit_transform(X)
Fit and transform in one step.
Log10Transformer
Bases: BaseTransformer
Apply log10 transformation.
Parameters
columns : list, optional
List of column names to be transformed. If None, all columns will be transformed.
fit(X)
Learn parameters from data if needed.
fit_transform(X)
Fit and transform in one step.
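A log10 transform restricted to selected columns, together with its inverse, might look like the sketch below. The helper names log10_forward and log10_inverse are hypothetical and written with plain pandas and NumPy; how the library itself handles zero or negative values is not stated here, so the example uses strictly positive data.

```python
import numpy as np
import pandas as pd


def log10_forward(df, columns=None):
    """Return a copy of df with log10 applied to the selected columns."""
    cols = list(df.columns) if columns is None else list(columns)
    out = df.copy()
    out[cols] = np.log10(df[cols])
    return out


def log10_inverse(df, columns=None):
    """Undo log10_forward by raising 10 to the power of each value."""
    cols = list(df.columns) if columns is None else list(columns)
    out = df.copy()
    out[cols] = 10.0 ** df[cols]
    return out


df = pd.DataFrame({"k": [1.0, 10.0, 100.0], "porosity": [0.1, 0.2, 0.3]})
logged = log10_forward(df, columns=["k"])       # only "k" is transformed
restored = log10_inverse(logged, columns=["k"])
```

The same columns argument has to be passed to the inverse, otherwise untouched columns would be exponentiated as well.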
MinMaxScaler
Bases: BaseTransformer
Scale each column of a DataFrame to a specified range.
Parameters
feature_range : tuple (min, max), default=(-1, 1)
The range to scale features into.
columns : list, optional
List of column names to be scaled. If None, all columns will be scaled.
skip_constant : bool, default=True
If True, columns with constant values will be skipped.
fit(X)
Learn min and max values for scaling.
Parameters
X : pandas.DataFrame
The DataFrame to fit the scaler on.
Returns
self : object
Returns self.
fit_transform(X)
Fit and transform in one step.
inverse_transform(X)
Undo the scaling of X according to feature_range.
Parameters
X : pandas.DataFrame
The DataFrame to inverse transform.
Returns
pandas.DataFrame
The inverse-transformed DataFrame.
transform(X)
Scale features according to feature_range.
Parameters
X : pandas.DataFrame
The DataFrame to transform.
Returns
pandas.DataFrame
The transformed DataFrame.
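Column-wise min-max scaling to feature_range = (a, b) is conventionally x' = (x - min) / (max - min) * (b - a) + a, and the inverse solves that expression for x. The sketch below implements this with hypothetical helper functions (minmax_fit, minmax_transform, minmax_inverse). Treating a "skipped" constant column as one that passes through unchanged is an assumption; the description above does not spell out what skipping does.

```python
import pandas as pd


def minmax_fit(df, columns=None, skip_constant=True):
    """Learn (min, max) per column; constant columns are left out if requested."""
    cols = list(df.columns) if columns is None else list(columns)
    params = {}
    for col in cols:
        lo, hi = float(df[col].min()), float(df[col].max())
        if skip_constant and hi == lo:
            continue  # assumption: a skipped column passes through unchanged
        params[col] = (lo, hi)
    return params


def minmax_transform(df, params, feature_range=(-1, 1)):
    a, b = feature_range
    out = df.copy()
    for col, (lo, hi) in params.items():
        out[col] = (df[col] - lo) / (hi - lo) * (b - a) + a
    return out


def minmax_inverse(df, params, feature_range=(-1, 1)):
    a, b = feature_range
    out = df.copy()
    for col, (lo, hi) in params.items():
        out[col] = (df[col] - a) / (b - a) * (hi - lo) + lo
    return out


df = pd.DataFrame({"x": [0.0, 5.0, 10.0], "c": [3.0, 3.0, 3.0]})
params = minmax_fit(df)                 # "c" is constant, so it is skipped
scaled = minmax_transform(df, params)
restored = minmax_inverse(scaled, params)
```

Skipping constant columns avoids the division by zero that max - min = 0 would otherwise cause.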
NormalScoreTransformer
Bases: BaseTransformer
A transformer for normal score transformation.
Parameters
tol : float, default=1e-7
Tolerance for convergence of the Monte-Carlo z-score generator.
Only used when method='montecarlo'.
max_samples : int, default=1000000
Maximum number of Monte-Carlo replicates. Only used when
method='montecarlo'.
quadratic_extrapolation : bool, default=False
Whether to use quadratic extrapolation for values outside the fitted range.
columns : list, optional
List of column names to be transformed. If None, all columns will be transformed.
method : {'blom', 'montecarlo'}, default='blom'
How to estimate the expected order statistics E[Z_(i:n)] of N(0,1).
- 'blom' (default): closed-form Blom plotting positions
Phi^-1((i - 3/8) / (n + 1/4)). Fast, deterministic. The
systematic bias at the extreme tails is small (~0.01–0.015 in
absolute z, growing slowly with n) and negligible for typical
DSI use.
- 'montecarlo': the original iterative estimator. It repeatedly draws
n standard normals, sorts them, and averages the sorted draws until
the running mean stabilises to tol or max_samples is reached. It
converges to the true expectation but is ~10^4–10^5x slower than
'blom'. Useful when extreme-tail accuracy matters or for
cross-validation against the closed-form approximation.
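The two estimators of E[Z_(i:n)] can be compared with a standard-library sketch. The function names blom_scores and montecarlo_scores are hypothetical, and the Monte-Carlo version is simplified: it averages a fixed number of replicates instead of applying the tol / max_samples stopping rule described above.

```python
import random
from statistics import NormalDist


def blom_scores(n):
    """Blom plotting positions: Phi^-1((i - 3/8) / (n + 1/4)) for i = 1..n."""
    std_normal = NormalDist()
    return [std_normal.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]


def montecarlo_scores(n, replicates=20000, seed=0):
    """Average sorted standard-normal draws over a fixed number of replicates.

    Simplification: a fixed replicate count stands in for the tol /
    max_samples convergence check.
    """
    rng = random.Random(seed)
    totals = [0.0] * n
    for _ in range(replicates):
        draws = sorted(rng.gauss(0.0, 1.0) for _ in range(n))
        for i, z in enumerate(draws):
            totals[i] += z
    return [t / replicates for t in totals]


n = 5
blom = blom_scores(n)
mc = montecarlo_scores(n)
# Largest absolute difference between the two sets of scores.
max_gap = max(abs(b - m) for b, m in zip(blom, mc))
```

The Blom scores are exactly antisymmetric around the median rank, whereas the Monte-Carlo scores carry sampling noise that shrinks as the number of replicates grows.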
fit(X)
Fit the transformer to the data.
fit_transform(X)
Fit and transform in one step.
inverse_transform(X)
Inverse transform data back to original space.
Parameters
X : pandas.DataFrame
The DataFrame with transformed data to inverse transform.
Returns
pandas.DataFrame
The inverse-transformed DataFrame.
transform(X)
Transform the data using normal score transformation.
Parameters
X : pandas.DataFrame
The DataFrame to transform.
Returns
pandas.DataFrame
The transformed DataFrame with normal scores.
RowWiseMinMaxScaler
Bases: BaseTransformer
Scale each row of a DataFrame to a specified range.
Parameters
feature_range : tuple (min, max), default=(-1, 1)
The range to scale features into.
groups : dict or None, default=None
Dict mapping group names to lists of column names to be scaled together (entire timeseries for that group). If None, all columns will be treated as a single group.
Example: {'group1': ['col1', 'col2'], 'group2': ['col3', 'col4']}
fit_groups : dict or None, default=None
Dict mapping group names to lists of column names (subset of groups) used to compute row-wise min and max. If None, defaults to using the same columns as in groups.
fit(X)
Compute row-wise min and max for each group.
Parameters
X : pandas.DataFrame
The DataFrame to fit the scaler on.
Returns
self : object
Returns self.
fit_transform(X)
Fit and transform in one step.
inverse_transform(X)
Inverse transform data back to the original scale.
Parameters
X : pandas.DataFrame
The DataFrame to inverse transform.
Returns
pandas.DataFrame
The inverse-transformed DataFrame.
transform(X)
Scale each row of data to the specified range.
Parameters
X : pandas.DataFrame
The DataFrame to transform.
Returns
pandas.DataFrame
The transformed DataFrame.
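Row-wise scaling by group can be sketched with plain pandas as below. The helpers rowwise_fit, rowwise_transform and rowwise_inverse are hypothetical, and the column names are made up for illustration. The sketch assumes that the per-row min and max computed during fitting are stored and reused for the inverse, which means the inverse only applies to rows with the same index as the fitted data.

```python
import pandas as pd


def rowwise_fit(df, groups=None, fit_groups=None):
    """Store (columns, row-wise min, row-wise max) for each group."""
    groups = {"all": list(df.columns)} if groups is None else groups
    fit_groups = groups if fit_groups is None else fit_groups
    params = {}
    for name, cols in groups.items():
        ref = df[fit_groups.get(name, cols)]  # columns used for min/max
        params[name] = (cols, ref.min(axis=1), ref.max(axis=1))
    return params


def rowwise_transform(df, params, feature_range=(-1, 1)):
    a, b = feature_range
    out = df.copy()
    for cols, row_min, row_max in params.values():
        span = row_max - row_min
        out[cols] = df[cols].sub(row_min, axis=0).div(span, axis=0) * (b - a) + a
    return out


def rowwise_inverse(df, params, feature_range=(-1, 1)):
    a, b = feature_range
    out = df.copy()
    for cols, row_min, row_max in params.values():
        span = row_max - row_min
        out[cols] = ((df[cols] - a) / (b - a)).mul(span, axis=0).add(row_min, axis=0)
    return out


df = pd.DataFrame({
    "h1": [0.0, 2.0], "h2": [5.0, 4.0], "h3": [10.0, 6.0],
    "q1": [1.0, 10.0], "q2": [3.0, 30.0],
})
groups = {"heads": ["h1", "h2", "h3"], "flows": ["q1", "q2"]}
params = rowwise_fit(df, groups=groups)
scaled = rowwise_transform(df, params)
restored = rowwise_inverse(scaled, params)
```

Each row of each group is scaled with its own min and max, so two rows with very different magnitudes both end up spanning the full feature_range.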
StandardScalerTransformer
Bases: BaseTransformer
Wrapper around sklearn's StandardScaler for DataFrame compatibility.
Parameters
with_mean : bool, default=True
If True, center the data before scaling.
with_std : bool, default=True
If True, scale the data to unit variance.
copy : bool, default=True
If True, a copy of X will be created. If False, centering and scaling happen in-place.
columns : list, optional
List of column names to be transformed. If None, all columns will be transformed.
fit_transform(X)
Fit and transform in one step.
TransformerPipeline
Apply a sequence of transformers in order.
add(transformer, columns=None)
Add a transformer to the pipeline, optionally for specific columns.
fit(X)
Fit all transformers in the pipeline.
fit_transform(X)
Fit all transformers and transform data in one operation.
inverse_transform(X)
Apply inverse transformations in reverse order.
Parameters
X : pandas.DataFrame
The DataFrame to inverse transform.
Returns
pandas.DataFrame
The inverse-transformed DataFrame.
transform(X)
Transform data using all transformers in the pipeline.
Parameters
X : pandas.DataFrame
The DataFrame to transform.
Returns
pandas.DataFrame
The transformed DataFrame.
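The pipeline pattern (apply transformers in order, undo them in reverse order) can be illustrated with the sketch below. SketchPipeline, Log10Step and CenterStep are hypothetical classes written for this example, not the library's implementation. The sketch assumes that each step is fitted on the output of the previous step.

```python
import numpy as np
import pandas as pd


class Log10Step:
    def fit(self, X):
        return self

    def transform(self, X):
        return np.log10(X)

    def inverse_transform(self, X):
        return 10.0 ** X


class CenterStep:
    def fit(self, X):
        self.means_ = X.mean()
        return self

    def transform(self, X):
        return X - self.means_

    def inverse_transform(self, X):
        return X + self.means_


class SketchPipeline:
    def __init__(self):
        self.steps = []  # list of (transformer, columns) pairs

    def add(self, transformer, columns=None):
        self.steps.append((transformer, columns))
        return self

    def fit_transform(self, X):
        out = X.copy()
        for transformer, columns in self.steps:
            cols = list(out.columns) if columns is None else columns
            # Fit each step on the data as transformed by the earlier steps.
            out[cols] = transformer.fit(out[cols]).transform(out[cols])
        return out

    def inverse_transform(self, X):
        out = X.copy()
        # Undo the last transformation first.
        for transformer, columns in reversed(self.steps):
            cols = list(out.columns) if columns is None else columns
            out[cols] = transformer.inverse_transform(out[cols])
        return out


df = pd.DataFrame({"k": [1.0, 10.0, 100.0], "h": [2.0, 4.0, 6.0]})
pipe = SketchPipeline().add(Log10Step(), columns=["k"]).add(CenterStep())
transformed = pipe.fit_transform(df)
restored = pipe.inverse_transform(transformed)
```

Reversing the order matters here: the centering must be undone before exponentiating "k", otherwise the round trip would not return the original values.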