gents.evaluation.model_free package
Module contents
- class gents.evaluation.model_free.WassersteinDistances(original_data: ndarray, other_data: ndarray, normalisation: str | None = 'none', seed: int | None = None)
Bases:
object
Calculate the Wasserstein distance between two datasets in various ways. Adapted from https://gitlab.developers.cam.ac.uk/maths/cia/covid-19-projects/missing_data_fitting_quality
- Parameters:
original_data (np.ndarray) – Original data set, an (n, d) ndarray.
other_data (np.ndarray) – Other data set, which might be imputed or simulated data, also an (n, d) ndarray.
normalisation (str) – Method of normalising the data. If ‘none’, no normalisation is used. If ‘standardise’, the data are standardised by dividing by the standard deviation of the original data. (There is no need to subtract the mean, as this does not affect the Wasserstein distance.) Defaults to ‘none’.
seed (int) – Random seed. Defaults to None.
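The remark that subtracting the mean is unnecessary can be checked numerically. The following is a minimal NumPy sketch, not the class's own code: the helper `w2_squared_1d` is hypothetical and assumes both samples have the same number of rows, so that the one-dimensional squared W_2 distance reduces to the mean squared difference of the sorted values.

```python
import numpy as np

def w2_squared_1d(a, b):
    # Hypothetical helper: squared W_2 distance between two 1-D samples
    # of equal size, computed by matching sorted values.
    return float(np.mean((np.sort(a) - np.sort(b)) ** 2))

rng = np.random.default_rng(0)
original = rng.normal(loc=10.0, scale=5.0, size=(300, 2))
other = rng.normal(loc=11.0, scale=5.0, size=(300, 2))

# Standardise with statistics of the ORIGINAL data only.
std = original.std(axis=0)
mean = original.mean(axis=0)

# Scaling only, versus scaling after subtracting the original mean from both.
scaled = w2_squared_1d(original[:, 0] / std[0], other[:, 0] / std[0])
centred = w2_squared_1d((original[:, 0] - mean[0]) / std[0],
                        (other[:, 0] - mean[0]) / std[0])
print(scaled, centred)
```

Both values agree up to floating-point error, because shifting both datasets by the same constant leaves every pairwise difference unchanged.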
- directional_distance(direction: ndarray) float
Calculate the dataset distance in a specified direction.
This projects the two datasets onto the specified direction (that is, a 1-dimensional subspace), and calculates the Wasserstein distance between the two resulting distributions.
- Parameters:
direction (np.array) – The direction in which to calculate the W_2 distance between the datasets.
- Returns:
distance – The calculated W_2^2 distance.
- Return type:
float
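As an illustration of the projection step described for directional_distance, here is a minimal NumPy sketch. It is not the library's implementation: the function name `directional_w2_squared` is hypothetical, and it assumes both datasets have the same number of rows, in which case the squared W_2 distance between the projected samples is the mean squared difference of their sorted values.

```python
import numpy as np

def directional_w2_squared(x, y, direction):
    # Hypothetical helper, assuming x and y are (n, d) arrays with equal n.
    u = direction / np.linalg.norm(direction)  # normalise to a unit vector
    px = np.sort(x @ u)                        # project onto the 1-D subspace
    py = np.sort(y @ u)
    return float(np.mean((px - py) ** 2))      # squared W_2 in one dimension

rng = np.random.default_rng(0)
x = rng.normal(size=(500, 3))
y = x + np.array([2.0, 0.0, 0.0])  # shift only along the first axis

d_shift = directional_w2_squared(x, y, np.array([1.0, 0.0, 0.0]))
d_orth = directional_w2_squared(x, y, np.array([0.0, 1.0, 0.0]))
print(d_shift, d_orth)
```

Along the shifted axis the squared distance equals the squared shift (about 4), while along an orthogonal axis it is zero, since the projections coincide.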
- feature_distance(feature: int) float
Calculate the dataset distance for a specific feature.
This calculates the Wasserstein 2-distance between the specified feature in the two datasets.
- Parameters:
feature (int) – The column number of the feature to consider: 0, 1, 2, …, num_fields - 1.
- Returns:
distance – The Wasserstein 2-distance.
- Return type:
float
- get_marginal_directions() list[ndarray]
Get marginal directions for an experiment.
These are just the standard basis vectors.
- Returns:
directions – A list of standard unit vectors.
- Return type:
list[np.ndarray]
- get_random_directions(n_directions: int) list[ndarray]
Get random directions for an experiment.
- Parameters:
n_directions (int) – The number of directions to produce.
- Returns:
directions – A list of unit vectors specifying the directions to use. The results will be given in the same order.
- Return type:
list[np.ndarray]
- marginal_distances() ndarray
Calculate the marginal Wasserstein distances between datasets.
- Returns:
Distribution of Wasserstein distances over all features.
- Return type:
np.ndarray
- random_direction(dim: int) ndarray
Generate a unit vector in a random direction.
- Parameters:
dim (int) – Dimension of vector to be generated.
- Returns:
unit_vector – A unit vector of shape (dim,).
- Return type:
np.ndarray
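The documentation does not state how random_direction samples its vector. One common approach, shown here purely as an assumption, is to draw a standard Gaussian vector and normalise it, which gives a direction distributed uniformly on the unit sphere. The function name `random_unit_vector` is hypothetical.

```python
import numpy as np

def random_unit_vector(dim, rng):
    # Assumed approach: normalise an isotropic Gaussian sample.
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)

rng = np.random.default_rng(42)
u = random_unit_vector(5, rng)
print(u.shape, np.linalg.norm(u))
```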
- sliced_distances(num_directions: int) ndarray
Calculate the sliced Wasserstein distance between datasets.
- Parameters:
num_directions (int) – Number of directions in the sliced Wasserstein estimation.
- Returns:
Distribution of Wasserstein distances over all directions.
- Return type:
np.ndarray
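A sliced estimate can be sketched by repeating the one-dimensional computation over many random unit directions and returning one distance per direction. The code below is an illustrative NumPy sketch under stated assumptions, not the library's code: `sliced_w2_squared` is a hypothetical name, directions are drawn by normalising Gaussian vectors, and both datasets are assumed to have the same number of rows.

```python
import numpy as np

def sliced_w2_squared(x, y, num_directions, seed=None):
    # Hypothetical helper: one squared W_2 distance per random direction.
    rng = np.random.default_rng(seed)
    dirs = rng.standard_normal((num_directions, x.shape[1]))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)  # unit vectors
    px = np.sort(x @ dirs.T, axis=0)  # (n, num_directions), sorted per column
    py = np.sort(y @ dirs.T, axis=0)
    return np.mean((px - py) ** 2, axis=0)

rng = np.random.default_rng(1)
x = rng.normal(size=(400, 4))
y = rng.normal(loc=1.0, size=(400, 4))

dists = sliced_w2_squared(x, y, num_directions=50, seed=0)
same = sliced_w2_squared(x, x, num_directions=50, seed=0)
print(dists.mean(), same.max())
```

Returning the full array rather than a single average keeps the spread across directions available, which matches the "distribution of distances" described in the return value.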
- gents.evaluation.model_free.crps(y_true: ndarray, y_pred: ndarray, quantiles=array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])) float
Calculate the continuous ranked probability score (CRPS) via the multi-quantile loss.
Adapted from neuralforecast.losses.numpy.
- Parameters:
y_true (np.ndarray) – Ground truth time series, in shape of [B, T, C]
y_pred (np.ndarray) – Predicted time series scenarios, in shape of [B, T, C, N] (N is the number of scenarios), or [B, T, C, Q] (Q is the number of quantiles).
quantiles (np.array, optional) – Quantile levels. The more levels there are, the more accurately the CRPS is approximated. Defaults to np.arange(0.1, 1.0, 0.1).
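To make the multi-quantile idea concrete, the sketch below averages the pinball (quantile) loss over a grid of quantile levels. It is an illustration under assumptions rather than the function's actual code: `mean_pinball_loss` is a hypothetical name, scenarios are reduced to quantiles with `np.quantile`, and any scaling factor the library may apply to relate this average to the CRPS is not reproduced.

```python
import numpy as np

def mean_pinball_loss(y_true, y_quantiles, quantiles):
    # y_true: [B, T, C]; y_quantiles: [B, T, C, Q]; quantiles: [Q]
    diff = y_true[..., None] - y_quantiles
    # Pinball loss: tau * diff if diff >= 0, else (tau - 1) * diff.
    loss = np.maximum(quantiles * diff, (quantiles - 1.0) * diff)
    return float(loss.mean())

quantiles = np.arange(0.1, 1.0, 0.1)
rng = np.random.default_rng(0)
y_true = rng.normal(size=(2, 8, 3))

# Scenarios in shape [B, T, C, N]; reduce them to [B, T, C, Q].
scenarios = y_true[..., None] + rng.normal(size=(2, 8, 3, 100))
y_q = np.moveaxis(np.quantile(scenarios, quantiles, axis=-1), 0, -1)

score = mean_pinball_loss(y_true, y_q, quantiles)
# A forecast whose every quantile equals the truth scores zero.
perfect = mean_pinball_loss(
    y_true, np.repeat(y_true[..., None], len(quantiles), axis=-1), quantiles
)
print(score, perfect)
```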
- gents.evaluation.model_free.mse(y_true: ndarray, y_pred: ndarray) float
Calculate the mean squared error (MSE).
Adapted from sklearn.metrics.mean_squared_error.
- Parameters:
y_true (np.ndarray) – Ground truth time series, in shape of [B, T, C].
y_pred (np.ndarray) – Predicted time series scenarios, in shape of [B, T, C].
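For reference, a plain NumPy version of the computation might look as follows. This is a sketch that assumes the error is averaged over all three axes; the function name `mse_sketch` is hypothetical, and the library's own reduction is not documented here.

```python
import numpy as np

def mse_sketch(y_true, y_pred):
    # Assumed reduction: mean of squared errors over batch, time and channels.
    return float(np.mean((y_true - y_pred) ** 2))

y_true = np.zeros((2, 4, 3))
y_pred = np.full((2, 4, 3), 0.5)
value = mse_sketch(y_true, y_pred)
print(value)  # 0.25
```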