A perturbation test evaluates a model’s robustness and stability by measuring how much its predictions change when small perturbations, typically random noise, are added to the input data. For a stable model, small changes in the input lead to correspondingly small changes in the output. The test is useful for understanding a model’s behavior and reliability, because a model whose predictions swing widely under negligible input noise is hard to trust on real, imperfectly measured data.
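To make the idea concrete, the quantity being measured can be written down, along with what it reduces to for a linear model such as the one used in the example that follows. The symbols here ($f$ for the model, $\delta$ for the perturbation, $\sigma$ for the noise standard deviation, $w$ and $b$ for the linear coefficients and intercept) are notation assumed for this sketch, and the perturbation is assumed to be independent Gaussian noise on each feature.

```latex
% Change in prediction under an input perturbation \delta
\[
  \Delta(x) = f(x + \delta) - f(x), \qquad \delta \sim \mathcal{N}(0, \sigma^2 I)
\]

% For a linear model f(x) = w^\top x + b the intercept cancels,
% so the change depends only on the noise and the coefficients:
\[
  \Delta(x) = w^\top \delta, \qquad
  \mathbb{E}\!\left[\Delta(x)^2\right] = \sigma^2 \lVert w \rVert_2^2
\]
```

Under these assumptions, the expected squared change in a linear model's predictions grows with the square of the noise scale and with the squared norm of the coefficients, independent of the input $x$.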
# Import necessary libraries
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
# Create a synthetic dataset for regression
X, y = make_regression(n_samples=100, n_features=5, noise=10, random_state=42)
# Split data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# Train a Linear Regression model
reg_model = LinearRegression()
reg_model.fit(X_train, y_train)
# Predict on the original test set
y_pred_original = reg_model.predict(X_test)
# Introduce small perturbations to the test data
np.random.seed(42)
perturbation = np.random.normal(0, 0.01, X_test.shape) # Small noise with mean 0 and standard deviation 0.01
X_test_perturbed = X_test + perturbation
# Predict on the perturbed test set
y_pred_perturbed = reg_model.predict(X_test_perturbed)
# Calculate the Mean Squared Error between original and perturbed predictions
mse_perturbation = mean_squared_error(y_pred_original, y_pred_perturbed)
# Calculate the mean absolute percentage change in predictions
# (dividing by y_pred_original is unstable if any prediction is close to zero)
percentage_change = np.mean(np.abs((y_pred_perturbed - y_pred_original) / y_pred_original)) * 100
(mse_perturbation, percentage_change)
Result
(1.5338425690313238, 1.828950400218865)
Here are the results of the perturbation test:
- Mean Squared Error (MSE) between original and perturbed predictions: 1.53. This is the average squared difference between the predictions on the original and perturbed test sets, expressed in squared units of the target.
- Percentage change in predictions: 1.83%. On average, each prediction moved by about 1.83% of its original value when Gaussian noise with standard deviation 0.01 was added to the test features. Because this metric divides by the original prediction, a few predictions close to zero can inflate it.
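An average change of under 2% indicates that the model is fairly stable under perturbations of this size, which is a desirable property for robustness. Note that for a linear regression the change in each prediction is simply the perturbation multiplied by the learned coefficients, so the result mainly reflects the coefficient magnitudes relative to the noise scale. The test is more informative for nonlinear models, where sensitivity can vary across the input space.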
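A single noise draw at a single scale gives only one data point. The sketch below repeats the test over several noise scales and many random draws, and compares the empirical MSE with the value expected for a linear model (noise variance times the squared norm of the coefficients). The helper name `perturbation_mse`, the choice of scales, and the number of trials are assumptions made for illustration, not part of the original test.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split


def perturbation_mse(model, X, scale, n_trials=200, seed=0):
    """Mean squared change in predictions under Gaussian input noise.

    Hypothetical helper: averages the MSE over n_trials independent noise draws.
    """
    rng = np.random.default_rng(seed)
    baseline = model.predict(X)
    mses = []
    for _ in range(n_trials):
        noise = rng.normal(0.0, scale, size=X.shape)
        shifted = model.predict(X + noise)
        mses.append(np.mean((shifted - baseline) ** 2))
    return float(np.mean(mses))


# Same synthetic setup as the example above
X, y = make_regression(n_samples=100, n_features=5, noise=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
model = LinearRegression().fit(X_train, y_train)

# For a linear model the expected MSE is scale**2 * ||coef||^2
coef_norm_sq = float(np.sum(model.coef_ ** 2))
scales = [0.001, 0.01, 0.1]  # assumed noise levels for the sweep
results = {s: perturbation_mse(model, X_test, s) for s in scales}

for s, mse in results.items():
    expected = s ** 2 * coef_norm_sq
    print(f"scale={s:<6} empirical MSE={mse:.6f}  expected={expected:.6f}")
```

Averaging over many draws reduces the dependence on one particular random seed, and sweeping the scale shows how quickly the predictions degrade as the noise grows. For a nonlinear model the same helper can be used, but there is no closed-form expected value to compare against.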