Model coefficients, also called regression coefficients or weights, are the values a regression model assigns to its features (independent variables). In a linear regression model, the relationship between the input features \(x_1, x_2, \ldots, x_n\) and the predicted output \(y\) is represented as:
\[ y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n \]
Here:
- \(y\) is the predicted output.
- \(\beta_0\) is the intercept term, representing the value of \(y\) when all input features are zero.
- \(\beta_1, \beta_2, \ldots, \beta_n\) are the coefficients assigned to the corresponding input features \(x_1, x_2, \ldots, x_n\).
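As a minimal sketch of how the intercept and the coefficients combine into a prediction, the snippet below evaluates the linear model for a single observation. The numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import numpy as np

# Hypothetical values, chosen only for illustration
beta_0 = 2.0                        # intercept
betas = np.array([0.5, -1.0, 3.0])  # beta_1, beta_2, beta_3
x = np.array([4.0, 2.0, 1.0])       # x_1, x_2, x_3

# y = beta_0 + beta_1*x_1 + beta_2*x_2 + beta_3*x_3
y_hat = beta_0 + np.dot(betas, x)
print(y_hat)  # 5.0
```

With all features set to zero, the dot product vanishes and the prediction reduces to the intercept.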
The model coefficients are estimated during the training of the regression model. The goal of the training process is to find the values of \(\beta_0, \beta_1, \ldots, \beta_n\) that minimize the difference between the predicted values and the actual values in the training data. In ordinary least squares, which scikit-learn's LinearRegression implements, that difference is measured as the sum of squared residuals.
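To make the estimation step concrete, here is a small sketch that recovers the coefficients of a synthetic dataset by least squares using NumPy alone. The data, the true coefficient values, and the variable names are all assumptions made for illustration; they are not taken from the housing example.

```python
import numpy as np

# Synthetic data with known coefficients (hypothetical, for illustration only)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 1.5 + 2.0 * X[:, 0] - 3.0 * X[:, 1] + rng.normal(scale=0.1, size=100)

# Prepend a column of ones so the intercept is estimated with the slopes
X_design = np.column_stack([np.ones(len(X)), X])

# Least-squares solution: minimizes the sum of squared residuals
beta_hat, *_ = np.linalg.lstsq(X_design, y, rcond=None)

# Order is [intercept, beta_1, beta_2]; values land close to 1.5, 2.0, -3.0
print(beta_hat)
```

Any other choice of coefficients gives a larger sum of squared residuals on this data, which is what "best fit" means here.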
The coefficients describe the direction and size of the relationship between each feature and the target variable, with the other features held fixed. A positive coefficient means the predicted output rises as the feature increases, while a negative coefficient means it falls. Each coefficient is the change in the predicted output for a one-unit increase in its feature, so its magnitude depends on the units of that feature. Larger magnitudes imply a stronger influence only when the features are measured on comparable scales.
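The dependence of a coefficient's magnitude on the feature's units can be seen in a short sketch. The feature `area_m2` and the target `price` below are hypothetical synthetic data; the same feature is fitted once in square metres and once in square centimetres.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical synthetic data, for illustration only
rng = np.random.default_rng(42)
area_m2 = rng.uniform(50, 200, size=200)
price = 10.0 + 0.8 * area_m2 + rng.normal(scale=1.0, size=200)

# The same feature expressed in two different units
X_m2 = area_m2.reshape(-1, 1)
X_cm2 = X_m2 * 10_000

coef_m2 = LinearRegression().fit(X_m2, price).coef_[0]
coef_cm2 = LinearRegression().fit(X_cm2, price).coef_[0]

# The coefficient shrinks by the same factor the feature was scaled by
print(coef_m2, coef_cm2)
```

Both models make the same predictions, yet the second coefficient is 10,000 times smaller. This is why raw magnitudes should be compared across features only after the features are put on a common scale, for example by standardizing them.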
Example:
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# X_train (a DataFrame) and y_train are assumed to come from an earlier train/test split
regression_model = LinearRegression()
regression_model.fit(X_train, y_train)

# Collect the fitted coefficients and the intercept in a single table
pd.DataFrame(
    np.append(regression_model.coef_, regression_model.intercept_),
    index=X_train.columns.tolist() + ["Intercept"],
    columns=["Coefficients"],
)
           Coefficients
CRIM          -0.113845
ZN             0.061170
INDUS          0.054103
CHAS           2.517512
NX           -22.248502
RM             2.698413
AGE            0.004836
DIS           -1.534295
RAD            0.298833
TAX           -0.011414
PTRATIO       -0.988915
LSTAT         -0.586133
Intercept     49.885235
Equation of the fit
# Write out the equation of the fitted linear regression
equation = "Price = " + str(regression_model.intercept_)
print(equation, end=" ")
# Print one "+ (coefficient)*feature" term per line
for coef, name in zip(regression_model.coef_, X_train.columns):
    print(f"+ ({coef})*{name}")
Price = 49.88523466381736 + (-0.11384484836914008)*CRIM
+ (0.06117026804060645)*ZN
+ (0.05410346495874601)*INDUS
+ (2.5175119591227144)*CHAS
+ (-22.248502345084372)*NX
+ (2.6984128200099113)*RM
+ (0.004836047284751951)*AGE
+ (-1.5342953819992557)*DIS
+ (0.29883325485901313)*RAD
+ (-0.011413580552025043)*TAX
+ (-0.9889146257039406)*PTRATIO
+ (-0.5861328508499133)*LSTAT
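A fitted equation like the one printed above can be checked by evaluating it by hand for one row and comparing the result with the model's own prediction. The sketch below does this on a small synthetic dataset; `X_demo`, `y_demo`, `demo_model`, and the column names `A`, `B`, `C` are hypothetical stand-ins, not the housing data.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical synthetic data standing in for X_train and y_train
rng = np.random.default_rng(1)
X_demo = pd.DataFrame(rng.normal(size=(50, 3)), columns=["A", "B", "C"])
y_demo = (4.0 + X_demo["A"] - 2.0 * X_demo["B"] + 0.5 * X_demo["C"]
          + rng.normal(scale=0.1, size=50))

demo_model = LinearRegression().fit(X_demo, y_demo)

# Evaluate intercept + sum(coefficient * feature value) for the first row
row = X_demo.iloc[0]
manual = demo_model.intercept_ + sum(
    coef * row[name] for coef, name in zip(demo_model.coef_, X_demo.columns)
)

# The model's prediction for the same row
predicted = demo_model.predict(X_demo.iloc[[0]])[0]
print(manual, predicted)
```

The two numbers agree up to floating-point rounding, which confirms that the printed equation is exactly what `predict` computes.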