\(\bar{R}^2 = 1 - (1 - R^2) \times \frac{n - 1}{n - k - 1}\)
where:
- \(R^2\): the R-squared of the model
- n: the number of observations
- k: the number of predictor variables
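To see the formula in action, here is a minimal sketch that plugs in some hypothetical values (the numbers below are made up for illustration; they are not from the mtcars examples that follow):

```python
#hypothetical values for illustration
r2 = 0.90  #model R-squared
n = 50     #number of observations
k = 3      #number of predictor variables

#apply the adjusted R-squared formula
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
print(adj_r2)  #0.8934782608695652
```

Note that the adjusted value is slightly lower than the raw R-squared, since the formula penalizes the model for each predictor it uses.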
Example 1: Calculate Adjusted R-Squared with sklearn
from sklearn.linear_model import LinearRegression
import pandas as pd
#define URL where dataset is located
url = "https://raw.githubusercontent.com/Statology/Python-Guides/main/mtcars.csv"
#read in data
data = pd.read_csv(url)
#fit regression model
model = LinearRegression()
X, y = data[["mpg", "wt", "drat", "qsec"]], data.hp
model.fit(X, y)
#display adjusted R-squared
print(1 - (1 - model.score(X, y)) * (len(y) - 1) / (len(y) - X.shape[1] - 1))
0.7787005290062521
Example 2: Calculate Adjusted R-Squared with statsmodels
import statsmodels.api as sm
import pandas as pd
#define URL where dataset is located
url = "https://raw.githubusercontent.com/Statology/Python-Guides/main/mtcars.csv"
#read in data
data = pd.read_csv(url)
#fit regression model
X, y = data[["mpg", "wt", "drat", "qsec"]], data.hp
X = sm.add_constant(X)
model = sm.OLS(y, X).fit()
#display adjusted R-squared
print(model.rsquared_adj)
0.7787005290062521
A reusable function to compute adjusted R-squared from a feature matrix, the actual values, and a model's predictions:

from sklearn.metrics import r2_score

def adj_r2_score(inputs, actuals, predictions):
    r2 = r2_score(actuals, predictions)
    n = inputs.shape[0]  #number of rows (observations)
    k = inputs.shape[1]  #number of columns (predictors)
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)
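Here is a self-contained sketch of how the helper might be used; it repeats the function definition and fits a model on synthetic data (chosen so the example runs without downloading the mtcars file):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

def adj_r2_score(inputs, actuals, predictions):
    r2 = r2_score(actuals, predictions)
    n = inputs.shape[0]  #number of rows (observations)
    k = inputs.shape[1]  #number of columns (predictors)
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

#synthetic regression data standing in for the mtcars example
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4))
y = X @ np.array([1.5, -2.0, 0.5, 3.0]) + rng.normal(scale=0.1, size=30)

model = LinearRegression().fit(X, y)
print(adj_r2_score(X, y, model.predict(X)))
```

Passing the feature matrix `X` lets the function read `n` and `k` directly from its shape, so the same helper works for any number of predictors.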