Feature Importance in Decision Tree

March 7, 2024

In scikit-learn, the feature_importances_ attribute is associated with tree-based models, such as Decision Trees, Random Forests, and Gradient Boosted Trees. This attribute provides a way to assess the importance of each feature (or variable) in making predictions with the trained model.

When you train a tree-based model, the algorithm splits the data at each node based on the value of a specific feature. The feature_importances_ attribute represents the relative importance of each feature in making these decisions: in scikit-learn, each feature's score is the total reduction of the splitting criterion (e.g., Gini impurity) it contributes across all nodes, normalized so that the scores sum to 1.

Here’s a simple example using a Decision Tree Classifier:

from sklearn.tree import DecisionTreeClassifier
import pandas as pd
import matplotlib.pyplot as plt

# Create a decision tree classifier
clf = DecisionTreeClassifier()

# Assuming X_train and y_train are your training data
clf.fit(X_train, y_train)

# Access feature importances (one score per column of X_train)
feature_importances = clf.feature_importances_

# Print the importances as a DataFrame sorted from most to least important
print(pd.DataFrame(feature_importances, columns=["Imp"], index=X_train.columns).sort_values(by="Imp", ascending=False))

# or print them one per line
for feature, importance in zip(X_train.columns, feature_importances):
    print(f"{feature}: {importance}")
                                   Imp
amount                        0.204163
checking_balance              0.136840
age                           0.110746
months_loan_duration          0.100323
employment_duration           0.073225
credit_history                0.065357
savings_balance               0.057059
years_at_residence            0.052719
percent_of_income             0.034128
purpose_business              0.023784
dependents                    0.023062
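
Because the scores are normalized, each value can be read as the fraction of the model's total impurity reduction attributable to that feature. A quick sanity check, assuming the fitted clf from above:

import numpy as np

# Impurity-based importances are normalized, so they should sum to (almost) 1;
# a feature with a score of 0 was never used for a split.
print(np.isclose(clf.feature_importances_.sum(), 1.0))  # True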

The bar plot below visualizes the relative importance of each feature.

import numpy as np

# Plotting feature importances, sorted from least to most important
feature_names = list(X_train.columns)
importances = clf.feature_importances_
indices = np.argsort(importances)

plt.figure(figsize=(12, 12))
plt.title('Feature Importances')
plt.barh(range(len(indices)), importances[indices], color='violet', align='center')
plt.yticks(range(len(indices)), [feature_names[i] for i in indices])
plt.xlabel('Relative Importance')
plt.show()
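
Impurity-based importances are computed from the training data and are known to favor features with many unique values. As a complementary check, scikit-learn's permutation_importance measures how much the score drops when each feature's values are shuffled on held-out data; a minimal sketch, assuming a hypothetical X_test/y_test split not shown above:

from sklearn.inspection import permutation_importance

# Shuffle each column n_repeats times and record the mean drop in score
# (X_test and y_test are assumed to be held-out data; they are not defined above)
result = permutation_importance(clf, X_test, y_test, n_repeats=10, random_state=42)

for feature, mean_imp in zip(X_test.columns, result.importances_mean):
    print(f"{feature}: {mean_imp:.4f}")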