Beyond Knowledge Innovation

What is Mahalanobis Distance

March 11, 2024 · CEO
The Mahalanobis distance is a measure of the distance between a point and a distribution, taking into account the correlation between variables. It is often used in statistics and machine learning to identify outliers and to assess the dissimilarity between a data point and a distribution.

The Mahalanobis distance is defined for a point \(x\) with respect to a distribution characterized by its mean vector \(\mu\) and covariance matrix \(\Sigma\) as follows:

\(D_M(x) = \sqrt{(x - \mu)^T \Sigma^{-1} (x - \mu)}\)

Here:

  • \(D_M(x)\) is the Mahalanobis distance for the point \(x\).
  • \(x\) is the vector representing the data point.
  • \(\mu\) is the mean vector of the distribution.
  • \(\Sigma\) is the covariance matrix of the distribution.
  • \(\Sigma^{-1}\) is the inverse of the covariance matrix.
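The formula above can be sketched in a few lines of NumPy. This is a minimal illustration rather than a reference implementation: the function name `mahalanobis_distance` and the example mean and covariance are assumptions made for the example, and the covariance matrix is assumed to be invertible.

```python
import numpy as np

def mahalanobis_distance(x, mu, cov):
    """Return D_M(x) = sqrt((x - mu)^T cov^{-1} (x - mu))."""
    diff = np.asarray(x, dtype=float) - np.asarray(mu, dtype=float)
    # Solve cov @ z = diff rather than forming the explicit inverse;
    # this is usually more stable numerically.
    z = np.linalg.solve(np.asarray(cov, dtype=float), diff)
    return float(np.sqrt(diff @ z))

# Hypothetical distribution: variance 4 along the first axis, 1 along the second.
mu = np.array([0.0, 0.0])
cov = np.array([[4.0, 0.0],
                [0.0, 1.0]])

# The point (2, 0) lies exactly one standard deviation (sqrt(4) = 2) from the mean.
d = mahalanobis_distance([2.0, 0.0], mu, cov)
print(d)
```

With an identity covariance matrix the same function returns the ordinary Euclidean distance, which is a quick sanity check for any implementation.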

The Mahalanobis distance accounts for the correlations between features, which makes it particularly useful for multivariate data. It is a scale-invariant, unitless measure: it expresses how many standard deviations a point lies from the mean after each direction has been rescaled by the spread of the data and corrected for correlations. When the covariance matrix is the identity, it reduces to the Euclidean distance; in one dimension it is simply \(|x - \mu| / \sigma\).
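To make the effect of correlation concrete, the sketch below compares two points that are equally far from the mean in Euclidean terms, one lying along the direction of correlation and one against it. The covariance values, the point names, and the helper `mahalanobis_via_whitening` are assumptions chosen for illustration. The helper computes the distance by "whitening" the point with a Cholesky factor, which is one of several equivalent ways to evaluate the formula.

```python
import numpy as np

# Hypothetical distribution with strongly positively correlated features.
mu = np.zeros(2)
cov = np.array([[1.0, 0.9],
                [0.9, 1.0]])

# Cholesky factor L with cov = L @ L.T. Solving L z = (x - mu) maps the
# point into coordinates where the features are uncorrelated with unit variance.
L = np.linalg.cholesky(cov)

def mahalanobis_via_whitening(x):
    z = np.linalg.solve(L, np.asarray(x, dtype=float) - mu)
    return float(np.linalg.norm(z))  # Euclidean length in whitened coordinates

along = np.array([1.0, 1.0])     # follows the correlation
against = np.array([1.0, -1.0])  # goes against the correlation

euclid_along = float(np.linalg.norm(along - mu))
euclid_against = float(np.linalg.norm(against - mu))
d_along = mahalanobis_via_whitening(along)
d_against = mahalanobis_via_whitening(against)

print(euclid_along, euclid_against)  # same Euclidean distance for both points
print(d_along, d_against)            # the point against the correlation is much farther
```

The second point is unusual for this distribution because it breaks the pattern the two features normally follow, and the Mahalanobis distance reflects that while the Euclidean distance does not.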

Applications of Mahalanobis distance include outlier detection, clustering, and classification. In outlier detection, data points with unusually large Mahalanobis distances from the mean of a distribution are considered potential outliers. The Mahalanobis distance is also used in the Mahalanobis-Taguchi System, a technique for quality engineering and process optimization.
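As a rough sketch of the outlier-detection use case, the example below generates synthetic two-feature data, plants one anomalous point, and flags points whose squared Mahalanobis distance exceeds a cutoff. Everything specific here is an assumption for illustration: the synthetic data, the planted point, and the cutoff. The cutoff relies on the fact that, for multivariate normal data, the squared distance approximately follows a chi-square distribution with as many degrees of freedom as there are features. With two features the chi-square quantile has the closed form \(-2 \ln(1 - p)\); for other dimensions a quantile function such as `scipy.stats.chi2.ppf` would be needed instead.

```python
import math
import numpy as np

rng = np.random.default_rng(0)

# Synthetic inliers from a correlated 2-D normal distribution (assumed for the example).
true_cov = np.array([[1.0, 0.8],
                     [0.8, 1.0]])
inliers = rng.multivariate_normal(mean=[0.0, 0.0], cov=true_cov, size=500)

# Hypothetical anomaly that violates the positive correlation.
planted = np.array([[4.0, -4.0]])
X = np.vstack([inliers, planted])

# Estimate the distribution from the data itself.
mu_hat = X.mean(axis=0)
cov_hat = np.cov(X, rowvar=False)

# Row-wise squared distances: diff_i^T cov_hat^{-1} diff_i.
diff = X - mu_hat
d2 = np.einsum("ij,ij->i", diff, np.linalg.solve(cov_hat, diff.T).T)

# Chi-square quantile for 2 degrees of freedom at p = 0.975 (an assumed cutoff).
p = 0.975
threshold = -2.0 * math.log(1.0 - p)

is_outlier = d2 > threshold
print("threshold on squared distance:", threshold)
print("number of flagged points:", int(is_outlier.sum()))
print("planted point flagged:", bool(is_outlier[-1]))
```

Because the mean and covariance are estimated from data that already contains the anomaly, extreme points can distort the estimates. Robust covariance estimators are a common remedy when that is a concern.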

clustering, distance, mahalanobis, unsupervised

© 2025 Beyond Knowledge Innovation