Beyond Knowledge Innovation

Where Data Unveils Possibilities


Classification

Classification is a form of supervised machine learning in which you train a model to use the features (the x values in our function) to predict a label (y). The model calculates the probability that the observed case belongs to each of several possible classes and predicts the most appropriate label. The simplest form of classification is binary classification, in which the label is 0 or 1, representing one of two classes: for example, "True" or "False", "Internal" or "External", "Profitable" or "Non-Profitable", "spam" or "not spam" when filtering emails, or "fraudulent" or "authorized" when examining transaction data.

Common applications of classification:

  • Spam filtering: Identifying spam emails in an inbox
  • Fraud detection: Detecting fraudulent transactions in financial data
  • Medical diagnosis: Diagnosing diseases based on patient data
  • Image recognition: Recognizing objects or scenes in images
  • Recommender systems: Recommending products or services to users based on their past behavior

Types of Classification

Binary Classification is the problem of classifying instances into one of two categories. The data we want to classify belongs exclusively to one of those classes. For example, we could label patients as non-diabetic or diabetic. The class prediction is made by determining the probability for each class as a value between 0 (impossible) and 1 (certain). A threshold value, often 0.5, is used to determine the predicted class.
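As a minimal sketch of this probability-and-threshold idea, the toy example below trains a logistic regression on an invented single-feature dataset (the feature values and diabetic/non-diabetic labels are made up for illustration) and applies the 0.5 threshold by hand:

```python
# Hypothetical binary example: predict "diabetic" (1) vs "non-diabetic" (0)
# from a single made-up feature, using scikit-learn's LogisticRegression.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy data: one feature per patient (values invented for illustration)
X = np.array([[85], [90], [110], [140], [160], [180]])
y = np.array([0, 0, 0, 1, 1, 1])

model = LogisticRegression().fit(X, y)

# Probability of each class for a new observation: [P(class 0), P(class 1)]
proba = model.predict_proba([[150]])[0]

# Apply the 0.5 threshold mentioned above
label = int(proba[1] >= 0.5)
print(f"P(class 1) = {proba[1]:.2f} -> predicted class {label}")
```

Note that `model.predict` applies the 0.5 threshold internally; computing the label from `predict_proba` as above is useful when you want to move the threshold.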

Multiclass Classification is the problem of classifying instances into one of three or more categories. The data we want to classify belongs exclusively to one of those classes, e.g. classifying whether an object in an image is red, green, or blue. Multiclass classification can be thought of as a combination of multiple binary classifiers. There are two common ways to approach the problem:

One vs Rest (OVR), in which a classifier is created for each possible class value, with a positive outcome for cases where the prediction is this class, and negative predictions for cases where the prediction is any other class. For example, a classification problem with four possible shape classes (square, circle, triangle, hexagon) would require four classifiers that predict:

  • square or not
  • circle or not
  • triangle or not
  • hexagon or not

One vs One (OVO), in which a classifier is created for each possible pair of classes. The classification problem with four shape classes would require the following six binary classifiers:

  • square or circle
  • square or triangle
  • square or hexagon
  • circle or triangle
  • circle or hexagon
  • triangle or hexagon
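The two strategies can be sketched with scikit-learn's meta-estimators. The dataset below is synthetic (four anonymous classes standing in for square, circle, triangle, hexagon), and the point is only to show how many binary classifiers each strategy builds:

```python
# One-vs-rest and one-vs-one for a 4-class problem, using scikit-learn.
# Data is synthetic and purely illustrative.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier, OneVsOneClassifier

# 4 classes standing in for square, circle, triangle, hexagon
X, y = make_classification(n_samples=200, n_features=5, n_informative=4,
                           n_redundant=0, n_classes=4, random_state=0)

ovr = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, y)
ovo = OneVsOneClassifier(LogisticRegression(max_iter=1000)).fit(X, y)

print(len(ovr.estimators_))  # 4 binary classifiers, one per class
print(len(ovo.estimators_))  # 6 binary classifiers, one per pair
```

For k classes, OVR trains k classifiers while OVO trains k(k-1)/2, which is why OVO grows quickly as the number of classes increases.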

Multi-label Classification is the problem of classifying instances into two or more classes in which the data we want to classify may belong to none, several, or all of the classes at the same time, e.g. classifying which traffic signs are contained in an image. Neural network models can be configured to support multi-label classification and can perform well, depending on the specifics of the classification task.
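A minimal multi-label sketch, assuming an invented two-feature, two-label dataset (the features and the "stop sign" / "speed limit" labels are hypothetical): each sample carries one 0/1 indicator per label, and a separate binary classifier is fitted per label via scikit-learn's MultiOutputClassifier.

```python
# Multi-label sketch: each image may contain several traffic signs at
# once, so each sample gets a 0/1 indicator per label. Data is invented.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier

X = np.array([[0.1, 0.9], [0.8, 0.2], [0.9, 0.9], [0.1, 0.1]])
# Label columns (hypothetical): [stop sign present, speed limit present]
Y = np.array([[0, 1], [1, 0], [1, 1], [0, 0]])

clf = MultiOutputClassifier(LogisticRegression()).fit(X, Y)
pred = clf.predict([[0.85, 0.85]])[0]  # one 0/1 flag per label
print(pred)
```

Unlike the multiclass case, the predicted vector can legitimately be all zeros (no label applies) or all ones (every label applies).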

Imbalanced Classification is a classification problem in which the distribution of examples across the known classes is biased or skewed. The imbalance can range from a slight bias to a severe skew in which the minority class has one example for every hundreds, thousands, or millions of examples in the majority class or classes.
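The snippet below illustrates a skewed class distribution on a synthetic dataset, together with one common remedy, class weighting; the 95:5 split is an arbitrary choice for illustration:

```python
# Synthetic imbalanced dataset and class weighting in scikit-learn.
from collections import Counter
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# weights=[0.95, 0.05] makes class 1 the rare (minority) class
X, y = make_classification(n_samples=1000, weights=[0.95, 0.05],
                           random_state=0)
print(Counter(y))  # heavily skewed toward class 0

# class_weight="balanced" re-weights errors inversely to class frequency,
# so mistakes on the rare class cost more during training.
clf = LogisticRegression(class_weight="balanced").fit(X, y)
```

With imbalanced data, plain accuracy is misleading (always predicting the majority class scores ~95% here), so metrics such as precision, recall, or F1 on the minority class are more informative.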


January 14, 2024
CEO
© 2025 Beyond Knowledge Innovation