Beyond Knowledge Innovation

Where Data Unveils Possibilities


Classification

Classification is a form of supervised machine learning in which you train a model to use the features (the x values in our function) to predict a label (y). The model calculates the probability that the observed case belongs to each of several possible classes and predicts the most appropriate label. The simplest form of classification is binary classification, in which the label is 0 or 1, representing one of two classes: for example, "True" or "False", "Internal" or "External", "Profitable" or "Non-Profitable", "spam" or "not spam" when filtering emails, or "fraudulent" or "authorized" when examining transaction data.

Common applications of classification:

  • Spam filtering: Identifying spam emails in an inbox
  • Fraud detection: Detecting fraudulent transactions in financial data
  • Medical diagnosis: Diagnosing diseases based on patient data
  • Image recognition: Recognizing objects or scenes in images
  • Recommender systems: Recommending products or services to users based on their past behavior

Types of Classification

Binary Classification is the problem of classifying instances into one of two categories. The data we want to classify belongs exclusively to one of those classes. For example, we could label patients as non-diabetic or diabetic. The class prediction is made by determining the probability for each class as a value between 0 (impossible) and 1 (certain). A threshold value, often 0.5, is used to determine the predicted class.
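As a minimal sketch of this probability-and-threshold idea, the toy example below trains a logistic regression on an invented single-feature dataset (the feature values and diabetic/non-diabetic labels are made up for illustration) and applies the 0.5 threshold by hand:

```python
# Hypothetical binary example: predict "diabetic" (1) vs "non-diabetic" (0)
# from a single made-up feature, using scikit-learn's LogisticRegression.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy data: one feature per patient (values invented for illustration)
X = np.array([[85], [90], [110], [140], [160], [180]])
y = np.array([0, 0, 0, 1, 1, 1])

model = LogisticRegression().fit(X, y)

# Probability of each class for a new observation: [P(class 0), P(class 1)]
proba = model.predict_proba([[150]])[0]

# Apply the 0.5 threshold mentioned above
label = int(proba[1] >= 0.5)
print(f"P(class 1) = {proba[1]:.2f} -> predicted class {label}")
```

Note that `model.predict` applies the 0.5 threshold internally; computing the label from `predict_proba` as above is useful when you want to move the threshold.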

Multiclass Classification is the problem of classifying instances into one of three or more categories. The data we want to classify belongs exclusively to one of those classes, e.g. classifying whether an object in an image is red, green, or blue. Multiclass classification can be thought of as a combination of multiple binary classifiers. There are two common ways to approach the problem:

One vs Rest (OVR), in which a classifier is created for each possible class value, with a positive outcome for cases where the prediction is this class, and negative predictions for cases where the prediction is any other class. For example, a classification problem with four possible shape classes (square, circle, triangle, hexagon) would require four classifiers that predict:

  • square or not
  • circle or not
  • triangle or not
  • hexagon or not

One vs One (OVO), in which a classifier is created for each possible pair of classes. The classification problem with four shape classes would require the following six binary classifiers:

  • square or circle
  • square or triangle
  • square or hexagon
  • circle or triangle
  • circle or hexagon
  • triangle or hexagon
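The two strategies can be sketched with scikit-learn's meta-estimators. The dataset below is synthetic (four anonymous classes standing in for square, circle, triangle, hexagon), and the point is only to show how many binary classifiers each strategy builds:

```python
# One-vs-rest and one-vs-one for a 4-class problem, using scikit-learn.
# Data is synthetic and purely illustrative.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier, OneVsOneClassifier

# 4 classes standing in for square, circle, triangle, hexagon
X, y = make_classification(n_samples=200, n_features=5, n_informative=4,
                           n_redundant=0, n_classes=4, random_state=0)

ovr = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, y)
ovo = OneVsOneClassifier(LogisticRegression(max_iter=1000)).fit(X, y)

print(len(ovr.estimators_))  # 4 binary classifiers, one per class
print(len(ovo.estimators_))  # 6 binary classifiers, one per pair
```

For k classes, OVR trains k classifiers while OVO trains k(k-1)/2, which is why OVO grows quickly as the number of classes increases.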

Multi-label Classification is the problem of classifying instances into two or more classes in which the data we want to classify may belong to none, several, or all of the classes at the same time, e.g. classifying which traffic signs are contained in an image. Neural network models can be configured to support multi-label classification and can perform well, depending on the specifics of the classification task.
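A minimal multi-label sketch, assuming an invented two-feature, two-label dataset (the features and the "stop sign" / "speed limit" labels are hypothetical): each sample carries one 0/1 indicator per label, and a separate binary classifier is fitted per label via scikit-learn's MultiOutputClassifier.

```python
# Multi-label sketch: each image may contain several traffic signs at
# once, so each sample gets a 0/1 indicator per label. Data is invented.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier

X = np.array([[0.1, 0.9], [0.8, 0.2], [0.9, 0.9], [0.1, 0.1]])
# Label columns (hypothetical): [stop sign present, speed limit present]
Y = np.array([[0, 1], [1, 0], [1, 1], [0, 0]])

clf = MultiOutputClassifier(LogisticRegression()).fit(X, Y)
pred = clf.predict([[0.85, 0.85]])[0]  # one 0/1 flag per label
print(pred)
```

Unlike the multiclass case, the predicted vector can legitimately be all zeros (no label applies) or all ones (every label applies).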

Imbalanced Classification is a classification problem in which the distribution of examples across the known classes is biased or skewed. The imbalance can range from a slight bias to a severe skew in which the minority class has one example for every hundreds, thousands, or millions of examples in the majority class or classes.
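The snippet below illustrates a skewed class distribution on a synthetic dataset, together with one common remedy, class weighting; the 95:5 split is an arbitrary choice for illustration:

```python
# Synthetic imbalanced dataset and class weighting in scikit-learn.
from collections import Counter
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# weights=[0.95, 0.05] makes class 1 the rare (minority) class
X, y = make_classification(n_samples=1000, weights=[0.95, 0.05],
                           random_state=0)
print(Counter(y))  # heavily skewed toward class 0

# class_weight="balanced" re-weights errors inversely to class frequency,
# so mistakes on the rare class cost more during training.
clf = LogisticRegression(class_weight="balanced").fit(X, y)
```

With imbalanced data, plain accuracy is misleading (always predicting the majority class scores ~95% here), so metrics such as precision, recall, or F1 on the minority class are more informative.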


January 14, 2024
CEO
© 2025 Beyond Knowledge Innovation