
When a machine learns from past experience, especially when that experience takes the form of a labelled dataset, the process is known as supervised machine learning.
Supervised machine learning has vast applications across sectors such as finance, healthcare, government, business, and data analysis. In this section, we'll explore the different algorithms employed under supervised machine learning to make predictions.
Algorithms
Supervised learning is divided into two main categories: regression and classification.
In regression, the algorithm models a quantitative relationship between one or more independent variables and a dependent variable and produces a numerical output.
In classification, the algorithm assigns each input to one of a set of discrete categories, such as a yes-or-no result.
Supervised learning algorithms are further categorized as follows:
- Linear Regression
It is a type of supervised learning algorithm that extracts a linear relationship between a dependent variable and one or more independent variables. It is further divided into four types based on the kind and number of variables involved.
- Simple linear regression – Single independent variable
- Multiple linear regression – Multiple independent variables
- Univariate regression – Single dependent variable
- Multivariate regression – Multiple dependent variables
Linear regression is the simplest algorithm and, for that reason, the most widely used. It serves as the foundation for more complex models. It fits a straight line to the points on the scatter plot following the equation,
y=mx+c
Where y = dependent variable, x = independent variable, m = slope, c = y-intercept (the value of y where x is 0)
Predicting the spread of a disease through a crop and its effect on yield is one of the many applications of linear regression.
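The equation above can be fitted with ordinary least squares. A minimal sketch, using made-up data points rather than anything from a real dataset:

```python
# Simple linear regression y = m*x + c via ordinary least squares.
# The (xs, ys) data below is illustrative only.

def fit_line(xs, ys):
    """Return slope m and intercept c that minimize squared error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    m = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
    c = mean_y - m * mean_x          # line passes through the means
    return m, c

xs = [1, 2, 3, 4, 5]
ys = [2.1, 4.0, 6.2, 7.9, 10.1]      # roughly y = 2x
m, c = fit_line(xs, ys)
```

With the toy data above, the fitted slope comes out close to 2 and the intercept close to 0, matching the pattern the data was built around.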
- Non-Linear Regression
Non-linear regression discovers more complex, non-linear relationships between the independent and dependent variables. The relationship can be logarithmic, exponential, polynomial, sigmoidal, etc.
The goal is to find the curve that best matches the data points, minimizing the difference between the observed values and the values predicted by the curve.
It can be used in a laboratory to model the growth of bacteria in a petri dish over time.
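For the bacterial-growth example, one common trick is to fit an exponential model N(t) = N0 · e^(r·t) by taking logarithms, which turns it into a straight-line fit. A sketch with invented colony counts (not real laboratory data):

```python
import math

# Fit N(t) = N0 * exp(r * t) by linearizing: ln(N) = ln(N0) + r * t,
# then applying least squares to (t, ln N). Counts are illustrative.

def fit_exponential(ts, ns):
    """Return (N0, r) for the model N(t) = N0 * exp(r * t)."""
    logs = [math.log(n) for n in ns]
    k = len(ts)
    mean_t = sum(ts) / k
    mean_l = sum(logs) / k
    r = sum((t - mean_t) * (l - mean_l) for t, l in zip(ts, logs)) / \
        sum((t - mean_t) ** 2 for t in ts)
    n0 = math.exp(mean_l - r * mean_t)
    return n0, r

ts = [0, 1, 2, 3, 4]             # hours
ns = [100, 150, 225, 338, 506]   # colony counts, growing ~1.5x per hour
n0, r = fit_exponential(ts, ns)
```

Because the toy counts grow by roughly 1.5× each hour, the fitted rate r comes out near ln(1.5) ≈ 0.405 and N0 near 100.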
- Logistic Regression
Logistic regression is used for classification tasks, predicting the probability of a binary outcome, i.e., one with only two possible values. There are three subtypes of logistic regression.
- Binomial – when the output values are only two.
- Multinomial – when there are multiple unordered output categories.
- Ordinal – when there are multiple ordered categories.
The logistic (sigmoid) function is used to map any real value to a probability between 0 and 1.
$$P(Y=1|X) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n)}}$$
Where P is the probability that the dependent variable Y is 1, based on the provided independent or X value, and β0, β1, …, βn are the coefficients of the model.
A threshold is set to categorize the outcomes: for example, with a threshold of 0.5, an outcome greater than 0.5 is assigned to class 1, and an outcome below it is assigned to class 0.
Logistic regression can be applied to predict whether a customer will purchase a product based on their profile history.
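The probability formula and the threshold rule above can be sketched in a few lines. The coefficients β0 and β1 here are assumed values for illustration, not trained on any real data:

```python
import math

# The logistic (sigmoid) function squashes any real value into (0, 1),
# and a threshold turns that probability into a class label.
# beta0 and beta1 are made-up coefficients, not fitted ones.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, beta0=-3.0, beta1=1.5, threshold=0.5):
    """Return (probability, predicted class) for one feature value x."""
    p = sigmoid(beta0 + beta1 * x)
    return p, 1 if p > threshold else 0
```

With these assumed coefficients, a small x gives a probability near 0 (class 0) and a large x gives a probability near 1 (class 1), mirroring the threshold rule described above.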
- Decision Trees
It is one of the most powerful tools in machine learning and is used for both regression and classification tasks. It represents a tree-like structure of branches and nodes, where branches encode the decision rules, internal nodes test the features, and leaf nodes hold the final outputs.
At each internal node, a decision is made to split the data into two or more subsets (or two or more possible outcomes), which are then partitioned again. The process is recursive and continues until it meets a stop criterion.
Decision trees are prone to overfitting, especially when the tree is deep.
Medical conditions of a certain patient can be diagnosed based on their history and symptoms using decision trees.
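The recursive splitting described above can be sketched for a single numeric feature. This toy learner uses Gini impurity to choose thresholds and a maximum depth as the stop criterion; the data is invented, not a medical dataset:

```python
from collections import Counter

# Tiny recursive decision-tree learner for one numeric feature.
# Internal nodes are (threshold, left, right) tuples; leaves are labels.

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(xs, ys, depth=0, max_depth=2):
    # Stop criterion: pure node or depth limit reached -> leaf.
    if len(set(ys)) == 1 or depth == max_depth:
        return Counter(ys).most_common(1)[0][0]
    best = None
    for t in sorted(set(xs)):                  # candidate split thresholds
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not left or not right:
            continue
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if best is None or score < best[0]:
            best = (score, t, left, right)
    if best is None:
        return Counter(ys).most_common(1)[0][0]
    _, t, left, right = best
    lx = [x for x in xs if x <= t]
    rx = [x for x in xs if x > t]
    return (t, build_tree(lx, left, depth + 1, max_depth),
               build_tree(rx, right, depth + 1, max_depth))

def tree_predict(node, x):
    while isinstance(node, tuple):             # walk down to a leaf
        t, left, right = node
        node = left if x <= t else right
    return node
```

With a cleanly separated toy dataset, the learner finds a single threshold and the tree reduces to one internal node with two leaves.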
- Support Vector Machines (SVM)
Support Vector Machines are machine learning algorithms used for both classification and regression analysis. They find a hyperplane that separates the data points in the feature space, chosen to maximize the margin: the distance between the hyperplane and the nearest data points of each class. A graph is plotted for the data points. There are two types of SVMs:
- Linear or Simple SVM – a straight line (or flat hyperplane) classifies data that is linearly separable.
- Kernel or Non-Linear SVM – the data is mapped into a higher-dimensional space where it becomes separable, since a straight line cannot classify it in the original space.
SVMs can be used to classify new articles under different categories like sports, entertainment, news, etc.
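Once an SVM has been trained, classifying a point only requires the learned hyperplane. A sketch of that decision step, where the weights w and bias b are assumed rather than actually trained:

```python
import math

# A linear SVM's decision rule: the sign of w.x + b picks the side of
# the hyperplane, and |w.x + b| / ||w|| is the distance to it.
# The weights (w, b) below are assumed for illustration, not trained.

def svm_classify(w, b, x):
    """Return (class label in {-1, +1}, distance to the hyperplane)."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    label = 1 if score >= 0 else -1
    distance = abs(score) / math.sqrt(sum(wi * wi for wi in w))
    return label, distance
```

Points far from the hyperplane get a large distance (a confident classification); points near it get a small one, which is exactly what the margin-maximization objective tries to enlarge for the training data.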
- Random Forests
Random forests use multiple decision trees and aggregate their outputs. In classification, the output is the mode of the trees' predicted classes; in regression, it is the mean of their numerical outputs.
Random forests are more versatile and robust than single decision trees, and they are also less prone to overfitting. They are, however, more time-consuming and require more resources. Three main parameters are set: node size, number of trees, and number of features sampled.
Random forests can be applied to detect fraudulent transactions in financial and banking services.
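The aggregation step described above is simple to sketch. The per-tree predictions here stand in for real trained trees, and the bootstrap helper shows where the "random" in random forests comes from (each tree trains on a resampled copy of the data):

```python
import random
from collections import Counter

# Aggregating tree outputs: mode for classification, mean for regression.
# The per-tree votes passed in stand in for real trained trees.

def forest_classify(tree_votes):
    """Majority vote across the trees' predicted class labels."""
    return Counter(tree_votes).most_common(1)[0][0]

def forest_regress(tree_outputs):
    """Mean of the trees' numerical predictions."""
    return sum(tree_outputs) / len(tree_outputs)

def bootstrap_sample(data, rng=None):
    """Sample len(data) points with replacement, as each tree would see."""
    rng = rng or random.Random(0)
    return [rng.choice(data) for _ in data]
```

For a fraud-detection forest, each tree would vote "fraud" or "legit" on a transaction and `forest_classify` would return the majority label.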
- K-Nearest Neighbours
KNN is somewhat similar to random forests in that it takes the majority class label (for classification) or the mean of the outputs (for regression) of several contributors to make a decision, although no explicit tree or model is built. Instead, the outputs of the nearest neighbours of a query point are combined into the final decision.
A recommender system that suggests movies, videos, or news articles based on what we interacted with before is an application of KNN.
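A minimal KNN classifier follows directly from the description above: nothing is trained, and the k stored points closest to the query vote on its label. The 2-D points here are toy data:

```python
import math
from collections import Counter

# k-nearest-neighbours classification: sort stored points by distance
# to the query, then let the k closest vote. Toy 2-D data only.

def knn_classify(train, query, k=3):
    """train: list of ((x1, x2), label) pairs; returns the winning label."""
    nearest = sorted(train, key=lambda p: math.dist(p[0], query))[:k]
    votes = [label for _, label in nearest]
    return Counter(votes).most_common(1)[0][0]
```

A query near one cluster of points inherits that cluster's label, which is the same idea a neighbour-based recommender uses when it suggests items liked by users similar to you.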
