How to Explain Each Core Machine Learning Model in an Interview

From Regression to Clustering to CNNs: A Brief Guide to 25+ Machine Learning Models

Machine learning is at the core of modern AI, powering everything from recommendation systems to self-driving cars. But behind every intelligent application lie the foundational models that make it all possible. This article provides a concise yet comprehensive breakdown of key machine learning models to help you confidently ace your technical interviews.

Linear Regression

Linear Regression models the relationship between independent and dependent variables by finding a “best-fit line” that lies as close as possible to all the data points, using the least squares method. The least squares method finds the linear equation that minimizes the sum of squared residuals (SSR).

For example, the green line below is a better fit than the blue line because it lies closer to all the data points.
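A quick way to demonstrate this in an interview is with a few lines of scikit-learn. This is a minimal sketch on synthetic data; the slope and intercept values below are illustrative assumptions, not taken from the figure:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(100, 1))              # independent variable
y = 3.0 * X.ravel() + 5.0 + rng.normal(0, 2, 100)  # y = 3x + 5 plus noise

model = LinearRegression().fit(X, y)  # ordinary least squares fit
print(model.coef_, model.intercept_)  # recovers roughly 3.0 and 5.0
```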

Lasso Regression (L1)

Lasso regression is a regularization technique that reduces overfitting by introducing some amount of bias into the model. It does this by minimizing the sum of squared residuals plus a penalty, where the penalty equals lambda times the sum of the absolute values of the coefficients (slopes). Lambda controls the severity of the penalty and works as a hyperparameter that can be tuned to reduce overfitting and produce a better fit.

Figure 2: Cost Function Lasso Regression

L1 regularization is a preferred choice when we have a large number of features because it can shrink the coefficients of less important variables all the way to zero, effectively removing them from the model.

Figure 3: Graph Showing Effect of Regularization On Overfitted Regression Line
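A minimal sketch of lasso’s feature-selection effect, assuming synthetic data where only a few of the features actually matter (note that scikit-learn names the lambda hyperparameter alpha):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# Synthetic data: 20 features, only 5 of which are informative
X, y = make_regression(n_samples=200, n_features=20, n_informative=5,
                       noise=10.0, random_state=0)

lasso = Lasso(alpha=1.0)  # alpha plays the role of lambda
lasso.fit(X, y)
print(np.sum(lasso.coef_ == 0.0))  # uninformative coefficients are typically driven exactly to zero
```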

Ridge Regression (L2)

Ridge regression is similar to lasso regression. The only difference between the two is the calculation of the penalty term: ridge adds a penalty equal to lambda times the sum of the squared magnitudes of the coefficients.

Figure 4: Cost Function Ridge Regression

L2 regularization is the better choice when our data suffers from multicollinearity (independent variables are highly correlated) because it shrinks all the coefficients toward zero without eliminating any of them, keeping the estimates stable.
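A rough sketch of ridge under multicollinearity; the two nearly identical features below are an assumed setup built by hand for illustration:

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.01, size=200)  # nearly a copy of x1: strong multicollinearity
X = np.column_stack([x1, x2])
y = 3.0 * x1 + rng.normal(size=200)

ridge = Ridge(alpha=10.0)  # alpha = lambda; larger values shrink coefficients more
ridge.fit(X, y)
print(ridge.coef_)  # weight is shared across the correlated features instead of exploding
```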

Elastic Net Regression

Elastic Net Regression combines the penalties from both the lasso and ridge regression to provide a more regularized model.

It allows a balance of both penalties, which can result in a better-performing model than using either the L1 or L2 penalty alone.
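A minimal sketch in scikit-learn, where the assumed l1_ratio of 0.5 weights the two penalties equally (1.0 would be pure lasso, 0.0 pure ridge):

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet

X, y = make_regression(n_samples=200, n_features=20, n_informative=5,
                       noise=10.0, random_state=0)

# l1_ratio balances the two penalties; alpha scales their overall strength
enet = ElasticNet(alpha=1.0, l1_ratio=0.5)
enet.fit(X, y)
print(enet.coef_)  # some coefficients shrink to zero, others are merely dampened
```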

Polynomial Regression

It models the relationship between the dependent and independent variables as an nᵗʰ-degree polynomial. A polynomial is a sum of terms of the form k·xⁿ, where n is a non-negative integer, k is a constant, and x is the independent variable. It is used for non-linear data.

Figure 6: Fitting a Simple Linear Regression vs Polynomial Regression Line on non-linear data.
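A minimal sketch using scikit-learn’s PolynomialFeatures, assuming a cubic relationship in synthetic data; the pipeline expands x into x, x², x³ and then fits ordinary least squares on those features:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(100, 1))
y = 0.5 * X.ravel() ** 3 - X.ravel() + rng.normal(0, 1, 100)  # cubic relationship

# Feature expansion followed by a plain linear fit
model = make_pipeline(PolynomialFeatures(degree=3), LinearRegression())
model.fit(X, y)
print(model.predict([[2.0]]))  # prediction on the learned cubic curve
```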

Logistic Regression

Logistic Regression is a classification technique that tries to find the best-fitting curve for the data. It uses the sigmoid function to map the output into the range 0 to 1, so it can be interpreted as a probability. Unlike linear regression, where the best-fit line is found using the least squares method, logistic regression uses Maximum Likelihood Estimation (MLE) to find the best-fitting curve.

Figure 7: Linear Regression vs Logistic Regression On Binary Output
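A minimal sketch on synthetic binary data; scikit-learn’s LogisticRegression fits the coefficients by maximizing the likelihood, and predict_proba exposes the sigmoid outputs:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=4, random_state=0)

clf = LogisticRegression()  # fit via maximum likelihood estimation
clf.fit(X, y)
print(clf.predict_proba(X[:3]))  # sigmoid outputs: class probabilities between 0 and 1
```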

K-Nearest Neighbours (KNN)

KNN is a classification algorithm that classifies new data points based on their distance from the nearest already-classified points. It assumes that data points that exist in close proximity to each other are highly similar.

The KNN algorithm is also referred to as a lazy learner because it simply stores the training data and defers all computation until a new data point arrives for prediction.

By default, KNN uses Euclidean distance to find the k closest classified points to the new data point; the mode (majority class) of those neighbours becomes the predicted class.

If the value of k is set too low, a new data point may be classified based on noise or outliers. However, if it is too high, the model may overlook classes with few samples.
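A minimal sketch on synthetic data; the choice of k = 5 below is an illustrative assumption, and the distance metric is left at scikit-learn’s default (Minkowski with p=2, i.e. Euclidean):

```python
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=200, n_features=4, random_state=0)

knn = KNeighborsClassifier(n_neighbors=5)  # n_neighbors is k
knn.fit(X, y)              # "lazy" learning: fit just stores the training data
print(knn.predict(X[:3]))  # majority vote among the 5 nearest neighbours
```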
