The fundamentals of Simple Linear Regression

Afaque Umer
5 min read · Mar 21, 2022

Linear regression is one of the most fundamental statistical concepts and one of the most basic machine learning methods. It models the relationship between the magnitude of one variable, Y, and that of a second, X, by estimating how much Y changes when X changes by a certain amount. It can help answer questions such as:


How is sales volume affected by the weather?

How does sales volume change with changes in price?

How does the amount of a drug absorbed vary with the patient’s body weight?

The Equation

The general form of a linear equation with one independent variable can be written as y = b0 + b1x, where b0 is the intercept (the bias) and b1 is the slope (the weight).

X is also known as the predictor variable | independent variable | input feature | attribute

Y is known as the response variable | dependent variable | target variable

The weight value (b1) represents the mean change in the response for a one-unit change in the predictor. For example, if the weight is +5, the mean response increases by 5 for every one-unit increase in the predictor.
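To make the meaning of the weight concrete, here is a minimal sketch. The values of b0 and b1 are made up purely for illustration; the point is only that moving one unit along x always changes the prediction by b1.

```python
# Hypothetical coefficients, chosen only to illustrate what b1 means.
b0 = 2.0  # intercept (bias)
b1 = 5.0  # slope (weight)


def predict(x):
    """Return the value of the line y = b0 + b1 * x at x."""
    return b0 + b1 * x


# A one-unit increase in x changes the prediction by exactly b1,
# no matter where on the line we start.
for x in (0.0, 3.0, 10.0):
    change = predict(x + 1) - predict(x)
    print(f"x = {x}: change in prediction = {change}")
```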

Each model is determined solely by its parameters (the weights and the biases). That is why we are interested in varying them, in a kind of trial-and-error fashion, until we find a model that explains the data sufficiently well, i.e. until we find the line that fits the data best.

There are three ways to move the line:

  1. Keeping the intercept constant and varying the slope. This rotates the line about the point where it crosses the y-axis.
  2. Keeping the slope constant and varying the intercept. This shifts the line up or down along the y-axis without changing its tilt.
  3. Varying both the slope and the intercept, which combines the shift and the rotation.
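The three moves can be checked numerically. The sketch below uses a hypothetical starting line (b0 = 1, b1 = 2) and compares it with lines where the slope, the intercept, or both have been changed.

```python
def line(b0, b1):
    """Return a function that evaluates y = b0 + b1 * x."""
    return lambda x: b0 + b1 * x


base = line(b0=1.0, b1=2.0)     # hypothetical starting line
rotated = line(b0=1.0, b1=3.0)  # 1. same intercept, different slope
shifted = line(b0=4.0, b1=2.0)  # 2. same slope, different intercept
both = line(b0=4.0, b1=3.0)     # 3. both changed

# 1. Rotation: the value at x = 0 (the pivot) is unchanged.
print(base(0), rotated(0))

# 2. Translation: the vertical gap to the base line is the same at every x.
print([shifted(x) - base(x) for x in (-2, 0, 5)])

# 3. Both: the gap to the base line now depends on x.
print([both(x) - base(x) for x in (-2, 0, 5)])
```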

The optimization algorithm varies the weights and the biases in order to produce the model that fits the data best.

Cost Function

We feed the model a target variable (this is supervised machine learning), so from the input data we get the observed values, “y”. We use “y-hat” to denote the value predicted by the line for a given value of x; these predictions are also known as fitted values. We compute the errors (the residuals) by subtracting the predicted values from the observed values.
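As a small illustration of fitted values and errors, the sketch below uses a made-up data set and a made-up candidate line. Neither comes from the example later in this article; they are only there to show the subtraction.

```python
# Hypothetical data: x = years of experience, y = salary (arbitrary units).
x = [1, 2, 3, 4, 5]
y = [32, 34, 41, 44, 49]

# Hypothetical candidate line (not necessarily the best fit).
b0, b1 = 25.0, 5.0

# Fitted values: what the line predicts for each x.
y_hat = [b0 + b1 * xi for xi in x]

# Errors (residuals): observed value minus predicted value.
residuals = [yi - yh for yi, yh in zip(y, y_hat)]

print(y_hat)      # [30.0, 35.0, 40.0, 45.0, 50.0]
print(residuals)  # [2.0, -1.0, 1.0, -1.0, -1.0]
```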

Our objective is to get the best possible line. The best possible line is the one for which the sum (or, equivalently, the average) of the squared vertical distances from the data points to the line is the smallest. These vertical distances are the residuals, and the method of minimizing the sum of the squared residuals is termed least squares regression or ordinary least squares (OLS) regression.

The least-squares criterion is that the line that best fits a set of data points is the one having the smallest possible sum of squared errors.
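The criterion can be written as a small cost function. This is a sketch on hypothetical data, comparing two hypothetical candidate lines; the line with the smaller sum of squared errors is the better fit of the two.

```python
def sse(b0, b1, x, y):
    """Sum of squared errors of the line y = b0 + b1 * x on the data (x, y)."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


# Hypothetical data: x = years of experience, y = salary (arbitrary units).
x = [1, 2, 3, 4, 5]
y = [32, 34, 41, 44, 49]

# Two hypothetical candidate lines.
cost_a = sse(25.0, 5.0, x, y)
cost_b = sse(26.8, 4.4, x, y)

print(cost_a)  # 8.0
print(cost_b)  # smaller than cost_a, so this line fits better
```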

Ideally, the line would pass through every point of our training data set; in that scenario, the value of the cost function would be 0. Real data rarely lie exactly on a line, so our goal is to minimize the cost function, that is, to find its global minimum.

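Plotted against a single parameter, the squared-error cost function is a parabola that opens upward. If the first derivative is positive, then we are on the right side of the parabola and should decrease the parameter. If the first derivative is negative, then we are on the left side and should increase it. If the first derivative is 0, we have found our global minimum.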
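One common way to act on the sign of the derivative is gradient descent: repeatedly step in the direction opposite to the derivative. The sketch below is an assumption about how this could be done, not something this article prescribes. It uses hypothetical data, holds the intercept fixed at a hypothetical value so that the cost depends on the slope alone, and picks a learning rate by hand.

```python
# Hypothetical data: x = years of experience, y = salary (arbitrary units).
x = [1, 2, 3, 4, 5]
y = [32, 34, 41, 44, 49]

# Intercept held fixed (hypothetical value) so the cost is a parabola in b1.
B0 = 26.8


def cost_derivative(b1):
    """Derivative with respect to b1 of sum((y - (B0 + b1 * x)) ** 2)."""
    return -2 * sum(xi * (yi - (B0 + b1 * xi)) for xi, yi in zip(x, y))


# The sign of the derivative tells us which side of the minimum we are on.
print(cost_derivative(0.0) < 0)   # True: left of the minimum
print(cost_derivative(10.0) > 0)  # True: right of the minimum

# Gradient descent: step against the derivative until it is (almost) zero.
b1 = 0.0
learning_rate = 0.005  # hand-picked for this data set
for _ in range(200):
    b1 -= learning_rate * cost_derivative(b1)

print(round(b1, 4))  # 4.4
```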


Let’s take an example to understand the theory we have learned above. We take a small, simple data set of employees where the predictor is the years of experience and the target variable is the salary of the employee.

Our goal is to fit a model with OLS so that when a person enters a value of the predictor (x), the model returns a response (y). That way the user gets an idea of roughly what salary to expect for a given amount of experience.

Since this is a minimization problem, our goal is to make the cost function as small as possible (it reaches zero only if every point lies exactly on the line). Substituting the data into the cost function gives an expression in the two unknowns b0 and b1.

Since we have two unknowns, b0 and b1, we take the partial derivative of the cost function with respect to each and set it to zero. This gives us two equations in two unknowns, which we then solve for b0 and b1.
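In general form, for n data points (x_i, y_i), the two equations look as follows. The notation (S for the cost, bars for the means) is introduced here only for this sketch.

```latex
% Cost function: sum of squared errors
S(b_0, b_1) = \sum_{i=1}^{n} \bigl(y_i - b_0 - b_1 x_i\bigr)^2

% Setting both partial derivatives to zero
\frac{\partial S}{\partial b_0} = -2 \sum_{i=1}^{n} \bigl(y_i - b_0 - b_1 x_i\bigr) = 0
\qquad
\frac{\partial S}{\partial b_1} = -2 \sum_{i=1}^{n} x_i \bigl(y_i - b_0 - b_1 x_i\bigr) = 0

% Rearranged: two linear equations in b_0 and b_1
n\, b_0 + b_1 \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i
\qquad
b_0 \sum_{i=1}^{n} x_i + b_1 \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i

% Solution
b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
b_0 = \bar{y} - b_1 \bar{x}
```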

Hence our final prediction equation, i.e. “y-hat”, will be:

y-hat = 0.04147(x) + 4.0728

Now we can predict the salary (y) by providing the years of experience (x).
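As a quick sketch, the fitted equation can be wrapped in a function. The coefficients are the ones given above; the function name is made up, and the salary units are whatever the original data set used, which the text does not state.

```python
# Coefficients taken from the fitted equation above.
B1 = 0.04147  # slope
B0 = 4.0728   # intercept


def predict_salary(years_of_experience):
    """Return the predicted salary (y-hat) for a given experience (x)."""
    return B0 + B1 * years_of_experience


for years in (0, 5, 10):
    print(years, round(predict_salary(years), 4))
```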

Getting started with Machine learning

Now that we have understood the concept behind OLS and Simple Linear Regression, we will take the same data and try to reproduce the results using Python.

Step 1: Import the required libraries.

Step 2: Create the data frame.

Step 3: Fit the data into the model.

Step 4: Evaluate the results.
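The four steps can be sketched as follows. This assumes pandas and scikit-learn are the libraries in use, which is a common choice but is not confirmed by the text. The column names and the data values are hypothetical stand-ins, not the data set from the example above, so the fitted coefficients will differ from the equation derived earlier.

```python
# Step 1: import the required libraries (assumed: pandas and scikit-learn).
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# Step 2: create the data frame.
# Hypothetical values and column names, for illustration only.
df = pd.DataFrame({
    "YearsExperience": [1, 2, 3, 4, 5],
    "Salary": [32, 34, 41, 44, 49],
})

# Step 3: fit the data into the model.
X = df[["YearsExperience"]]  # features must be 2-D: one column per predictor
y = df["Salary"]             # target
model = LinearRegression()
model.fit(X, y)

# Step 4: evaluate the results.
y_hat = model.predict(X)
slope = model.coef_[0]
intercept = model.intercept_
r2 = r2_score(y, y_hat)             # share of variance explained by the line
mse = mean_squared_error(y, y_hat)  # average squared residual

print(f"y-hat = {slope:.4f}(x) + {intercept:.4f}")
print(f"R^2 = {r2:.4f}, MSE = {mse:.4f}")
```

Here the model is evaluated on the same rows it was fitted on, which is enough to check the fit against the hand calculation. With more data, a held-out test set would give a fairer picture.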

This concludes our simple linear regression. Now we know the maths behind it and how to perform it using Python 🤓

If you have any questions or suggestions feel free to connect. I will try to bring up more machine learning concepts and will try to break down fancy-sounding terms and concepts into simpler ones.

Thanks for reading 😃 🙏🏻


