Linear Regression from Scratch
A tutorial on building and training a linear regression model
About
This notebook is a quick walkthrough of the inner workings of Linear Regression. We will not be using any external Machine Learning library. The only dependency is numpy, which is really a scientific computing library rather than a Machine Learning one.
Every piece of the algorithm (from the model, to the cost function, to gradient descent) will be built from scratch.
What We Need
From the outset, as explained in the blog post, we need four important things to build any Machine Learning model:
- Data to learn from and represent
- Model
- Cost Function
- Optimization Algorithm (we'll be using Gradient Descent here, as most production tasks use some variation of this algorithm and it suits our purpose)
Part 1: The Data
Your dataset can contain anywhere from a few hundred to a few million data points. These data points are fed into an Optimization Algorithm multiple times. On each iteration, Gradient Descent (the Optimization Algorithm) calculates the output of the model, computes the cost by comparing that output with the real value (which comes from the data itself), and then updates the parameters to reduce the cost.
For our case here, we will use just 2 data points: x = [10, 20], with corresponding y = [105, 205].
import numpy as np
x_train = np.array([10, 20])
y_train = np.array([105, 205])
# also some test data
x_test = np.array([30, 40])
y_test = np.array([305, 405])
Let's now build the function which will represent our model. Since we are doing linear regression, we will be using: y = wx + b
Here w is a weight and b is a bias term. Together they are generally referred to as the model parameters.
> Important: All in all, all we need are the right parameters that represent the data well.
In other words, Machine Learning is really the process of obtaining the parameters w and b that best fit our data. This process is called training.
Once we have these parameters, we use them to predict on unseen data, which is called inference. These two steps are all there is to it. We'll be doing both in this tutorial!
Now, let's build a function that computes the output y based on the model wx + b.
def compute_model_output(w, x, b):
    # print(f"w: {w}, x: {x}, b: {b}")
    model_output = w*x + b
    return model_output
Let's assume that our input x is 2 and our learned parameters are w=10, b=5. Our output should then be 10 * 2 + 5, which is 25. Let's verify:
compute_model_output(w=10, x=2, b=5)
That's correct! This means our model, as coded in Python, returns the expected value. Feel free to try your own examples here.
As you can guess from y = [105, 205], when training is done we'd like to end up with w=10 and b=5. Let's see how we can arrive at these values!
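Because the training inputs are stored in a NumPy array, the same function can evaluate every data point at once through broadcasting. The following is a minimal sketch, restating the function so it runs on its own and plugging in the target parameters w=10, b=5:

```python
import numpy as np

def compute_model_output(w, x, b):
    # Same model as above: y = w*x + b, applied element-wise to arrays
    return w * x + b

x_train = np.array([10, 20])
predictions = compute_model_output(w=10, x=x_train, b=5)
print(predictions)  # [105 205]
```

The predictions match y = [105, 205], which confirms that w=10, b=5 is the parameter pair training should move towards.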
Now, as explained in the blog post, we need a way to measure the performance of our learned model.
That is, we need a way to measure how far the model's predictions are from the real values.
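The cost used in the code that follows is the halved mean squared error. Written out for m data points (the hat notation and the index i are introduced here for illustration and do not appear in the original notebook):

```latex
J(w, b) = \frac{1}{2m} \sum_{i=1}^{m} \left( \hat{y}^{(i)} - y^{(i)} \right)^2,
\qquad \hat{y}^{(i)} = w x^{(i)} + b
```

The factor of 1/2 does not change where the minimum lies; it only cancels the 2 that appears when the square is differentiated.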
def compute_cost(y_pred, y, m):
    # y_pred: shape (m,), the values predicted by the model
    # y: shape (m,), the ground truth (in our case, 105 and 205)
    cost_per_datapoint = (y_pred - y)**2  # squared error for each data point
    total_cost = np.sum(cost_per_datapoint)/(2*m)  # halved mean squared error
    return total_cost
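To make the cost concrete, here is a small worked example. It is a sketch that restates the cost function so the block runs on its own, and evaluates it for two parameter choices: the starting guess used later (w=10, b=3) and the target parameters (w=10, b=5).

```python
import numpy as np

def compute_cost(y_pred, y, m):
    # Halved mean squared error over m data points
    return np.sum((y_pred - y) ** 2) / (2 * m)

x_train = np.array([10, 20])
y_train = np.array([105, 205])

# w=10, b=3 predicts [103, 203], so both errors are -2
cost_initial = compute_cost(10 * x_train + 3, y_train, m=2)
# w=10, b=5 predicts [105, 205], so both errors are 0
cost_perfect = compute_cost(10 * x_train + 5, y_train, m=2)
print(cost_initial, cost_perfect)  # 2.0 0.0
```

With errors of -2 and -2, the squared errors sum to 8, and dividing by 2m = 4 gives 2.0. A perfect fit gives a cost of 0.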
def gradient_decent(iterations, w, x, b, y, alpha):
    """
    Runs gradient descent and returns the learned parameters w, b
    and the list of costs, one per iteration.
    """
    m = len(x)
    all_costs = []
    print_at = max(1, iterations // 10)  # print roughly 10 progress lines
    for i in range(iterations):
        model_prediction = compute_model_output(w, x, b)
        cost = compute_cost(y_pred=model_prediction, y=y, m=m)
        dw, db = compute_gradients(model_prediction, y, x, m)
        w, b = update_parameters(w, b, dw, db, alpha)
        all_costs.append(cost)
        if i % print_at == 0:
            print(f"Iteration # {i}, cost: {cost}")
    return w, b, all_costs
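def compute_gradients(y_pred, y, x, m):
    # Returns per-example gradient terms; update_parameters sums them over the dataset
    diff_per_example = (y_pred - y)/m
    dj_dw = diff_per_example*x  # per-example terms of dJ/dw
    dj_db = diff_per_example    # per-example terms of dJ/db
    return dj_dw, dj_db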
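The two expressions in compute_gradients come from differentiating the halved mean squared error with respect to each parameter. A short sketch of the result, using hat and index notation that is assumed here for illustration:

```latex
\frac{\partial J}{\partial w} = \frac{1}{m} \sum_{i=1}^{m} \left( \hat{y}^{(i)} - y^{(i)} \right) x^{(i)},
\qquad
\frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^{m} \left( \hat{y}^{(i)} - y^{(i)} \right)
```

Note that compute_gradients returns the individual terms inside each sum; the summation itself happens in update_parameters.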
def update_parameters(w, b, dw, db, alpha):
    # Sum the per-example terms to get the full gradients, then step against them
    w = w - alpha*np.sum(dw)
    b = b - alpha*np.sum(db)
    return w, b
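One way to gain confidence in the gradient formulas is to compare them against finite differences of the cost. The sketch below does this at the starting point w=10, b=3 and then takes a single update step. The helper cost_at and the step size eps are hypothetical names introduced for this check; they are not part of the notebook.

```python
import numpy as np

x = np.array([10, 20])
y = np.array([105, 205])
m = len(x)

def cost_at(w, b):
    # Hypothetical helper: halved mean squared error at a given (w, b)
    return np.sum((w * x + b - y) ** 2) / (2 * m)

# Analytic gradients at the starting point w=10, b=3
w, b = 10.0, 3.0
error = (w * x + b) - y        # both errors are -2
dw = np.sum(error * x) / m     # (-20 + -40) / 2 = -30
db = np.sum(error) / m         # (-2 + -2) / 2 = -2

# Central finite differences as an independent check
eps = 1e-6
dw_numeric = (cost_at(w + eps, b) - cost_at(w - eps, b)) / (2 * eps)
db_numeric = (cost_at(w, b + eps) - cost_at(w, b - eps)) / (2 * eps)

# One gradient descent step with alpha = 0.0001
alpha = 0.0001
w_new = w - alpha * dw         # approximately 10.003
b_new = b - alpha * db         # approximately 3.0002
print(dw, dw_numeric, db, db_numeric)
print(w_new, b_new)
```

Both gradients are negative, so the update increases w and b. The step for b is much smaller than the step for w because the w gradient is scaled by the input values.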
Now that we have all the components, let's train our model using our data!
w_init = 10
b_init = 3
iterations = 1000
alpha = 0.0001
w, b, all_costs = gradient_decent(iterations, w_init, x_train, b_init, y_train, alpha)
print(f"Learned parameters: w: {w}, b: {b}, Expected parameters: w: {10}, b: {5}")
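The inference step mentioned earlier can be sketched by applying learned parameters to the test data. The block below is self-contained: it condenses the training loop into a few lines rather than reusing the helper functions above, and assumes the same starting values and hyperparameters. The helper halved_mse is a hypothetical name introduced here.

```python
import numpy as np

x_train = np.array([10, 20])
y_train = np.array([105, 205])
x_test = np.array([30, 40])
y_test = np.array([305, 405])

def halved_mse(y_pred, y):
    # Same cost as compute_cost, with m taken from the array length
    return np.sum((y_pred - y) ** 2) / (2 * len(y))

# Condensed training loop (assumed to mirror the run above)
w, b, alpha = 10.0, 3.0, 0.0001
m = len(x_train)
initial_train_cost = halved_mse(w * x_train + b, y_train)
for _ in range(1000):
    error = (w * x_train + b) - y_train
    dw = np.sum(error * x_train) / m
    db = np.sum(error) / m
    w, b = w - alpha * dw, b - alpha * db
final_train_cost = halved_mse(w * x_train + b, y_train)

# Inference: apply the learned parameters to inputs the model has not seen
test_predictions = w * x_test + b
test_cost = halved_mse(test_predictions, y_test)
print(f"w: {w:.4f}, b: {b:.4f}")
print(f"train cost: {initial_train_cost:.4f} -> {final_train_cost:.4f}")
print(f"test predictions: {test_predictions}, test cost: {test_cost:.4f}")
```

With a learning rate this small and inputs that are not scaled, b changes very slowly, so after 1000 iterations it is likely to remain close to its starting value of 3 rather than reach 5, even though the training cost drops. Running many more iterations or scaling x are common ways to close that gap.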
And... congratulations! You have now built a linear regression model from scratch!