How does a neural net really work
In this notebook I'm exploring fast.ai's Kaggle notebook on "How does a neural net really work". This relates to Lesson 3 of the fast.ai Deep Learning course. While the video provides a solid explanation, the enigmatic imports and variables can be difficult to comprehend, so I'm reimplementing some sections to see if it sticks. In a nutshell, this is what is happening in this notebook:
- Revising Regressions
- Understanding and breaking down the Gradient Descent algorithm
# Installing the dependencies within the notebook to make it easier to run on Colab
%pip install -Uqq fastai ipywidgets
1. Revising Regressions
1.1 Plot a generic quadratic function ($ax^2+bx+c$)
from fastai.basics import torch, plt
# Set the figure DPI to 90 for better resolution
plt.rc('figure', dpi=90)
# Function with quadratic expression ax^2 + bx + c
def quad(a, b, c, x): return a*x**2 + b*x + c
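Later cells also call a `mk_quad` helper that isn't shown above. Following the pattern in the original fast.ai notebook, a minimal sketch of it is a partial application of `quad` that fixes `a`, `b`, and `c`:

from functools import partial

# Build a one-argument quadratic f(x) with a, b, c fixed (helper assumed from the fast.ai notebook)
def mk_quad(a, b, c): return partial(quad, a, b, c)

f = mk_quad(3, 2, 1)
f(torch.tensor(2.0))  # 3*2**2 + 2*2 + 1 = 17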
1.2 Generate some random data points
# Add both multiplicative and additive noise to input x
def add_noise(x, mult, add): return x * (1+torch.randn(x.shape) * mult) + torch.randn(x.shape) * add
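The plotting cells below also rely on a `generate_noisy_data` helper that isn't shown here. A minimal sketch, assuming it samples `x` on a grid and perturbs the "true" quadratic with `add_noise` (the noise levels are illustrative), could be:

def generate_noisy_data(f, n=20, mult=0.15, add=1.5):
    # Sample n points on [-2, 2] and add noise to the "true" values f(x)
    x = torch.linspace(-2, 2, steps=n)[:, None]
    y = add_noise(f(x), mult, add)
    return x, y

x, y = generate_noisy_data(mk_quad(3, 2, 1))
plt.scatter(x, y)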
1.3 Fit the function to the data
In this section, we will explore the step-by-step process of finding the values of $a$, $b$, and $c$ that allow our function to accurately reflect the random data generated in 1.2. The interactive plot below shows how adjustments to $a$, $b$, and $c$ influence the function's shape so it better aligns with our data.
from ipywidgets import interact

@interact(a=1.5, b=1.5, c=1.5)
def demo_interactive_plot_quad(a, b, c):
    # Plot the noisy data together with the candidate quadratic for the current slider values
    x, y = generate_noisy_data(mk_quad(3, 2, 1))
    plt.scatter(x, y)
    plt.plot(x, quad(a, b, c, x), color='orange')
1.4 Measure the error
This is cool and it works, but we need a way to know how close we are to our ideal solution. In regression, we can use fun ways to estimate this, like the "Mean Absolute Error" (MAE), which averages the absolute distance between predicted and actual values.
The fastai library has a wrapper for some of the most common methods (from scikit-learn).
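For instance, fastai exposes a ready-made `mae` that can be called directly on two tensors (the numbers below are just illustrative):

from fastai.metrics import mae
mae(torch.tensor([4.0, 2.0]), torch.tensor([4.2, 1.5]))  # mean of |4.0-4.2| and |2.0-1.5| -> tensor(0.3500)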
Here's a quick demo where we can see how it is calculated.
def mean_absolute_error(preds, acts): return (torch.abs(preds-acts)).mean()
def demo_mean_absolute_error():
    # Illustrative values (assumed for the demo): compare each prediction with its actual value
    preds, acts = torch.tensor([4.0, 2.0]), torch.tensor([4.2, 1.5])
    for p, a in zip(preds, acts): print(f"Prediction: {p:.1f}, Actual: {a:.1f}, Absolute Difference: {torch.abs(p - a):.3f}")
    print(f"Mean Absolute Error: {mean_absolute_error(preds, acts):.3f}")
@interact(a=1.5, b=1.5, c=1.5)
def plot_quad(a, b, c):
    x,y = generate_noisy_data(mk_quad(3,2,1))
    plt.scatter(x, y)
    plt.plot(x, quad(a, b, c, x), color='orange')
    # Show how far the current curve is from the data (assumed: report the MAE in the title)
    plt.title(f"MAE: {mean_absolute_error(quad(a, b, c, x), y):.3f}")
2. Understand and break down the Gradient Descent algorithm
Now that we can calculate the mean absolute error, the next step is to understand how to adjust our parameters $a$, $b$, and $c$ to reduce this error. To do this, we can think about the gradients of the error with respect to each of the $a$, $b$, $c$ parameters.
👉 If you were walking on a hill (representing the error surface), the partial derivative with respect to one direction (say, the 'a' direction) tells you the slope of the hill in that specific direction. A steep slope/gradient means a large change in error for a small change in 'a'.
For example, if we consider the partial derivative of the mean absolute error with respect to $a$ (while keeping $b$ and $c$ fixed), a negative value would indicate that increasing $a$ will lead to a decrease in the error (like walking forward-downhill in the 'a' direction). Conversely, a positive value would suggest that decreasing $a$ would reduce the error (awkwardly walking backwards-downhill in the 'a' direction 😄).
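To make that sign convention concrete, here is a tiny, purely illustrative check: with the "true" quadratic $3x^2+2x+1$ and a guess of $a=1.5$ (with $b$ and $c$ held at 1.5), nudging $a$ upwards lowers the MAE, which is exactly what a negative partial derivative with respect to $a$ predicts.

# Illustrative sign check: nudge a up and down and watch the MAE move (b and c held fixed at 1.5)
x0 = torch.tensor([1.0, 2.0, 3.0])
y0 = quad(3, 2, 1, x0)  # targets from the "true" quadratic
for a_try in (1.4, 1.5, 1.6):
    print(a_try, mean_absolute_error(quad(a_try, 1.5, 1.5, x0), y0))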
#########################################
# Example 1: Using x1 = [1.0, 2.0, 3.0]
#########################################
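# The original code for this example was cut off; the following is an assumed sketch that uses
# PyTorch autograd to get the partial derivatives of the MAE with respect to a, b and c.
x1 = torch.tensor([1.0, 2.0, 3.0])
y1 = quad(3, 2, 1, x1)                                    # "true" targets
abc = torch.tensor([1.5, 1.5, 1.5], requires_grad=True)   # current guesses for a, b, c
loss = mean_absolute_error(quad(abc[0], abc[1], abc[2], x1), y1)
loss.backward()
print(f"loss={loss:.3f}, grads={abc.grad}")               # abc.grad holds dMAE/da, dMAE/db, dMAE/dc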
Now we can create an interactive plot where we show the gradient on $a$, $b$ and $c$.
If the slope is negative we want to move forward (or downhill).
@interact(a=1.5, b=1.5, c=1.5)
def demo_quadratic_plot_with_gradients(a, b, c):
    x,y = generate_noisy_data(mk_quad(3,2,1))
    # Assumed body: recompute the loss and its gradients at the current slider values
    abc = torch.tensor([a, b, c], requires_grad=True)
    loss = mean_absolute_error(quad(abc[0], abc[1], abc[2], x), y)
    loss.backward()
    plt.scatter(x, y)
    plt.plot(x, quad(a, b, c, x), color='orange')
    plt.title(f"MAE: {loss:.3f}   dMAE/d(a,b,c): {abc.grad.numpy().round(2)}")
from fastai.metrics import mae

def demo_auto_fit(steps=50):
    # Assumed sketch of the fitting loop: start from a guess and repeatedly step against the gradient
    x, y = generate_noisy_data(mk_quad(3, 2, 1))
    abc = torch.tensor([1.5, 1.5, 1.5], requires_grad=True)
    lr = 0.01  # learning rate (assumed)
    for step in range(steps):
        loss = mae(quad(abc[0], abc[1], abc[2], x), y)
        loss.backward()
        with torch.no_grad():
            abc -= abc.grad * lr   # move each parameter downhill along its own gradient
            abc.grad.zero_()
        if step % 10 == 0: print(f"step {step:3d}  loss {loss:.3f}  a,b,c {abc.detach().numpy().round(2)}")
    return abc

demo_auto_fit()