
Neural Networks Explained: From Neurons to Deep Learning

Abstract Algorithms · 5 min read

TL;DR

Neural networks are the engine behind the modern AI revolution, powering tools like ChatGPT and Midjourney.


Introduction: Mimicking the Brain

Traditional algorithms (like Linear Regression) are great at well-defined numerical problems, but they struggle with "human" tasks like recognizing a face or understanding a joke. To solve these, scientists looked at the best learning machine we know of: the human brain.

The brain isn't one giant processor; it's a web of billions of tiny cells called neurons connected together. A single neuron is simple, but billions of them working together can write poetry. Artificial Neural Networks (ANNs) try to copy this structure in code.


1. The Artificial Neuron (The Perceptron)

Let's zoom in on a single neuron. In code, it's just a math function.

The Components:

  1. Inputs ($x$): The data coming in (e.g., pixels of an image).
  2. Weights ($w$): How important each input is. (e.g., Is this pixel part of an eye? High weight. Is it background? Low weight.)
  3. Bias ($b$): An extra nudge to help the neuron activate (like a threshold).
  4. Activation Function: The decision maker. It decides whether the neuron should "fire" (pass a signal forward) or stay quiet (output a 0).
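
Translated into code, that really is the whole neuron. A minimal sketch in plain Python (the helper names `relu` and `neuron` are ours, not from any library):

```python
def relu(z):
    # Activation function: "fire" by passing the value through if positive,
    # otherwise stay quiet and output 0.
    return max(0.0, z)

def neuron(inputs, weights, bias):
    # Weighted sum: each input times its importance, plus the bias nudge.
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return relu(z)
```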

Deep Dive: Inside a Single Neuron (The "Concert" Example)

Let's trace the math of a single neuron deciding: "Should I go to the concert?"

Inputs ($x$):

  1. Is the band good? (Score 1-10): Let's say 8.
  2. Is it raining? (1 = Yes, 0 = No): Let's say 1 (It's raining).

Weights ($w$) - The Neuron's Personality:

  1. Weight for Band: 0.5 (I like music).
  2. Weight for Rain: -2.0 (I hate getting wet).

Bias ($b$):

  • Bias: 1.0 (I generally like going out).

The Calculation (Weighted Sum): $$ Z = (x_1 \cdot w_1) + (x_2 \cdot w_2) + b $$ $$ Z = (8 \cdot 0.5) + (1 \cdot -2.0) + 1.0 $$ $$ Z = 4 - 2 + 1 = \mathbf{3} $$

The Activation (ReLU): We use a simple rule: If $Z > 0$, output $Z$. If $Z \le 0$, output 0.

  • Since $3 > 0$, the neuron fires! Output = 3 (High excitement).

What if it was raining harder? If the rain weight was -5.0: $$ Z = (8 \cdot 0.5) + (1 \cdot -5.0) + 1.0 = 4 - 5 + 1 = \mathbf{0} $$ The neuron stays quiet. You stay home.
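
As a quick sanity check, here is the same arithmetic in Python, with every number copied from the example above:

```python
band_score, raining = 8, 1             # inputs: band quality, is it raining?
w_band, w_rain, bias = 0.5, -2.0, 1.0  # the neuron's "personality"

z = band_score * w_band + raining * w_rain + bias  # 4 - 2 + 1 = 3
output = max(0.0, z)                               # ReLU: 3 > 0, the neuron fires
print(output)                                      # 3.0

# Heavier rain penalty: w_rain = -5.0 gives z = 4 - 5 + 1 = 0 -> stays quiet.
```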


2. Building the Network: Layers

One neuron can't do much. But when we stack them, magic happens.

The Architecture:

  1. Input Layer: Receives the raw data (e.g., the 784 pixels of a 28 × 28 handwritten digit).
  2. Hidden Layers: The "magic" middle layers. They don't see the input or the output directly. They process features.
    • Layer 1 might detect edges.
    • Layer 2 might detect shapes (circles, lines).
    • Layer 3 might detect objects (eyes, nose).
  3. Output Layer: Gives the final answer (e.g., "This is a Cat").

Deep Learning simply means a neural network with many hidden layers. The "deeper" the network, the more complex patterns it can learn.
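
To make layers concrete, here is one way to set up the weights for such a network in numpy; the layer sizes are illustrative, not prescriptive:

```python
import numpy as np

# 784 input pixels -> two hidden layers -> 10 output classes (digits 0-9).
layer_sizes = [784, 128, 64, 10]

# One weight matrix and one bias vector per jump between layers.
weights = [np.random.randn(n_out, n_in) * 0.01
           for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n_out) for n_out in layer_sizes[1:]]

print([W.shape for W in weights])  # [(128, 784), (64, 128), (10, 64)]
```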


3. Feedforward Neural Networks (FNN)

The simplest type of neural network is the Feedforward Neural Network.

How it works: Information flows in one direction only: forward.

  • Input -> Hidden Layer 1 -> Hidden Layer 2 -> Output.
  • There are no loops or cycles. The output of one layer becomes the input of the next.

Analogy: An assembly line. The raw material (data) moves from station to station, getting processed at each step, until the final product (prediction) rolls off the end.
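
In code, the assembly line is just a loop: each layer's output becomes the next layer's input. A rough, self-contained numpy sketch with random (untrained) weights; a real classifier would usually end with softmax rather than ReLU on the last layer:

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

# Untrained network, 784 -> 128 -> 64 -> 10, random weights as a starting point.
layer_sizes = [784, 128, 64, 10]
weights = [np.random.randn(n_out, n_in) * 0.01
           for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n_out) for n_out in layer_sizes[1:]]

def feedforward(x):
    a = x
    for W, b in zip(weights, biases):
        a = relu(W @ a + b)  # one station of the assembly line
    return a

output = feedforward(np.random.rand(784))  # a fake flattened 28x28 image
print(output.shape)                        # (10,): one score per digit
```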


4. How It Learns: Gradient Descent & Backpropagation

A neural network starts out stupid. It has random weights and makes random guesses. How does it get smart?

Gradient Descent (The Strategy)

Imagine you are on top of a mountain (high error) and want to get to the bottom (low error) in the dark. You feel the slope under your feet and take a step downhill.

  • Gradient: The slope (direction of steepest increase).
  • Descent: Moving opposite to the gradient (downhill).
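
In symbols, every weight takes the same small step against its own gradient, where $L$ is the loss and $\eta$ (the learning rate) controls the step size: $$ w \leftarrow w - \eta \frac{\partial L}{\partial w} $$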

Backpropagation (The Algorithm)

This is the specific algorithm used to calculate that gradient for every single weight in the network.

The Learning Loop (Step-by-Step):

  1. Forward Pass (The Guess): The network takes an input (image of a dog) and guesses "Cat" (Wrong!).

  2. Loss Calculation (The Score): We measure how wrong the guess was using a loss function; a confidently wrong answer means a high loss.
  3. Backward Pass (The Blame Game): We go backward from the output to the input.

    • "Output Neuron, why did you say Cat? Because Hidden Neuron 5 told me to."
    • "Hidden Neuron 5, why were you active? Because Weight 3 was too high."
    • We calculate the Gradient for every weight: "If we increase this weight, does error go up or down?"
  4. Weight Update (The Fix): We nudge every weight slightly in the opposite direction of the gradient to reduce the error.

  5. Repeat: Do this thousands or millions of times until the error is as low as we can get it (it almost never reaches exactly zero).
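
Here is the entire loop on the smallest possible "network": a single linear neuron learning a made-up rule ($y = 2x + 1$), with the gradients worked out by hand using the chain rule. This is a toy sketch of the idea, not how real frameworks implement it:

```python
# Made-up training data following y = 2x + 1; the neuron must discover w=2, b=1.
data = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

w, b = 0.5, 0.0         # random-ish start: the network begins "stupid"
learning_rate = 0.05

for step in range(1000):
    for x, y_true in data:
        y_pred = w * x + b            # 1. forward pass: the guess
        error = y_pred - y_true       # 2. loss is error**2; how wrong were we?
        grad_w = 2 * error * x        # 3. backward pass: dLoss/dw (chain rule)
        grad_b = 2 * error            #    dLoss/db
        w -= learning_rate * grad_w   # 4. nudge weights against the gradient
        b -= learning_rate * grad_b

print(round(w, 2), round(b, 2))       # -> 2.0 1.0 (error is now nearly zero)
```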


Summary & Key Takeaways

  • Neurons: Simple math functions ($wx + b$, passed through an activation) that make small decisions.
  • Layers: Stacks of neurons that learn increasingly complex features.
  • Feedforward: Data flows one way (Input -> Output).
  • Backpropagation: The method of learning from mistakes by propagating the error backward to adjust weights.

What's Next?

Now that we understand the basic building blocks, we can look at specialized architectures. In the next post, we'll explore CNNs (for images), RNNs (for text), and the revolutionary Transformers that power ChatGPT.

Ready to see the tech behind the hype? Subscribe to the series!

Written by Abstract Algorithms (@abstractalgorithms)