Neural Networks
Basics

Updated at 2019-01-27 00:35

A neural network is a network or circuit of neurons. Neural networks are frequently used for predictive modeling.

The perceptron is the original artificial neuron, developed in the 1950s. It's the easiest way to explain what an artificial neuron is. It takes several binary inputs, x1, x2, ..., and produces a single binary output. So, the inputs and outputs are always 0 or 1.

Each perceptron input has a weight associated with it. Weights are real numbers expressing the importance of the respective inputs to the output. A perceptron weight can be well over 1, which is uncommon in modern neural network neurons, where weights are usually kept small. Weight is sometimes called simply "a coefficient".

The terms perceptron and neuron frequently get mixed up. A perceptron is a unit that outputs binary values based on weighted inputs and a threshold. A neuron is a node in a backpropagation-trained artificial neural network, working on floating point values with various activation functions. Still, the terms can be used pretty interchangeably.

A perceptron's output is 1 if the weighted sum of its inputs is greater than some threshold value; otherwise it's 0. In modern networks, the threshold is usually 0 and a bias is added to the weighted sum for better control.

weight * input           >= threshold  => perceptron fires
or nowadays:
f(weight * input + bias) >= 0          => neuron fires
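
A minimal sketch of that rule in Python; the function name and values are illustrative, not from any library:

def perceptron_fires(inputs, weights, bias):
    # Weighted sum of the binary inputs.
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    # Modern convention: threshold 0, with the bias folded into the sum.
    return 1 if weighted_sum + bias >= 0 else 0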

Neuron bias indicates how easy it is to get the neuron to fire: the larger the bias, the more easily it fires. Bias is also frequently called the intercept.

Even perceptrons can be used to create a NAND logic gate, and since NAND is functionally complete, you can use perceptrons to compute any logical function. This means a perceptron network can express the same kind of logic as any other modern computing device.

A perceptron with 2 inputs of weight -2, a threshold of 0 and a bias of 3 creates NAND:
(−2)*0 + (−2)*0 + 3 = 3  = perceptron fires
(−2)*1 + (−2)*0 + 3 = 1  = perceptron fires
(−2)*0 + (−2)*1 + 3 = 1  = perceptron fires
(−2)*1 + (−2)*1 + 3 = -1 = perceptron doesn't fire
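
The same truth table, checked with the perceptron_fires sketch above:

# Check the NAND truth table: fires for every input pair except (1, 1).
for a in (0, 1):
    for b in (0, 1):
        print(a, b, perceptron_fires((a, b), weights=(-2, -2), bias=3))
# prints: 0 0 1 / 0 1 1 / 1 0 1 / 1 1 0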

In f(weight * input + bias), the f() is called the activation function. For example, the sigmoid function is used as the activation function of a sigmoid neuron.
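
A sketch of a sigmoid neuron, reusing the same weighted sum but swapping the hard threshold for the sigmoid activation function:

import math

def sigmoid(z):
    # Smooth squashing function: maps any real number into (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_neuron(inputs, weights, bias):
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    # Unlike a perceptron, the output is a float between 0 and 1.
    return sigmoid(weighted_sum + bias)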

CPUs are optimized for latency, GPUs for parallel throughput. Training neural networks is mostly floating point matrix operations, which is the reason GPUs excel at it.
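
For example, a whole layer of sigmoid neurons reduces to one matrix-vector operation; a sketch assuming NumPy is available, with illustrative shapes and random values:

import numpy as np

# One layer of 3 sigmoid neurons over 2 inputs as a single matrix operation.
inputs = np.array([0.5, -1.0])       # 2 input values
weights = np.random.randn(3, 2)      # one row of weights per neuron
biases = np.random.randn(3)          # one bias per neuron
outputs = 1.0 / (1.0 + np.exp(-(weights @ inputs + biases)))
print(outputs)                       # 3 floats between 0 and 1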
