5.3 Artificial Neural Networks

5.3.1 Artificial Neuron


A neuron as a mathematical function:

Source: https://en.wikipedia.org/wiki/File:Neuron3.png by Egm4313.s12 at English Wikipedia, licensed under the Creative Commons Attribution-Share Alike 3.0 Unported license


The perceptron (Frank Rosenblatt, 1958) was amongst the first models of artificial neurons:
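As a rough illustration, a single artificial neuron can be sketched in base R as a weighted sum of the inputs plus a bias, passed through an activation function. The names `neuron` and `heaviside` and the weights below are hypothetical, chosen only for this example; the step activation is the one classically associated with the perceptron.

```r
# A single artificial neuron: activation(w . x + b).
neuron <- function(x, w, b, activation) {
    activation(sum(w * x) + b)
}

# Heaviside step function: 1 if the weighted sum is non-negative, else 0.
heaviside <- function(t) as.numeric(t >= 0)

# Made-up weights that implement the logical AND of two binary inputs.
w <- c(1, 1)
b <- -1.5

neuron(c(1, 1), w, b, heaviside)  # 1
neuron(c(1, 0), w, b, heaviside)  # 0
```

Replacing `heaviside` with a smooth function changes the kind of neuron without changing the structure of the computation.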

5.3.2 Logistic Regression as a Neural Network


The above resembles our binary logistic regression model!

We determine a linear combination (a weighted sum) of 784 inputs and then transform it using the logistic sigmoid “activation” function.
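A minimal sketch of this computation in base R, assuming randomly generated stand-ins for the 784 inputs and for the weights (in practice, the weights are learned from the training data):

```r
# Logistic sigmoid activation: maps any real number into (0, 1).
sigmoid <- function(t) 1 / (1 + exp(-t))

set.seed(123)
x <- runif(784)             # stand-in for the 784 inputs
w <- rnorm(784, sd = 0.01)  # made-up weights (normally fitted)
b <- 0                      # bias (intercept)

p <- sigmoid(sum(w * x) + b)  # estimated probability of the positive class
as.integer(p > 0.5)           # predicted label: 1 if p exceeds 0.5, else 0
```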


A multiclass logistic regression can be depicted as:


This is an instance of a:

  • single layer (there is only one processing step that consists of 10 units),
  • densely connected (all the inputs are connected to all the neurons),
  • feed-forward (the outputs are computed directly from the inputs; there are no loops in the graph)

artificial neural network that uses the softmax as the activation function.
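The forward pass of such a network can be sketched in base R as one matrix product followed by a row-wise softmax. The weights and inputs below are random placeholders, and the helper name `softmax` is our own; this is an illustration of the structure, not a fitted model.

```r
# Row-wise softmax; subtracting each row's maximum avoids overflow in exp().
softmax <- function(Z) {
    E <- exp(Z - apply(Z, 1, max))
    E / rowSums(E)
}

set.seed(123)
n <- 5; d <- 784; k <- 10
X <- matrix(runif(n * d), nrow = n)             # n inputs, one per row
W <- matrix(rnorm(d * k, sd = 0.01), nrow = d)  # made-up weights, d x k
b <- rep(0, k)                                  # one bias per unit

Z <- sweep(X %*% W, 2, b, "+")  # every input feeds every unit (dense layer)
P <- softmax(Z)                 # n x k matrix of class probabilities
rowSums(P)                      # each row sums to 1
```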

5.3.3 Example in R


To train such a neural network (fit a multinomial logistic regression model), we will use the keras package, a wrapper around the state-of-the-art, GPU-enabled TensorFlow library.


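Predict over the test set and one-hot-decode the output probabilities. Shown below are the predicted probabilities for the first six test images (column j corresponds to digit j-1), followed by two label vectors for the first 20 test images (the decoded predictions and the reference labels), which disagree only at the ninth position: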

##      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## [1,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.99 0.00  0.00
## [2,] 0.01 0.00 0.85 0.02 0.00 0.02 0.09 0.00 0.01  0.00
## [3,] 0.00 0.95 0.01 0.01 0.00 0.00 0.01 0.01 0.01  0.00
## [4,] 1.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00  0.00
## [5,] 0.00 0.00 0.01 0.00 0.88 0.00 0.01 0.02 0.01  0.06
## [6,] 0.00 0.97 0.01 0.01 0.00 0.00 0.00 0.01 0.01  0.00
##  [1] 7 2 1 0 4 1 4 9 6 9 0 6 9 0 1 5 9 7 3 4
##  [1] 7 2 1 0 4 1 4 9 5 9 0 6 9 0 1 5 9 7 3 4
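One-hot decoding amounts to picking, in each row, the column with the highest probability. A small sketch in base R, using the six rows of rounded probabilities printed above (the variable names are ours):

```r
# The six rows of rounded probabilities shown above.
P <- rbind(
    c(0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.99, 0.00, 0.00),
    c(0.01, 0.00, 0.85, 0.02, 0.00, 0.02, 0.09, 0.00, 0.01, 0.00),
    c(0.00, 0.95, 0.01, 0.01, 0.00, 0.00, 0.01, 0.01, 0.01, 0.00),
    c(1.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00),
    c(0.00, 0.00, 0.01, 0.00, 0.88, 0.00, 0.01, 0.02, 0.01, 0.06),
    c(0.00, 0.97, 0.01, 0.01, 0.00, 0.00, 0.00, 0.01, 0.01, 0.00)
)

# Column j corresponds to digit j-1, hence the subtraction.
decoded <- apply(P, 1, which.max) - 1
decoded  # 7 2 1 0 4 1
```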

Accuracy on the test set:

## [1] 0.9081
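Accuracy is simply the fraction of labels that agree. As a sketch, here it is computed on the 20 labels printed earlier, assuming that the first vector holds the predictions and the second the reference labels (the measure is symmetric, so the order does not affect the result). Note that this covers only 20 observations, whereas the figure above refers to the whole test set.

```r
# The two label vectors printed earlier (first 20 test observations).
y_pred <- c(7, 2, 1, 0, 4, 1, 4, 9, 6, 9, 0, 6, 9, 0, 1, 5, 9, 7, 3, 4)
y_true <- c(7, 2, 1, 0, 4, 1, 4, 9, 5, 9, 0, 6, 9, 0, 1, 5, 9, 7, 3, 4)

mean(y_pred == y_true)   # 0.95 (19 of 20 labels agree)
which(y_pred != y_true)  # 9 (the only disagreement)
```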

Performance metrics for each digit separately:

i    Acc      Prec       Rec         F   TN  FN  FP   TP
0 0.9915 0.9365854 0.9795918 0.9576060 8955  20  65  960
1 0.9919 0.9582609 0.9709251 0.9645514 8817  33  48 1102
2 0.9787 0.9191402 0.8701550 0.8939771 8889 134  79  898
3 0.9794 0.8925781 0.9049505 0.8987217 8880  96 110  914
4 0.9813 0.8947368 0.9175153 0.9059829 8912  81 106  901
5 0.9774 0.9193955 0.8183857 0.8659549 9044 162  64  730
6 0.9873 0.9235474 0.9457203 0.9345023 8967  52  75  906
7 0.9812 0.9117647 0.9046693 0.9082031 8882  98  90  930
8 0.9716 0.8429423 0.8706366 0.8565657 8868 126 158  848
9 0.9759 0.8779528 0.8840436 0.8809877 8867 117 124  892
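Each row of the table follows from its four counts alone. A sketch in base R (the helper name `metrics` is hypothetical), checked against the row for digit 0:

```r
# Binary classification metrics from confusion-matrix counts.
metrics <- function(TN, FN, FP, TP) {
    Prec <- TP / (TP + FP)  # fraction of predicted positives that are correct
    Rec  <- TP / (TP + FN)  # fraction of actual positives that are found
    c(Acc  = (TP + TN) / (TP + TN + FP + FN),
      Prec = Prec,
      Rec  = Rec,
      F    = 2 * Prec * Rec / (Prec + Rec))  # harmonic mean of Prec and Rec
}

m <- metrics(TN = 8955, FN = 20, FP = 65, TP = 960)
round(m, 4)  # Acc 0.9915, Prec 0.9366, Rec 0.9796, F 0.9576
```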

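Note how misleading the individual accuracies are! Each digit accounts for only about 10% of the test set, so the true negatives dominate: for example, a classifier that never outputs 5 would still reach an accuracy of (9044 + 64)/10000 ≈ 0.91 for digit 5, despite a recall of 0. Precision, recall, and the F-measure are more informative here. Averages: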

##       Acc      Prec       Rec         F 
## 0.9816200 0.9076904 0.9066593 0.9067053
