5.4 Deep Neural Networks

5.4.1 Introduction


In a brain, a neuron’s output serves as an input to other neurons.

We can mimic this structure by arranging artificial neurons into many interconnected layers, where the outputs of one layer become the inputs of the next.

5.4.2 Activation Functions


Each layer’s outputs should be transformed by some non-linear activation function. Otherwise, we would end up with linear combinations of linear combinations, which are themselves just linear combinations: no matter how many layers we stacked, the whole model would remain linear.

Example activation functions that can be used in hidden (inner) layers:

  • relu – The rectified linear unit: \[\psi(t)=\max(t, 0),\]
  • sigmoid – The logistic sigmoid: \[\phi(t)=1 / (1 + \exp(-t)),\]
  • tanh – The hyperbolic tangent: \[\mathrm{tanh}(t) = (\exp(t) - \exp(-t)) / (\exp(t) + \exp(-t)).\]

There is not much difference between them in terms of the models they can express, but some might be more convenient to handle numerically than others, depending on the implementation.
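These functions are easy to define by hand. The sketch below (base R; the function names are our own, not from any package) also demonstrates why the non-linearity is essential: without it, two layers collapse into a single linear map.

```r
# Hidden-layer activation functions (tanh() is already built into base R)
relu    <- function(t) pmax(t, 0)
sigmoid <- function(t) 1 / (1 + exp(-t))

relu(c(-2, -1, 0, 1, 2))   # 0 0 0 1 2
sigmoid(0)                 # 0.5
tanh(0)                    # 0

# Without an activation function, stacking layers gains nothing:
# (x W1) W2 is the same linear map as x (W1 W2)
set.seed(1)
W1 <- matrix(rnorm(6), 2, 3)
W2 <- matrix(rnorm(6), 3, 2)
x  <- c(1, 2)
all.equal((x %*% W1) %*% W2, x %*% (W1 %*% W2))   # TRUE
```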

5.4.3 Example in R - 2 Layers


A 2-layer neural network with architecture 784-800-10 (784 inputs, one hidden layer of 800 units, 10 output classes). Overall accuracy:

## [1] 0.943
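The notes omit the code that builds this network; as an illustrative sketch (not the actual implementation), here is what the forward pass of such a 784-800-10 architecture computes, using random, untrained weights in base R:

```r
set.seed(42)
# Random weights; a real network would learn these during training
W1 <- matrix(rnorm(784 * 800, sd = 0.01), 784, 800); b1 <- rep(0, 800)
W2 <- matrix(rnorm(800 * 10,  sd = 0.01), 800, 10);  b2 <- rep(0, 10)

relu    <- function(t) pmax(t, 0)
softmax <- function(z) { e <- exp(z - max(z)); e / sum(e) }  # class probabilities

predict_digit <- function(x) {       # x: vector of 784 pixel intensities
  h <- relu(x %*% W1 + b1)           # hidden layer: 800 units
  p <- softmax(h %*% W2 + b2)        # output layer: 10 digit probabilities
  which.max(p) - 1                   # predicted digit, 0..9
}

predict_digit(runif(784))
```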

Performance metrics for each digit separately:

Digit  Accuracy  Precision  Recall     F-measure  TN    FN  FP  TP
0      0.9941    0.9591226  0.9816327  0.9702471  8979  18  41   962
1      0.9951    0.9729965  0.9841410  0.9785370  8834  18  31  1117
2      0.9871    0.9379243  0.9370155  0.9374697  8904  65  64   967
3      0.9862    0.9404040  0.9217822  0.9310000  8931  79  59   931
4      0.9890    0.9386318  0.9501018  0.9443320  8957  49  61   933
5      0.9862    0.9207589  0.9248879  0.9228188  9037  67  71   825
6      0.9898    0.9439834  0.9498956  0.9469303  8988  48  54   910
7      0.9877    0.9440628  0.9357977  0.9399121  8915  66  57   962
8      0.9858    0.9434968  0.9086242  0.9257322  8973  89  53   885
9      0.9850    0.9223206  0.9296333  0.9259625  8912  71  79   938

(Figure: plot of chunk deep23.)

5.4.4 Example in R - 6 Layers


A 6-layer deep neural network with architecture 784-2500-2000-1500-1000-500-10 (five hidden layers, 10 output classes). Overall accuracy:

## [1] 0.973

Performance metrics for each digit separately:

Digit  Accuracy  Precision  Recall     F-measure  TN    FN  FP  TP
0      0.9963    0.9748238  0.9877551  0.9812468  8995  12  25   968
1      0.9969    0.9876325  0.9850220  0.9863255  8851  17  14  1118
2      0.9942    0.9831349  0.9602713  0.9715686  8951  41  17   991
3      0.9941    0.9567723  0.9861386  0.9712335  8945  14  45   996
4      0.9953    0.9804728  0.9714868  0.9759591  8999  28  19   954
5      0.9948    0.9677060  0.9742152  0.9709497  9079  23  29   869
6      0.9956    0.9750520  0.9791232  0.9770833  9018  20  24   938
7      0.9935    0.9868554  0.9494163  0.9677739  8959  52  13   976
8      0.9928    0.9659091  0.9599589  0.9629248  8993  39  33   935
9      0.9925    0.9507722  0.9762141  0.9633252  8940  24  51   985
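Each row of the table can be reproduced from its four confusion-matrix counts alone; for instance, for digit 0 of the 6-layer network:

```r
# Digit 0 of the 6-layer network: counts taken from the table above
TN <- 8995; FN <- 12; FP <- 25; TP <- 968

acc  <- (TP + TN) / (TP + TN + FP + FN)  # accuracy  -> 0.9963
prec <- TP / (TP + FP)                   # precision -> 0.9748238
rec  <- TP / (TP + FN)                   # recall    -> 0.9877551
F1   <- 2 * prec * rec / (prec + rec)    # F-measure -> 0.9812468
c(acc, prec, rec, F1)
```

Note that the counts in each row sum to 10,000, the size of the test sample, since each digit is treated as a separate binary (one-vs-rest) classification problem.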

(Figure: plot of chunk deep63.)