5.6 Outro

5.6.1 Remarks

We have discussed a multinomial logistic regression model as a generalisation of the binary one.

This in turn is a special case of feed-forward neural networks.
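To make the connection concrete, here is a minimal sketch (not from the text; names and the toy data are illustrative) showing that multinomial logistic regression is just a feed-forward network with no hidden layers: an affine map followed by the softmax function.

```python
import numpy as np

def softmax(z):
    # Subtract the row-wise maximum for numerical stability
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def multinomial_logit_forward(X, W, b):
    """One forward pass: an affine transformation followed by softmax.

    This is exactly a feed-forward network with zero hidden layers;
    inserting hidden layers with nonlinear activations in between
    is what generalises it to a deep architecture.
    """
    return softmax(X @ W + b)

# Toy example: 4 observations, 3 features, 2 classes
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))
W = rng.normal(size=(3, 2))
b = np.zeros(2)
P = multinomial_logit_forward(X, W, b)
# Each row of P is a probability distribution over the classes
```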

There’s a lot of hype (again…) for deep neural networks in many applications, including vision, self-driving cars, natural language processing, speech recognition etc.

Many different architectures of neural networks and types of units are being considered in theory and in practice, e.g.:

  • convolutional neural networks apply a series of signal (e.g., image) transformations in their first layers; they may even “discover” operations such as deskewing automatically;
  • recurrent neural networks process sequential data; variants with long short-term memory (LSTM) units are used, e.g., for speech synthesis and time series prediction.

Main drawbacks of deep neural networks:

  • learning is very slow, especially with very deep architectures (days, weeks);
  • models are not explainable (black boxes) and hard to debug;
  • finding good architectures is more art than science (perhaps even more of a craft);
  • sometimes using a deep neural network is just an excuse for being too lazy to do proper data cleansing and pre-processing.

There are many issues and challenges that will be tackled in more advanced AI/ML courses and books, such as (Goodfellow et al. 2016).

5.6.2 Beyond MNIST

The MNIST dataset is a classic, although its use in research is discouraged nowadays: it is no longer considered challenging, as state-of-the-art classifiers reach \(99.8\%\) accuracy.

See Zalando’s Fashion-MNIST (by Kashif Rasul & Han Xiao) at https://github.com/zalandoresearch/fashion-mnist for a modern replacement.

Alternatively, take a look at CIFAR-10 and CIFAR-100 (https://www.cs.toronto.edu/~kriz/cifar.html) by A. Krizhevsky et al. or at ImageNet (http://image-net.org/index) for an even greater challenge.
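Both MNIST and Fashion-MNIST are distributed in the same simple IDX binary format (a big-endian header followed by raw pixel bytes), so code written for one works unchanged on the other. Below is a minimal, hedged sketch of a parser for the image files in that format; the function name is our own, and the final lines build a tiny synthetic file in memory purely for illustration.

```python
import struct
import numpy as np

def read_idx_images(buf):
    """Parse a ubyte IDX3 image file (the format used by MNIST and
    Fashion-MNIST) from a bytes object.

    Header: four big-endian 32-bit integers -- the magic number 2051,
    the number of images, the number of rows, and the number of columns
    -- followed by the raw uint8 pixel data.
    """
    magic, n, rows, cols = struct.unpack(">IIII", buf[:16])
    if magic != 2051:
        raise ValueError("not an IDX3 ubyte image file")
    data = np.frombuffer(buf, dtype=np.uint8, offset=16)
    return data.reshape(n, rows, cols)

# Synthetic two-image, 2x2-pixel file built in memory for illustration:
header = struct.pack(">IIII", 2051, 2, 2, 2)
pixels = bytes(range(8))
images = read_idx_images(header + pixels)
```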

5.6.3 Further Reading

Recommended further reading:

  • (James et al. 2017: Chapter 11)
  • (Goodfellow et al. 2016)