a.k.a. == also known as

w.r.t. == with respect to

s.t. == such that

iff == if and only if

e.g. == for example (Latin: exempli gratia)

i.e. == that is (Latin: id est)

etc. == and so forth (Latin: et cetera)

AI == artificial intelligence

GA == genetic algorithm

GD == gradient descent

GLM == generalised linear model

ML == machine learning

NN == neural network

SGD == stochastic gradient descent

IDE = integrated development environment

Notation Convention – Logic and Set Theory

\(\forall\) – for all

\(\exists\) – exists

By writing \(x \in \{a, b, c\}\) we mean that “\(x\) is in a set that consists of \(a\), \(b\) and \(c\)” or “\(x\) is either \(a\), \(b\) or \(c\)

\(A\subseteq B\) – set \(A\) is a subset of set \(B\) (every element in \(A\) belongs to \(B\), \(x\in A\) implies that \(x\in B\))

\(A\cup B\) – union (sum) of two sets, \(x\in A\cup B\) iff \(x\in A\) or \(x\in B\)

\(A\cap B\) – intersection (sum) of two sets, \(x\in A\cap B\) iff \(x\in A\) and \(x\in B\)

\(A\setminus B\) – difference of two sets, \(x\in A\setminus B\) iff \(x\in A\) and \(x\not\in B\)

\(A\times B\) – Cartesian product of two sets, \(A\times B = \{ (a,b): a\in A, b\in B \}\)

\(A^p = A\times A \times \dots\times A\) (\(p\) times) for any \(p\)

Notation Convention – Symbols

\(\mathbf{X,Y,A,I,C}\) – bold (I use it for denoting vectors and matrices)

\(\mathbb{X,Y,A,I,C}\) – blackboard bold (I sometimes use it for sets)

\(\mathcal{X,Y,A,I,C}\) – calligraphic (I use it for set families = sets of sets)

\(X, x, \mathbf{X}, \mathbf{x}\) – inputs (usually)

\(Y, y, \mathbf{Y}, \mathbf{y}\) – outputs

\(\hat{Y}, \hat{y}, \hat{\mathbf{Y}}, \hat{\mathbf{y}}\) – predicted outputs (usually)

  • \(X\) – independent/explanatory/predictor variable

  • \(Y\) – dependent/response/predicted variable

\(\mathbb{R}\) – the set of real numbers, \(\mathbb{R}=(-\infty, \infty)\)

\(\mathbb{N}\) – the set of natural numbers, \(\mathbb{N}=\{1,2,3,\dots\}\)

\(\mathbb{N}_0\) – the set of natural numbers including zero, \(\mathbb{N}_0=\mathbb{N}\cup\{0\}\)

\(\mathbb{Z}\) – the set of integer numbers, \(\mathbb{Z}=\{\dots,-2,-1,0,1,2,\dots\}\)

\([0,1]\) – the unit interval

\((a, b)\) – an open interval; \(x\in(a,b)\) iff \(a < x < b\) for some \(a< b\)

\([a, b]\) – a closed interval; \(x\in[a,b]\) iff \(a \le x \le b\) for some \(a\le b\)

Notation Convention – Vectors and Matrices

\(\boldsymbol{x}=(x_1,\dots,x_n)\) – a sequence of \(n\) elements (\(n\)-ary sequence/vector)

if it consists of real numbers, we write \(\boldsymbol{x}\in\mathbb{R}^n\)

\(\mathbf{x}=[x_1\ x_2\ \dots\ x_p]\) – a row vector, \(\mathbf{x}\in\mathbb{R}^{1\times p}\) (a matrix with 1 row)

\(\mathbf{x}=[x_1\ x_2\ \dots\ x_n]^T\) – a column vector, \(\mathbf{x}\in\mathbb{R}^{n\times 1}\) (a matrix with 1 column)

\(\mathbf{X}\in\mathbb{R}^{n\times p}\) – matrix with \(n\) rows and \(p\) columns

\[ \mathbf{X}= \left[ \begin{array}{cccc} x_{1,1} & x_{1,2} & \cdots & x_{1,p} \\ x_{2,1} & x_{2,2} & \cdots & x_{2,p} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n,1} & x_{n,2} & \cdots & x_{n,p} \\ \end{array} \right] \]

\(x_{i,j}\) – element in the \(i\)-th row, \(j\)-th column

\(\mathbf{x}_{i,\cdot}\) – the \(i\)-th row of \(\mathbf{X}\)

\(\mathbf{x}_{\cdot,j}\) – the \(j\)-th column of \(\mathbf{X}\)

\[ \mathbf{X}=\left[ \begin{array}{c} \mathbf{x}_{1,\cdot} \\ \mathbf{x}_{2,\cdot} \\ \vdots\\ \mathbf{x}_{n,\cdot} \\ \end{array} \right] = \left[ \begin{array}{cccc} \mathbf{x}_{\cdot,1} & \mathbf{x}_{\cdot,2} & \cdots & \mathbf{x}_{\cdot,p} \\ \end{array} \right]. \]

\[ \mathbf{x}_{i,\cdot} = \left[ \begin{array}{cccc} x_{i,1} & x_{i,2} & \cdots & x_{i,p} \\ \end{array} \right]. \]

\[ \mathbf{x}_{\cdot,j} = \left[ \begin{array}{cccc} x_{1,j} & x_{2,j} & \cdots & x_{n,j} \\ \end{array} \right]^T=\left[ \begin{array}{c} {x}_{1,j} \\ {x}_{2,j} \\ \vdots\\ {x}_{n,j} \\ \end{array} \right], \]

\({}^T\) denotes the matrix transpose; \(\mathbf{B}=\mathbf{A}^T\) is a matrix such that \(b_{i,j}=a_{j,i}\).

\(\|\boldsymbol{x}\| = \|\boldsymbol{x}\|_2 = \sqrt{ \sum_{i=1}^n x_i^2 }\) – the Euclidean norm

Notation Convention – Functions

\(f:X\to Y\) means that \(f\) is a function mapping inputs from set \(X\) (the domain of \(f\)) to elements of \(Y\) (the codomain)

\(x\mapsto x^2\) denotes a (inline) function mapping \(x\) to \(x^2\), equivalent to function(x) x^2 in R

\(\exp x = e^x\) – exponential function with base \(e\simeq 2.718\)

\(\log x\) – natural logarithm (base \(e\))

it holds \(e^x = y\) iff \(\log y = x\)

\(\log ab = \log a + \log b\)

\(\log a^c = c\log a\)

\(\log a/b = \log a - \log b\)

\(\log 1 = 0\)

\(\log e = 1\)

hence \(\log e^x = x\)

Notation Convention – Sums and Products

\(\sum_{i=1}^n x_i=x_1+x_2+\dots+x_n\)

\(\sum_{i=1,\dots,n} x_i\) – the same

\(\sum_{i\in\{1,\dots,n\}} x_i\) – the same

note display (stand-alone) \(\displaystyle\sum_{i=1}^n x_i\) vs text (in-line) \(\textstyle\sum_{i=1}^n x_i\) style

\(\prod_{i=1}^n x_i = x_1 x_2 \dots x_n\)