5.2. Linear Transformations#

5.2.1. What is a Linear Transformation?#

A linear transformation takes an input vector and produces an output vector by multiplying the input by a matrix and then adding a bias vector. (Strictly speaking, adding a bias makes this an affine transformation, but in machine learning the term "linear" is standard.) In machine learning, we use weights (\(W\)) and biases (\(\mathbf{b}\)) to transform data from one space to another.

For an input vector \(\mathbf{x}\) in \(\mathbb{R}^n\), a linear transformation produces an output vector \(\mathbf{y}\) in \(\mathbb{R}^m\) using the equation

(5.18)#\[\begin{equation} \mathbf{y} = W\mathbf{x} + \mathbf{b}, \end{equation}\]

where \(W\) is an \(m \times n\) weight matrix, \(\mathbf{b}\) is an \(m\)-dimensional bias vector, \(\mathbf{x}\) is an \(n\)-dimensional input vector, and \(\mathbf{y}\) is an \(m\)-dimensional output vector.
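The shapes in the equation above can be checked with a small NumPy sketch. The values of \(W\), \(\mathbf{b}\), and \(\mathbf{x}\) below are made up purely for illustration; any \(2 \times 3\) matrix and compatible vectors would do.

```python
import numpy as np

# Hypothetical example values: W is m x n with m = 2, n = 3,
# so the transformation maps R^3 to R^2.
W = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, -1.0]])   # 2 x 3 weight matrix
b = np.array([0.5, -0.5])          # bias vector in R^2
x = np.array([1.0, 2.0, 3.0])      # input vector in R^3

# y = Wx + b: W @ x = [7, -1], plus b gives [7.5, -1.5]
y = W @ x + b
print(y)        # -> [ 7.5 -1.5]
print(y.shape)  # -> (2,): the output lives in R^2
```

Note that the output dimension is determined entirely by the number of rows of \(W\): the input in \(\mathbb{R}^3\) is mapped to \(\mathbb{R}^2\).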

Example 1: 2D to 1D transformation. Consider transforming a 2D input into a 1D output. For input \(\mathbf{x} = [x_1, x_2]^T\) and scalar output \(y\), the transformation is

(5.19)#\[\begin{equation} y = w_1 x_1 + w_2 x_2 + b. \end{equation}\]
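This scalar form is exactly the matrix equation with a \(1 \times 2\) weight matrix \(W = [w_1, w_2]\). A short sketch, using made-up weight and input values, confirms that the two formulations agree:

```python
import numpy as np

# Hypothetical weights, bias, and inputs for the 2D -> 1D case.
w1, w2, b = 2.0, -1.0, 0.5
x1, x2 = 3.0, 4.0

# Scalar form: y = w1*x1 + w2*x2 + b = 6 - 4 + 0.5 = 2.5
y_scalar = w1 * x1 + w2 * x2 + b

# Matrix form: W is 1 x 2, so W @ x is a length-1 vector.
W = np.array([[w1, w2]])
x = np.array([x1, x2])
y_matrix = (W @ x + b)[0]

print(y_scalar, y_matrix)  # both give 2.5
```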

Exercise

If a 4-element array is linearly transformed to a 2-element array, what are the shapes of the matrix \(W\) and vector \(\mathbf{b}\)?