5.1. Transpose, Inverse, and Norm#

5.1.1. Transpose#

The transpose of a matrix is the operation that swaps the rows and columns. For example, consider the \(4\times 3\) matrix \(A\) below

(5.1)#\[\begin{equation} A = \begin{bmatrix} 1 & 2 & 3 \\ 40 & 50 & 60 \\ 8 & 9 & 10 \\ .11 & .12 & .13 \end{bmatrix} \end{equation} \]

The transpose of \(A\), denoted with a superscript \(T\), is

(5.2)#\[\begin{equation} A^T = \begin{bmatrix} 1 & 40 & 8 & .11 \\ 2 & 50 & 9 & .12 \\ 3 & 60 & 10 & .13 \end{bmatrix} \end{equation} \]

Notice that \(A^T\) is now a \(3\times 4\) matrix. In general, if \(A\) is an \(m\times n\) matrix, then \(A^T\) is an \(n\times m\) matrix.

In index notation, the transpose operation swaps the indices:

(5.3)#\[\begin{equation} (A^T)_{ij} = A_{ji} \end{equation}\]

This means that the element in the \(i\)-th row and \(j\)-th column of \(A^T\) equals the element in the \(j\)-th row and \(i\)-th column of the original matrix \(A\).
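
NumPy exposes the transpose of an array through the .T attribute. As a quick illustration, here is the matrix \(A\) from (5.1):

import numpy as np

# The 4x3 matrix A from equation (5.1)
A = np.array([[1,    2,    3],
              [40,   50,   60],
              [8,    9,    10],
              [0.11, 0.12, 0.13]])

print(A.T)        # the 3x4 transpose shown in equation (5.2)
print(A.T.shape)  # (3, 4)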

5.1.1.1. Properties of Matrix Transpose#

The transpose operation has several important properties:

  1. Double transpose: \((A^T)^T = A\)

  2. Transpose of a sum: \((A + B)^T = A^T + B^T\)

  3. Transpose of a scalar multiple: \((cA)^T = cA^T\) for any scalar \(c\)

  4. Transpose of a product: \((AB)^T = B^T A^T\) (note the order reversal)
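
Properties 1-3 are easy to spot-check in NumPy (the matrices below are arbitrary examples); property 4 is proven next and explored in the exercise that follows.

import numpy as np

A = np.array([[1, 2], [3, 4]])  # arbitrary example matrices
B = np.array([[5, 6], [7, 8]])
c = 3

print(np.array_equal(A.T.T, A))              # property 1: double transpose
print(np.array_equal((A + B).T, A.T + B.T))  # property 2: transpose of a sum
print(np.array_equal((c * A).T, c * A.T))    # property 3: scalar multiple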

Property 4 can be proven using index notation. Let \(C\) be the product of \(A\) and \(B\)

(5.4)#\[\begin{equation} C_{ij} = \displaystyle \sum_k A_{ik} B_{kj}. \end{equation} \]

Taking the transpose, i.e., swapping the indices, gives

(5.5)#\[\begin{equation} (C^T)_{ij} = C_{ji} = \displaystyle \sum_k A_{jk} B_{ki} = \displaystyle \sum_k B_{ki} A_{jk} = \displaystyle \sum_k (B^T)_{ik} (A^T)_{kj} \end{equation}\]

By definition of matrix multiplication, this last expression is exactly \((B^T A^T)_{ij}\). Therefore:

(5.6)#\[\begin{equation} (C^T)_{ij} = (B^T A^T)_{ij} \end{equation}\]

Since this holds for all indices \(i\) and \(j\), we conclude that

(5.7)#\[\begin{equation} (AB)^T = B^T A^T. \end{equation}\]

Exercise Use NumPy to verify property 4 above (a numerical check, not a proof). Create two \(3\times3\) arrays \(A\) and \(B\) with np.random.randint, using integers between 0 and 10. Compute the product \(C = AB\). Then compute the transpose of \(C\) in two ways: by taking the transpose of \(C\) directly, and by computing \(B^T A^T\). Verify that both methods give the same result.

import numpy as np

# Set random seed for reproducibility
np.random.seed(42)

# Create two 3x3 arrays with random integers between 0 and 10
A = np.random.randint(0, 11, size=(3, 3))
B = np.random.randint(0, 11, size=(3, 3))

print("Matrix A:")
print(A)
print("\nMatrix B:")
print(B)

# Compute the product C = AB
C = A @ B  # or np.dot(A, B)
print("\nMatrix C = AB:")
print(C)

# Method 1: Direct transpose of C
C_transpose_direct = C.T
print("\nC^T (direct transpose):")
print(C_transpose_direct)

# Method 2: Compute B^T A^T
B_transpose = B.T
A_transpose = A.T
C_transpose_formula = B_transpose @ A_transpose

print("\nB^T:")
print(B_transpose)
print("\nA^T:")
print(A_transpose)
print("\nB^T A^T:")
print(C_transpose_formula)

# Verify that both methods give the same result
are_equal = np.allclose(C_transpose_direct, C_transpose_formula)
print(f"\nAre C^T and B^T A^T equal? {are_equal}")

# Show the difference (should be all zeros or very close to zero)
difference = C_transpose_direct - C_transpose_formula
print("\nDifference (should be all zeros):")
print(difference)

This exercise demonstrates property 4, \((AB)^T = B^T A^T\), through numerical computation. Here np.allclose() is used to check equality; for these integer arrays np.array_equal() would work as well, but np.allclose() is the safer choice whenever floating-point round-off may be involved.


5.1.2. Symmetric Matrix#

A symmetric matrix is one that equals its own transpose. In other words, a matrix \(A\) is symmetric if:

(5.8)#\[\begin{equation} A = A^T \end{equation}\]

This means that \(A_{ij} = A_{ji}\) for all indices \(i\) and \(j\). In geometric terms, the matrix is “mirrored” across its main diagonal.

Examples:

A \(3 \times 3\) symmetric matrix:

(5.9)#\[\begin{equation} A = \begin{bmatrix} 2 & 5 & -1 \\ 5 & 3 & 7 \\ -1 & 7 & 4 \end{bmatrix} \end{equation}\]

Notice that \(A_{12} = A_{21} = 5\), \(A_{13} = A_{31} = -1\), and \(A_{23} = A_{32} = 7\).

Properties of Symmetric Matrices:

  • Only square matrices can be symmetric

  • The diagonal elements can be any values

  • All eigenvalues of a real symmetric matrix are real

  • Symmetric matrices are important in many applications (e.g., covariance matrices, quadratic forms)

Exercise Use NumPy to create a \(4 \times 4\) symmetric matrix and verify that it equals its own transpose. Also check a non-symmetric matrix to confirm the difference.
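
One possible approach (a sketch; the construction \(S + S^T\) is a standard way to build a symmetric matrix, and the random values are arbitrary):

import numpy as np

np.random.seed(42)  # arbitrary seed for reproducibility

# S + S^T is symmetric for any square matrix S
S = np.random.randint(0, 11, size=(4, 4))
A = S + S.T

print("A equals A^T?", np.array_equal(A, A.T))  # True

# A generic random matrix is almost never symmetric
B = np.random.randint(0, 11, size=(4, 4))
print("B equals B^T?", np.array_equal(B, B.T))  # almost surely False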


5.1.3. Identity Matrix and Matrix Inverse#

The analog of the number 1 in matrix calculations is the identity matrix, a square matrix with ones along the main diagonal and zeros everywhere else. For example, the \(3\times3\) identity matrix is

(5.10)#\[\begin{align} \mathbf{I}_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} . \end{align} \]

In general, the \(n \times n\) identity matrix \(\mathbf{I}_n\) has the property that \(\mathbf{I}_n A = A \mathbf{I}_n = A\) for any \(n \times n\) matrix \(A\), just like multiplying a number by 1 leaves it unchanged.
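
A quick NumPy check of this property (the matrix \(A\) below is an arbitrary example):

import numpy as np

I3 = np.eye(3)  # the 3x3 identity matrix
A = np.array([[2, 1, 0],
              [1, 3, 4],
              [0, 4, 5]])  # any 3x3 matrix works here

# Multiplying by the identity leaves A unchanged, on either side
print(np.array_equal(I3 @ A, A))  # True
print(np.array_equal(A @ I3, A))  # True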

Matrix Inverse

There is no matrix division operation (dividing by a matrix). However, there is the matrix inverse, defined by the relationship \(A^{-1}A = AA^{-1} = \mathbf{I}\), where \(A^{-1}\) is the inverse of \(A\). This is analogous to the notation for numbers, e.g., \(3^{-1} \cdot 3 = 1\).

Important notes about matrix inverses:

  • Only square matrices can have inverses

  • Not all square matrices have inverses (they must be non-singular or invertible)

  • If \(A^{-1}\) exists, it is unique

  • \((A^{-1})^{-1} = A\)

  • \((AB)^{-1} = B^{-1}A^{-1}\) (note the order reversal, similar to transpose)
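
The last property can be spot-checked numerically (a sketch; the identity-plus-small-perturbation construction used here to get invertible matrices is the same one explained in the exercise at the end of this subsection):

import numpy as np

np.random.seed(0)  # arbitrary seed for reproducibility

# Identity plus a small perturbation is safely invertible
A = np.eye(3) + 0.1 * np.random.rand(3, 3)
B = np.eye(3) + 0.1 * np.random.rand(3, 3)

lhs = np.linalg.inv(A @ B)
rhs = np.linalg.inv(B) @ np.linalg.inv(A)
print(np.allclose(lhs, rhs))  # True: (AB)^{-1} = B^{-1} A^{-1}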

Example

Consider the \(2 \times 2\) matrix

(5.11)#\[\begin{align} A = \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix}. \end{align}\]

Its inverse is

(5.12)#\[\begin{align} A^{-1} = \begin{bmatrix} 1 & -1 \\ -1 & 2 \end{bmatrix}. \end{align}\]

We can verify this by computing the product

(5.13)#\[\begin{align} A^{-1}A &= \begin{bmatrix} 1 & -1 \\ -1 & 2 \end{bmatrix} \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix} \\ &= \begin{bmatrix} 1(2) + (-1)(1) & 1(1) + (-1)(1) \\ (-1)(2) + 2(1) & (-1)(1) + 2(1) \end{bmatrix} \\ &= \begin{bmatrix} 2 - 1 & 1 - 1 \\ -2 + 2 & -1 + 2 \end{bmatrix} \\ &= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \mathbf{I}_2. \end{align}\]

This confirms that \(A^{-1}\) is the inverse of \(A\).
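
The same verification takes only a few lines in NumPy:

import numpy as np

A = np.array([[2, 1],
              [1, 1]])
A_inv = np.array([[1, -1],
                  [-1, 2]])

# Both products should give the 2x2 identity matrix
print(A_inv @ A)
print(A @ A_inv)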

Exercise Use NumPy to create a \(3\times3\) matrix that is guaranteed to be invertible by starting with the identity matrix and adding small random values. Compute the inverse via np.linalg.inv(). Verify that multiplying the original matrix by the inverse results in the identity matrix.

Steps:

  1. Create a \(3 \times 3\) identity matrix using np.eye(3)

  2. Add small random values (e.g., 0.1 times a random matrix) to ensure invertibility

  3. Compute the matrix inverse

  4. Verify that \(A \cdot A^{-1} = I\) and \(A^{-1} \cdot A = I\)

  5. Check that the results are close to the identity matrix using np.allclose()

This method produces an invertible matrix in practice: the identity matrix has determinant 1, and a sufficiently small perturbation keeps the determinant close to 1, and in particular nonzero.
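
A minimal sketch following these steps (the seed and the 0.1 perturbation scale are arbitrary choices):

import numpy as np

np.random.seed(42)  # arbitrary seed for reproducibility

# Steps 1-2: identity plus small random values
A = np.eye(3) + 0.1 * np.random.rand(3, 3)

# Step 3: compute the inverse
A_inv = np.linalg.inv(A)

# Steps 4-5: both products should be numerically close to the identity
print(np.allclose(A @ A_inv, np.eye(3)))  # True
print(np.allclose(A_inv @ A, np.eye(3)))  # True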


5.1.4. Norms#

The distance between two points is a well-known application of a vector norm: it is the Euclidean, or \(L_2\), norm of the difference between the two points. For example, in 3-dimensional space, consider two points \((x_1, y_1, z_1)\) and \((x_2, y_2, z_2)\). The Euclidean distance between them is

(5.14)#\[\begin{equation} \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2} . \end{equation} \]
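
In NumPy, this distance is the \(L_2\) norm of the difference vector (the points below are arbitrary examples):

import numpy as np

p1 = np.array([1.0, 2.0, 3.0])  # arbitrary example points
p2 = np.array([4.0, 6.0, 3.0])

# Euclidean distance = L2 norm of the difference vector
print(np.linalg.norm(p1 - p2))  # sqrt(3^2 + 4^2 + 0^2) = 5.0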

A norm is a measure of the size (or magnitude) of a vector, and there are many ways to define one. The p-norm of a vector \(\mathbf{x} = [x_1, x_2, \ldots, x_n]^T\) is defined by

(5.15)#\[\begin{equation} \|\mathbf{x}\|_p = \left( \sum_{i=1}^n | x_i |^p \right)^{1/p}. \end{equation} \]

Common Vector Norms

  • \(L_1\) norm (\(p=1\)): \(\|\mathbf{x}\|_1 = \sum_{i=1}^n |x_i|\) (Manhattan distance)

  • \(L_2\) norm (\(p=2\)): \(\|\mathbf{x}\|_2 = \sqrt{\sum_{i=1}^n x_i^2}\) (Euclidean norm)

  • \(L_\infty\) norm (\(p=\infty\)): \(\|\mathbf{x}\|_\infty = \max_i |x_i|\) (Maximum norm)
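
All three are available through np.linalg.norm via its ord argument (the vector below is an arbitrary example):

import numpy as np

x = np.array([3, -4, 5])  # arbitrary example vector

print(np.linalg.norm(x, 1))       # L1 norm: |3| + |-4| + |5| = 12
print(np.linalg.norm(x, 2))       # L2 norm: sqrt(9 + 16 + 25) = 7.07...
print(np.linalg.norm(x, np.inf))  # L-infinity norm: max |x_i| = 5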

Definition of a Norm

A norm on a vector space \(V\) is a function \(\| \cdot \| : V \rightarrow \mathbb{R}\) satisfying:

  1. Positive definiteness: \(\| \mathbf{x} \| > 0\) if \(\mathbf{x} \neq \mathbf{0}\), and \(\| \mathbf{0} \| = 0\)

  2. Homogeneity: \(\|\lambda \mathbf{x} \| = |\lambda| \| \mathbf{x} \|\) for all \(\lambda \in \mathbb{R}\) and \(\mathbf{x} \in V\)

  3. Triangle inequality: \(\| \mathbf{x} + \mathbf{y}\| \leq \|\mathbf{x}\| + \|\mathbf{y}\|\) for all \(\mathbf{x}, \mathbf{y} \in V\)
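
These axioms can be spot-checked numerically for the \(L_2\) norm (the vectors and scalar below are arbitrary; checking examples is not a proof):

import numpy as np

x = np.array([1.0, -2.0, 3.0])  # arbitrary test vectors
y = np.array([4.0, 0.0, -1.0])
lam = -2.5

# Homogeneity: ||lam * x|| = |lam| * ||x||
print(np.isclose(np.linalg.norm(lam * x), abs(lam) * np.linalg.norm(x)))

# Triangle inequality: ||x + y|| <= ||x|| + ||y||
print(np.linalg.norm(x + y) <= np.linalg.norm(x) + np.linalg.norm(y))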

Exercise Verify that if \(p=2\) in the p-norm formula, you recover the Euclidean norm. Show this both algebraically and with a numerical example using a 3D vector.
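
Algebraically, setting \(p=2\) gives \(\|\mathbf{x}\|_2 = \left( \sum_{i=1}^n |x_i|^2 \right)^{1/2} = \sqrt{\sum_{i=1}^n x_i^2}\), since \(|x_i|^2 = x_i^2\) for real entries. A numerical sketch with the vector \([3, 4, 5]^T\):

import numpy as np

x = np.array([3, 4, 5])
print(f"Vector x = {x}")

# Method 1: the p-norm formula with p = 2
p = 2
p_norm = np.sum(np.abs(x) ** p) ** (1 / p)

# Method 2: the Euclidean norm written out directly
euclidean = np.sqrt(np.sum(x ** 2))

# Method 3: NumPy's built-in L2 norm
l2 = np.linalg.norm(x)

print(f"P-norm with p=2: {p_norm:.6f}")
print(f"Euclidean norm:   {euclidean:.6f}")
print(f"NumPy L2 norm:    {l2:.6f}")
print(f"All methods equal? {np.isclose(p_norm, euclidean) and np.isclose(euclidean, l2)}")
print(f"Expected result:  {np.sqrt(50):.6f}")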

Expected Output:

Vector x = [3 4 5]
P-norm with p=2: 7.071068
Euclidean norm:   7.071068
NumPy L2 norm:    7.071068
All methods equal? True
Expected result:  7.071068

This confirms that the p-norm with \(p=2\) is indeed the Euclidean norm.