5.1. Transpose, Inverse, and Norm#

5.1.1. Transpose#

The transpose of a matrix is the operation that swaps the rows and columns. For example, consider the \(4\times 3\) matrix \(A\) below

(5.1)#\[\begin{equation} A = \begin{bmatrix} 1 & 2 & 3 \\ 40 & 50 & 60 \\ 8 & 9 & 10 \\ .11 & .12 & .13 \end{bmatrix} \end{equation} \]

The transpose of \(A\), denoted with a superscript \(T\), is

(5.2)#\[\begin{equation} A^T = \begin{bmatrix} 1 & 40 & 8 & .11 \\ 2 & 50 & 9 & .12 \\ 3 & 60 & 10 & .13 \end{bmatrix} \end{equation} \]

Notice that \(A^T\) is now a \(3\times 4\) matrix. In general, if \(A\) is an \(m\times n\) matrix, then \(A^T\) is an \(n\times m\) matrix.

In index notation, the transpose operation swaps the indices:

(5.3)#\[\begin{equation} (A^T)_{ij} = A_{ji} \end{equation}\]

This means that the element in the \(i\)-th row and \(j\)-th column of \(A^T\) equals the element in the \(j\)-th row and \(i\)-th column of the original matrix \(A\).
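
NumPy exposes the transpose of an array through the .T attribute. As a quick illustration, here is the matrix \(A\) from (5.1):

import numpy as np

# The 4x3 matrix A from equation (5.1)
A = np.array([[1,    2,    3],
              [40,   50,   60],
              [8,    9,    10],
              [0.11, 0.12, 0.13]])

print(A.T)        # the 3x4 transpose shown in equation (5.2)
print(A.T.shape)  # (3, 4)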

5.1.1.1. Properties of Matrix Transpose#

The transpose operation has several important properties:

  1. Double transpose: \((A^T)^T = A\)

  2. Transpose of a sum: \((A + B)^T = A^T + B^T\)

  3. Transpose of a scalar multiple: \((cA)^T = cA^T\) for any scalar \(c\)

  4. Transpose of a product: \((AB)^T = B^T A^T\) (note the order reversal)
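
Properties 1-3 are easy to spot-check in NumPy (the matrices below are arbitrary examples); property 4 is proven next and explored in the exercise that follows.

import numpy as np

A = np.array([[1, 2], [3, 4]])  # arbitrary example matrices
B = np.array([[5, 6], [7, 8]])
c = 3

print(np.array_equal(A.T.T, A))              # property 1: double transpose
print(np.array_equal((A + B).T, A.T + B.T))  # property 2: transpose of a sum
print(np.array_equal((c * A).T, c * A.T))    # property 3: scalar multiple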

Property 4 can be proven using index notation. Let \(C\) be the product of \(A\) and \(B\)

(5.4)#\[\begin{equation} C_{ij} = \displaystyle \sum_k A_{ik} B_{kj}. \end{equation} \]

Taking the transpose, i.e., swapping the indices, gives

(5.5)#\[\begin{equation} (C^T)_{ij} = C_{ji} = \displaystyle \sum_k A_{jk} B_{ki} = \displaystyle \sum_k B_{ki} A_{jk} = \displaystyle \sum_k (B^T)_{ik} (A^T)_{kj} \end{equation}\]

By definition of matrix multiplication, this last expression is exactly \((B^T A^T)_{ij}\). Therefore:

(5.6)#\[\begin{equation} (C^T)_{ij} = (B^T A^T)_{ij} \end{equation}\]

Since this holds for all indices \(i\) and \(j\), we conclude that

(5.7)#\[\begin{equation} (AB)^T = B^T A^T. \end{equation}\]

Exercise Use NumPy to verify property 4 above (a numerical check, not a proof). Create two \(3\times3\) arrays \(A\) and \(B\) with np.random.randint, using integers between 0 and 10. Compute the product \(C = AB\). Then compute the transpose of \(C\) in two ways: by taking the transpose of \(C\) directly, and by computing \(B^T A^T\). Verify that both methods give the same result.

import numpy as np

# Set random seed for reproducibility
np.random.seed(42)

# Create two 3x3 arrays with random integers between 0 and 10
A = np.random.randint(0, 11, size=(3, 3))
B = np.random.randint(0, 11, size=(3, 3))

print("Matrix A:")
print(A)
print("\nMatrix B:")
print(B)

# Compute the product C = AB
C = A @ B  # or np.dot(A, B)
print("\nMatrix C = AB:")
print(C)

# Method 1: Direct transpose of C
C_transpose_direct = C.T
print("\nC^T (direct transpose):")
print(C_transpose_direct)

# Method 2: Compute B^T A^T
B_transpose = B.T
A_transpose = A.T
C_transpose_formula = B_transpose @ A_transpose

print("\nB^T:")
print(B_transpose)
print("\nA^T:")
print(A_transpose)
print("\nB^T A^T:")
print(C_transpose_formula)

# Verify that both methods give the same result
are_equal = np.allclose(C_transpose_direct, C_transpose_formula)
print(f"\nAre C^T and B^T A^T equal? {are_equal}")

# Show the difference (should be all zeros or very close to zero)
difference = C_transpose_direct - C_transpose_formula
print("\nDifference (should be all zeros):")
print(difference)

This exercise demonstrates property 4, \((AB)^T = B^T A^T\), through numerical computation. Here np.allclose() is used to check equality; for these integer arrays np.array_equal() would work as well, but np.allclose() is the safer choice whenever floating-point round-off may be involved.


5.1.2. Symmetric Matrix#

A symmetric matrix is one that equals its own transpose. In other words, a matrix \(A\) is symmetric if:

(5.8)#\[\begin{equation} A = A^T \end{equation}\]

This means that \(A_{ij} = A_{ji}\) for all indices \(i\) and \(j\). In geometric terms, the matrix is “mirrored” across its main diagonal.

Examples:

A \(3 \times 3\) symmetric matrix:

(5.9)#\[\begin{equation} A = \begin{bmatrix} 2 & 5 & -1 \\ 5 & 3 & 7 \\ -1 & 7 & 4 \end{bmatrix} \end{equation}\]

Notice that \(A_{12} = A_{21} = 5\), \(A_{13} = A_{31} = -1\), and \(A_{23} = A_{32} = 7\).

Properties of Symmetric Matrices:

  • Only square matrices can be symmetric

  • The diagonal elements can be any values

  • All eigenvalues of a real symmetric matrix are real

  • Symmetric matrices are important in many applications (e.g., covariance matrices, quadratic forms)

Exercise Use NumPy to create a \(4 \times 4\) symmetric matrix and verify that it equals its own transpose. Also check a non-symmetric matrix to confirm the difference.
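
One possible approach (a sketch; the construction \(S + S^T\) is a standard way to build a symmetric matrix, and the random values are arbitrary):

import numpy as np

np.random.seed(42)  # arbitrary seed for reproducibility

# S + S^T is symmetric for any square matrix S
S = np.random.randint(0, 11, size=(4, 4))
A = S + S.T

print("A equals A^T?", np.array_equal(A, A.T))  # True

# A generic random matrix is almost never symmetric
B = np.random.randint(0, 11, size=(4, 4))
print("B equals B^T?", np.array_equal(B, B.T))  # almost surely False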


5.1.3. Identity Matrix and Matrix Inverse#

The analog of the number 1 in matrix calculations is the identity matrix, a square matrix with ones along the main diagonal and zeros everywhere else. For example, the \(3\times3\) identity matrix is

(5.10)#\[\begin{align} \mathbf{I}_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} . \end{align} \]

In general, the \(n \times n\) identity matrix \(\mathbf{I}_n\) has the property that \(\mathbf{I}_n A = A \mathbf{I}_n = A\) for any \(n \times n\) matrix \(A\), just like multiplying a number by 1 leaves it unchanged.
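
A quick NumPy check of this property (the matrix \(A\) below is an arbitrary example):

import numpy as np

I3 = np.eye(3)  # the 3x3 identity matrix
A = np.array([[2, 1, 0],
              [1, 3, 4],
              [0, 4, 5]])  # any 3x3 matrix works here

# Multiplying by the identity leaves A unchanged, on either side
print(np.array_equal(I3 @ A, A))  # True
print(np.array_equal(A @ I3, A))  # True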

Matrix Inverse

There is no matrix division operation (dividing by a matrix). However, there is the matrix inverse, defined by the relationship \(A^{-1}A = AA^{-1} = \mathbf{I}\), where \(A^{-1}\) is the inverse of \(A\). This is analogous to the notation for numbers, e.g., \(3^{-1} \cdot 3 = 1\).

Important notes about matrix inverses:

  • Only square matrices can have inverses

  • Not all square matrices have inverses (they must be non-singular or invertible)

  • If \(A^{-1}\) exists, it is unique

  • \((A^{-1})^{-1} = A\)

  • \((AB)^{-1} = B^{-1}A^{-1}\) (note the order reversal, similar to transpose)
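
The last property can be spot-checked numerically (a sketch; the identity-plus-small-perturbation construction used here to get invertible matrices is the same one explained in the exercise at the end of this subsection):

import numpy as np

np.random.seed(0)  # arbitrary seed for reproducibility

# Identity plus a small perturbation is safely invertible
A = np.eye(3) + 0.1 * np.random.rand(3, 3)
B = np.eye(3) + 0.1 * np.random.rand(3, 3)

lhs = np.linalg.inv(A @ B)
rhs = np.linalg.inv(B) @ np.linalg.inv(A)
print(np.allclose(lhs, rhs))  # True: (AB)^{-1} = B^{-1} A^{-1}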

Example

Consider the \(2 \times 2\) matrix

(5.11)#\[\begin{align} A = \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix}. \end{align}\]

Its inverse is

(5.12)#\[\begin{align} A^{-1} = \begin{bmatrix} 1 & -1 \\ -1 & 2 \end{bmatrix}. \end{align}\]

We can verify this by computing the product

(5.13)#\[\begin{align} A^{-1}A &= \begin{bmatrix} 1 & -1 \\ -1 & 2 \end{bmatrix} \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix} \\ &= \begin{bmatrix} 1(2) + (-1)(1) & 1(1) + (-1)(1) \\ (-1)(2) + 2(1) & (-1)(1) + 2(1) \end{bmatrix} \\ &= \begin{bmatrix} 2 - 1 & 1 - 1 \\ -2 + 2 & -1 + 2 \end{bmatrix} \\ &= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \mathbf{I}_2. \end{align}\]

This confirms that \(A^{-1}\) is the inverse of \(A\).
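
The same verification takes only a few lines in NumPy:

import numpy as np

A = np.array([[2, 1],
              [1, 1]])
A_inv = np.array([[1, -1],
                  [-1, 2]])

# Both products should give the 2x2 identity matrix
print(A_inv @ A)
print(A @ A_inv)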

Exercise Use NumPy to create a \(3\times3\) matrix that is guaranteed to be invertible by starting with the identity matrix and adding small random values. Compute the inverse via np.linalg.inv(). Verify that multiplying the original matrix by the inverse results in the identity matrix.

Steps:

  1. Create a \(3 \times 3\) identity matrix using np.eye(3)

  2. Add small random values (e.g., 0.1 times a random matrix) to ensure invertibility

  3. Compute the matrix inverse

  4. Verify that \(A \cdot A^{-1} = I\) and \(A^{-1} \cdot A = I\)

  5. Check that the results are close to the identity matrix using np.allclose()

This method produces an invertible matrix in practice: the identity matrix has determinant 1, and a sufficiently small perturbation keeps the determinant close to 1, and in particular nonzero.
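
A minimal sketch following these steps (the seed and the 0.1 perturbation scale are arbitrary choices):

import numpy as np

np.random.seed(42)  # arbitrary seed for reproducibility

# Steps 1-2: identity plus small random values
A = np.eye(3) + 0.1 * np.random.rand(3, 3)

# Step 3: compute the inverse
A_inv = np.linalg.inv(A)

# Steps 4-5: both products should be numerically close to the identity
print(np.allclose(A @ A_inv, np.eye(3)))  # True
print(np.allclose(A_inv @ A, np.eye(3)))  # True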


5.1.4. Norms#

The distance between two points is a well-known application of a vector norm: it is the Euclidean, or \(L_2\), norm of the difference between the two points. For example, in 3-dimensional space, consider two points \((x_1, y_1, z_1)\) and \((x_2, y_2, z_2)\). The Euclidean distance between them is

(5.14)#\[\begin{equation} \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2} . \end{equation} \]
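
In NumPy, this distance is the \(L_2\) norm of the difference vector (the points below are arbitrary examples):

import numpy as np

p1 = np.array([1.0, 2.0, 3.0])  # arbitrary example points
p2 = np.array([4.0, 6.0, 3.0])

# Euclidean distance = L2 norm of the difference vector
print(np.linalg.norm(p1 - p2))  # sqrt(3^2 + 4^2 + 0^2) = 5.0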

A norm is a measure of the size (or magnitude) of a vector, and there are many ways to define one. The p-norm of a vector \(\mathbf{x} = [x_1, x_2, \ldots, x_n]^T\) is defined by

(5.15)#\[\begin{equation} \|\mathbf{x}\|_p = \left( \sum_{i=1}^n | x_i |^p \right)^{1/p}. \end{equation} \]

Common Vector Norms

  • \(L_1\) norm (\(p=1\)): \(\|\mathbf{x}\|_1 = \sum_{i=1}^n |x_i|\) (Manhattan distance)

  • \(L_2\) norm (\(p=2\)): \(\|\mathbf{x}\|_2 = \sqrt{\sum_{i=1}^n x_i^2}\) (Euclidean norm)

  • \(L_\infty\) norm (\(p=\infty\)): \(\|\mathbf{x}\|_\infty = \max_i |x_i|\) (Maximum norm)
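
All three are available through np.linalg.norm via its ord argument (the vector below is an arbitrary example):

import numpy as np

x = np.array([3, -4, 5])  # arbitrary example vector

print(np.linalg.norm(x, 1))       # L1 norm: |3| + |-4| + |5| = 12
print(np.linalg.norm(x, 2))       # L2 norm: sqrt(9 + 16 + 25) = 7.07...
print(np.linalg.norm(x, np.inf))  # L-infinity norm: max |x_i| = 5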

Definition of a Norm

A norm on a vector space \(V\) is a function \(\| \cdot \| : V \rightarrow \mathbb{R}\) satisfying:

  1. Positive definiteness: \(\| \mathbf{x} \| > 0\) if \(\mathbf{x} \neq \mathbf{0}\), and \(\| \mathbf{0} \| = 0\)

  2. Homogeneity: \(\|\lambda \mathbf{x} \| = |\lambda| \| \mathbf{x} \|\) for all \(\lambda \in \mathbb{R}\) and \(\mathbf{x} \in V\)

  3. Triangle inequality: \(\| \mathbf{x} + \mathbf{y}\| \leq \|\mathbf{x}\| + \|\mathbf{y}\|\) for all \(\mathbf{x}, \mathbf{y} \in V\)
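
These axioms can be spot-checked numerically for the \(L_2\) norm (the vectors and scalar below are arbitrary; checking examples is not a proof):

import numpy as np

x = np.array([1.0, -2.0, 3.0])  # arbitrary test vectors
y = np.array([4.0, 0.0, -1.0])
lam = -2.5

# Homogeneity: ||lam * x|| = |lam| * ||x||
print(np.isclose(np.linalg.norm(lam * x), abs(lam) * np.linalg.norm(x)))

# Triangle inequality: ||x + y|| <= ||x|| + ||y||
print(np.linalg.norm(x + y) <= np.linalg.norm(x) + np.linalg.norm(y))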

Exercise Verify that if \(p=2\) in the p-norm formula, you recover the Euclidean norm. Show this both algebraically and with a numerical example using a 3D vector.
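
Algebraically, setting \(p=2\) gives \(\|\mathbf{x}\|_2 = \left( \sum_{i=1}^n |x_i|^2 \right)^{1/2} = \sqrt{\sum_{i=1}^n x_i^2}\), since \(|x_i|^2 = x_i^2\) for real entries. A numerical sketch with the vector \([3, 4, 5]^T\):

import numpy as np

x = np.array([3, 4, 5])
print(f"Vector x = {x}")

# Method 1: the p-norm formula with p = 2
p = 2
p_norm = np.sum(np.abs(x) ** p) ** (1 / p)

# Method 2: the Euclidean norm written out directly
euclidean = np.sqrt(np.sum(x ** 2))

# Method 3: NumPy's built-in L2 norm
l2 = np.linalg.norm(x)

print(f"P-norm with p=2: {p_norm:.6f}")
print(f"Euclidean norm:   {euclidean:.6f}")
print(f"NumPy L2 norm:    {l2:.6f}")
print(f"All methods equal? {np.isclose(p_norm, euclidean) and np.isclose(euclidean, l2)}")
print(f"Expected result:  {np.sqrt(50):.6f}")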

Expected Output:

Vector x = [3 4 5]
P-norm with p=2: 7.071068
Euclidean norm:   7.071068
NumPy L2 norm:    7.071068
All methods equal? True
Expected result:  7.071068

This confirms that the p-norm with \(p=2\) is indeed the Euclidean norm.