Matrix Multiplication

3.4. Matrix Multiplication#

draft version - do not distribute

Consider two matrices \( A \) and \( B \):

(3.13)#\[\begin{equation} A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}, \quad B = \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} \end{equation} \]

The product \(C = A B \) is given by:

(3.14)#\[\begin{equation} C = \begin{bmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{bmatrix} = \begin{bmatrix} a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} \\ a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22} \end{bmatrix} \end{equation} \]

Each element of \(C\) is

(3.15)#\[\begin{equation} c_{ij} = \sum_{k=1}^2 a_{ik} b_{kj} \tag{1.1} \end{equation} \]

3.4.1. Zero-Based Indexing#

Python is zero based indexing/numbering. Therefore we are inclined to write our matrix elements and summations in zero-based indexing format.

(3.16)#\[\begin{equation} A = \begin{bmatrix} a_{00} & a_{01} \\ a_{10} & a_{11} \end{bmatrix}, \quad B = \begin{bmatrix} b_{00} & b_{01} \\ b_{10} & b_{11} \end{bmatrix} \end{equation} \]

The product \(C = A B \) is given by:

(3.17)#\[\begin{equation} C = \begin{bmatrix} c_{00} & c_{01} \\ c_{10} & c_{11} \end{bmatrix} = \begin{bmatrix} a_{00}b_{00} + a_{01}b_{10} & a_{00}b_{01} + a_{01}b_{11} \\ a_{01}b_{00} + a_{11}b_{10} & a_{10}b_{01} + a_{11}b_{11} \end{bmatrix} \end{equation} \]

Each element of \(C\) is

(3.18)#\[\begin{equation} c_{ij} = \displaystyle \sum_{k=0}^1 a_{ik} b_{kj} \tag{1.2} \end{equation} \]

3.4.2. problem#

Consdier the two matrices

(3.19)#\[\begin{equation} A = \begin{bmatrix} 1 & 3 \\ 2 & -8 \end{bmatrix}, \quad B = \begin{bmatrix} 6 & 10 \\ 4 & -3 \end{bmatrix} . \end{equation} \]

Using nested loops, code matrix multiplication of \(A\) and \(B\) above from scratch. Compare your result to NumPy. Before you begin to code, write out a psuedocode (always do this).

The key equation in this exercise is eqn. 1.1. \(c_{ij} = \sum_{k=0}^1 a_{ik} b_{kj} \tag{1.2}\). The steps are:

initialize the matrix \(C\) as an array of zeros, with the intentions of assigning values to each element through a loop
create an outer-loop over the rows of \(C\), indexed by i
create an inner-loop over the columns of \(C\), indexed by j
create another inner-loop, indexed by k, that computes \(c_{ij} = \sum_{k=0}^1 a_{ik} b_{kj} \tag{1.2}\), which we can write in a more Pythonic way as \begin{equation} C[i, j] = \sum_{k=0}^{1} A[i, k] * B[k,j] \end{equation}

import numpy as np

A = np.array([[1, 3], [2, -8]]) 
B = np.array([[6, 10], [4, -3]]) 
print(A) 
print(B) 

[[ 1  3]
 [ 2 -8]]
[[ 6 10]
 [ 4 -3]]

n=2
C=np.zeros((n,n))

for i in range(n): 
    for j in range(n): 
        for k in range(n): 

            C[i,j]+=A[i,k]*B[k,j]

# results from direct loop calculation 
C

array([[ 18.,   1.],
       [-20.,  44.]])

Compare your results to NumPy. NumPy has two was to multiply matrices, np.dot, np.matmul. Additionally you can use the @ symbol.

np.dot(A,B) 

array([[ 18,   1],
       [-20,  44]])

np.matmul(A,B) 

array([[ 18,   1],
       [-20,  44]])

A@B

array([[ 18,   1],
       [-20,  44]])

3.4.3. Multiplication in General#

A good way to view matrix multiplication is as taking the dot product of the rows of one matrix with the columns of another vector.

A general \(m \times n\) matrix \( A \) can be expressed as a collection of its row vectors:

(3.20)#\[\begin{equation} A = \begin{bmatrix} \mathbf{r}_0 \\ \mathbf{r}_1 \\ \vdots \\ \mathbf{r}_{m-1} \end{bmatrix} \end{equation}\]

where each row vector \(\mathbf{r}_i\) is given by:

(3.21)#\[\begin{equation} \mathbf{r}_i = \begin{bmatrix} a_{i0} & a_{i1} & \dots & a_{i(n-1)} \end{bmatrix} \end{equation} \]

For example, a \( 3 \times 3 \) matrix using zero-based indexing can be written as:

(3.22)#\[\begin{equation} A = \begin{bmatrix} \mathbf{r}_0 \\ \mathbf{r}_1 \\ \mathbf{r}_2 \end{bmatrix} = \begin{bmatrix} a_{00} & a_{01} & a_{02} \\ a_{10} & a_{11} & a_{12} \\ a_{20} & a_{21} & a_{22} \end{bmatrix} \end{equation} \]

Likewise a general \(m \times n\) matrix \( A \) can be expressed as a collection of its column vectors:

(3.23)#\[\begin{matrix} B = \begin{bmatrix} \mathbf{c}_0 & \mathbf{c}_1 & \dots & \mathbf{c}_{n-1} \end{bmatrix} \end{matrix} \]

where each column vector \(\mathbf{c}_j \) is given by:

(3.24)#\[\begin{equation} \mathbf{c}_j = \begin{bmatrix} b_{0j} \\ b_{1j} \\ \vdots \\ b_{(m-1)j} \end{bmatrix} \end{equation} \]

For example, a \(3 \times 3 \) matrix using zero-based indexing can be written as:

(3.25)#\[\begin{equation} B = \begin{bmatrix} \mathbf{c}_0 & \mathbf{c}_1 & \mathbf{c}_2 \end{bmatrix} = \begin{bmatrix} b_{00} & b_{01} & b_{02} \\ b_{10} & b_{11} & b_{12} \\ b_{20} & b_{21} & b_{22} \end{bmatrix} \end{equation}\]

In terms of rows and columns the matrix product \(C=AB\) has matrix elements which are the dot product of the rows of \(A\) with the columns of \(B\).

(3.26)#\[\begin{equation} C_{ij}= \mathbf{r}_i \cdot \mathbf{c}_j \end{equation} \]

This approach helps us to understand that the number of rows of matrix \(A\) must equal the number of columns of matrix \(B\) for matrix multiplication to make sense, otherwise, you would not be able to take the dot product of the rows and columns.

To multiply a \(m\times p\) matrix \(A\) with a \(q\times n\) matrix \(B\), \(p\) must be equal to \(q\). The result is a \(m \times n\) matrix.

3.4.4. problem#

With the code you just wrote, write function called mat_mul, that returns the matrix multiplication of two matrices. Your function will look something like this.

def mat_mul(A,B): 
    '''paramters: 
        A, an array
        B, an array 
       returns: 
       the matrix product of A and B
    ''' 

Make sure your function raises an error if matrix multiplication is not possible, with an explanation of the error. Test your function on randomly generated data, using a suitable random function from NumPy. Compare your code to NumPy matrix multiplication. One way to get randomly generated data is to use the np.random.rand function, which generates pseudo-random numbers between 0 and 1. Here is a code snippet to obtain a random \(3×4\) matrix.

A=np.random.rand(3,4)

The purpose of this exercise is to make sure you know how matrix multiplication works. You should not use your from-scratch code for matrix multiplication, that would be highly inefficient.

def mat_mul(A,B): 
    '''paramters: 
        A, an array
        B, an array 
       returns: 
       the matrix product of A and B
    ''' 
    m,p = A.shape
    q,n = B.shape 

    # check that number of columns of A equal number of rows of B
    if p != q:
        raise ValueError("The number of rows of A does not equal the number of columns of B.")

    C=np.zeros((m,n))

    for i in range(m):
        for j in range(n): 
            for k in range(p): 
                C[i,j]+=A[i,k]*B[k,j] 
    return C