2.3. Psuedo Code Exercise#

Writing code from psuedo code is very important, it will be one of the major themes of this course. In your career you might read a research paper with an algorithm in psuedo code, you need to have the ability to write that algoritm into code. Or, perhaps, you need to explain an algorithm to a group of people who do not all code in the same programing language. You will need your explanation in psuedo code. More than any other reason, writing code from psuedo code forces you to think about how the algorithm works, hence it will boost your intution for the method.

In this note, instead of focusing on a traditional machine learning algorithm, we will look at what is called a random walk. The benefit to studying the the random walk is vast, a few reasons are: 1) it is simple and does not need much math, and it has much applicaiton to real world phenomena.

2.3.1. Random Walk#

If you are unfamiliar with some of the Python functionality needed, for example random numbers and arrays, then jump ahead to those relevant chapters.

In this exercise we will look at a two-dimensional random walk. A random walk is a mathematical model that describes a path consisting of a succession of random steps. In a 2D random walk, a particle starts at the origin and at each time step moves one unit in a randomly chosen direction (up, down, left, or right).

Random walks have applications in diffusion modeling, for example in physics (Brownian motion), biology (animal foraging patterns), and financial modeling (stock price movements). They help us understand how particles spread out over time and model various stochastic processes in nature.

2.3.1.1. NumPy’s Random Choice#

We will need to generate steps of unit length. To do so, we can use the random choice function in NumPy. The random choice function chooses a random element from a multi-element data structure, for example, a list or an array. Here is an example of randomly choosing between -1 and 1 (representing steps of unit length in either direction):

steps = [-1, 1]
result = np.random.choice(steps)
print(result)

Each time you call random choice, you will get a different result, unless you have fixed the random seed. You could observe this by using a loop:

for i in range(10):
    result = np.random.choice(steps)
    print(result)

By default, the random choice function assigns equal probability to each outcome. You can weight each element by a probability. For example, let us say that we want 65% of the outcomes to be 1 and 35% to be -1. Here is code to do that:

steps = [-1, 1]
probability = [0.35, 0.65]
result = np.random.choice(steps, p=probability)
print(result)

Exercise

Write code to compute 10,000 outcomes of random choices between -1 and 1, weighted with respective probabilities of 30% and 70%. Collect the results into an array of size 10,000. Count the number of occurrences of -1 and 1 and find the experimental fractions for each occurrence, i.e., the number of occurrences divided by the total number of samples. Your results should be close to 0.3 and 0.7.


Below is pseudocode for a two-dimensional random walk. Using the pseudocode, write a Python function to generate random walk data.

ALGORITHM: Two-Dimensional Random Walk
INPUT: n_steps (number of steps to take)
OUTPUT: array of positions with shape (n_steps, 2)

// Initialize starting position and storage array
current_position = [0, 0]
positions = array of size (n_steps, 2)
positions[0] = current_position

// Take n_steps-1 additional steps
FOR i = 1 to n_steps-1
    // Generate random step in x and y directions
    step_x = random choice from [-1, 1]
    step_y = random choice from [-1, 1]
    
    // Update current position
    current_position[0] = current_position[0] + step_x
    current_position[1] = current_position[1] + step_y
    
    // Store the new position
    positions[i] = current_position
END FOR

RETURN positions

Part 1: Implement this pseudocode in Python using NumPy. Your function should take n_steps as input and return a NumPy array of shape (n_steps, 2) containing the path of the random walk.

Part 2: Once you have a working function, use it to generate multiple random walks (an ensemble of random walks) and observe the diffusion process. Create a function that generates n_walks different random walks, each with n_steps steps. Let us call the output of this function walks (an n_walks by n_steps by 2 NumPy array).



import numpy as np
import matplotlib.pyplot as plt
import matplotlib

from matplotlib import animation
matplotlib.rcParams['animation.writer'] = 'pillow'
import seaborn as sns
from IPython.display import HTML

# external script (not included in published notes)
from random_walk_script import multiple_random_walks

Exercise With your code generate walks for 1000 walkers for 1000 steps.

n_steps=1000
n_walkers=1000

walks = multiple_random_walks(n_steps, n_walkers)
walks.shape
(1000, 1000, 2)

Single Walker Plot

With your working code, plot the walk of a single walker. Here is a code.

i_walker = 10

fig = plt.figure(figsize=(4,4))
ax=fig.add_subplot()
ax.plot(walks[i_walker,:,0], walks[i_walker,:,1], "-o", markersize=1, color="black") 
ax.plot(walks[i_walker,0,0], walks[i_walker,0,1], "o", markersize=2, color="red") 
[<matplotlib.lines.Line2D at 0x7f2ca135a890>]
../_images/e047af74ff12112e930a1a0f0514f99d1b1d3017a78fd38b8e10617baf0819f0.png

Animation With your working code, you can use the animation script below to create an animation of the random walk output. Note: there are some visual artifacts, related to the fact that the steps are on a grid in increments of 1.

L = np.max(walks)
fig = plt.figure(figsize=(4,4))
ax = fig.add_subplot()
ax.set_xlim(-L,L)
ax.set_ylim(-L,L)
pts, = ax.plot([], [], "o", markersize=1, color = "black", alpha = .5) 
plt.close()

def init():
    pts.set_data([], []) 
    return pts, 

def draw(i): 
    x = walks[:,i,0]
    y = walks[:,i,1]
    pts.set_data(x,y)
    return pts, 

anim = animation.FuncAnimation(fig, draw, frames=walks.shape[1], interval=50, blit=True)

# Save the animation as a GIF
anim.save('random_walk_animation.gif', writer='pillow', fps=20, dpi=80)
    
# # Display the saved GIF
# from IPython.display import Image
# Image('random_walk_animation.gif')
    
---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)
/tmp/ipykernel_2200/575277204.py in <cell line: 0>()
     20 
     21 # Save the animation as a GIF
---> 22 anim.save('random_walk_animation.gif', writer='pillow', fps=20, dpi=80)
     23 
     24 # # Display the saved GIF

/opt/hostedtoolcache/Python/3.11.13/x64/lib/python3.11/site-packages/matplotlib/animation.py in save(self, filename, writer, fps, dpi, codec, bitrate, extra_args, metadata, extra_anim, savefig_kwargs, progress_callback)
   1096         # callback a no-op; canvas.manager = None prevents resizing the GUI
   1097         # widget (both are likewise done in savefig()).
-> 1098         with (writer.saving(self._fig, filename, dpi),
   1099               cbook._setattr_cm(self._fig.canvas, _is_saving=True, manager=None)):
   1100             if not writer._supports_transparency():

/opt/hostedtoolcache/Python/3.11.13/x64/lib/python3.11/contextlib.py in __exit__(self, typ, value, traceback)
    142         if typ is None:
    143             try:
--> 144                 next(self.gen)
    145             except StopIteration:
    146                 return False

/opt/hostedtoolcache/Python/3.11.13/x64/lib/python3.11/site-packages/matplotlib/animation.py in saving(self, fig, outfile, dpi, *args, **kwargs)
    224                 yield self
    225             finally:
--> 226                 self.finish()
    227 
    228 

/opt/hostedtoolcache/Python/3.11.13/x64/lib/python3.11/site-packages/matplotlib/animation.py in finish(self)
    504 
    505     def finish(self):
--> 506         self._frames[0].save(
    507             self.outfile, save_all=True, append_images=self._frames[1:],
    508             duration=int(1000 / self.fps), loop=0)

/opt/hostedtoolcache/Python/3.11.13/x64/lib/python3.11/site-packages/PIL/Image.py in save(self, fp, format, **params)
   2569 
   2570         try:
-> 2571             save_handler(self, fp, filename)
   2572         except Exception:
   2573             if open_fp:

/opt/hostedtoolcache/Python/3.11.13/x64/lib/python3.11/site-packages/PIL/GifImagePlugin.py in _save_all(im, fp, filename)
    790 
    791 def _save_all(im: Image.Image, fp: IO[bytes], filename: str | bytes) -> None:
--> 792     _save(im, fp, filename, save_all=True)
    793 
    794 

/opt/hostedtoolcache/Python/3.11.13/x64/lib/python3.11/site-packages/PIL/GifImagePlugin.py in _save(im, fp, filename, save_all)
    803         im.encoderinfo.setdefault("optimize", True)
    804 
--> 805     if not save_all or not _write_multiple_frames(im, fp, palette):
    806         _write_single_frame(im, fp, palette)
    807 

/opt/hostedtoolcache/Python/3.11.13/x64/lib/python3.11/site-packages/PIL/GifImagePlugin.py in _write_multiple_frames(im, fp, palette)
    695             if im_frames and previous_im:
    696                 # delta frame
--> 697                 delta, bbox = _getbbox(previous_im, im_frame)
    698                 if not bbox:
    699                     # This frame is identical to the previous frame

/opt/hostedtoolcache/Python/3.11.13/x64/lib/python3.11/site-packages/PIL/GifImagePlugin.py in _getbbox(base_im, im_frame)
    650         base_im = base_im.convert("RGBA")
    651     delta = ImageChops.subtract_modulo(im_frame, base_im)
--> 652     return delta, delta.getbbox(alpha_only=False)
    653 
    654 

/opt/hostedtoolcache/Python/3.11.13/x64/lib/python3.11/site-packages/PIL/Image.py in getbbox(self, alpha_only)
   1397 
   1398         self.load()
-> 1399         return self.im.getbbox(alpha_only)
   1400 
   1401     def getcolors(

KeyboardInterrupt: 

matplotlib.rcParams['animation.writer'] = 'ffmpeg' 
matplotlib.rcParams['animation.codec'] = 'h264'

#HTML(anim.to_html5_video())
HTML(anim.to_jshtml())

Final Position Analysis

If you plot the histogram of the final positions in each direction, you will observe a normal distribution (bell curve). This is an example of the Central Limit Theorem. Notes on the Central Limit Theorem and random walks are here. Increase the number of walkers, and you will see a distribution more like the normal distribution.

Complexity Explorer has video on the central limit theorem and random walks here.

Below is code to create these plots using the kernel density estimation functionality in Seaborn.

# Extract final positions
final_x_positions = walks[:, -1, 0]  # Final x-coordinates
final_y_positions = walks[:, -1, 1]  # Final y-coordinates

# Create combined histogram and KDE plots
fig = plt.figure(figsize=(14, 4))

# X positions: histogram + KDE
ax1 = fig.add_subplot(1, 2, 1)
ax1.hist(final_x_positions, bins=30, edgecolor="black", alpha=0.6, density=True, label='Histogram')
sns.kdeplot(final_x_positions, fill=False, linewidth=2, label='KDE', ax=ax1)
ax1.set_title('Final X Positions', fontsize=20)
ax1.set_xlabel('X Position', fontsize=20)
ax1.set_ylabel('Density', fontsize=20)
ax1.legend()

# Y positions: histogram + KDE
ax2 = fig.add_subplot(1, 2, 2)
ax2.hist(final_y_positions, bins=30, edgecolor="black", alpha=0.6, density=True, label='Histogram')
sns.kdeplot(final_y_positions, fill=False, linewidth=2, label='KDE', ax=ax2)
ax2.set_title('Final Y Positions', fontsize=20)
ax2.set_xlabel('Y Position', fontsize=20)
ax2.set_ylabel('Density', fontsize=20)
ax2.legend()

plt.tight_layout()
plt.show()

# 2D KDE plot
plt.figure(figsize=(8, 6))
sns.kdeplot(x=final_x_positions, y=final_y_positions, fill=True, alpha=0.7)
plt.title('2D KDE of Final Positions', fontsize=20)
plt.xlabel('Final X Position', fontsize=20)
plt.ylabel('Final Y Position', fontsize=20)
plt.axis('equal')
plt.show()
../_images/77cfcc4762ffb30b27f0c2a98157faa64277735c0f0d30f925f68deebf44b6d1.png ../_images/e3f3df82bcb5afded8b885280720fca3d87aee9e933a0bf53b899c1b1531e328.png

Things to Explore

Here are some things to explore if you are curious:

  • Update your code so that you can set unequal probabilities for steps in the x and y directions.

  • Instead of unit length steps, sample from a normal distribution to create a Gaussian random walk.