3. Data Lab 2: Python data structures exercises

If you had succesfully completed the workshop assignments yesterday, you should now have:

  • Your Python IDE set-up (Anaconda and Jupyter Notebooks)
  • Familiarity with Python syntax
  • A basic understanding of variable assignment, conditionals, loops and functions in Python.
  • Familiarity with Python data types and data structures
  • Familiarity with Python pandas
  • Familiarity with Python numpy

In today's data lab, you will merge these concepts and solve the following use-cases using Python.

1. RVIM Vaccine Registration

The following table indicates the vaccination schedule for people who are not in medically at-risk or high-risk groups, who live at home and can access the vaccination location on their own, and who do not work in healthcare

Year of Birth Vaccine registration Vaccine Location
1931 or earlier From 25 January 2021 BioNTech Groningen
1932 - 1936 From 29 January 2021 Pfizer Arnhem
1937 - 1941 From 5 February 2021 Pfizer Breda
1942 - 1946 From 6 March 2021 Moderna Harlingen
1947 - 1951 From 6 April 2021 Moderna Edam
1952 - 1955 From 15 April 2021 AstraZeneca Amsterdam
1956 - 1957 From 15 February 2021 AstraZeneca Sittard
1958 - 1960 From 15 April 2021 Moderna Rotterdam
1961 - 1971 From 27 April 2021 Janssen Groningen
1972 - 1981 From early June 2021 Janssen Arnhem
1982 - 1991 From mid-June 2021 Moderna Breda
1992 or later From mid-June 2021 BioNTech Maastricht

Using Jupyter notebooks, write a Python function which prints out the vaccine registration date, vaccine and location for a user supplied year of birth. Use a Pandas dataframe to store and retrieve data.

2. Titanic data investigation using Pandas.

In this exercise, you will use Python pandas to investigate the Titanic dataset to answer the following questions:

  1. Which class had a higher chance of surviving the disaster?
  2. Which gender has a higher change of surviving the disaster?
  3. Group the data into age groups (young, adult, old) and find which age group had the highest chance of surviving?

Note: Use https://www.kaggle.com/c/titanic/data to download and explore the dataset.

Only read in train.csv

3. Pythoshop:

  • Create a numpy array heart with the following values as shown below:
import numpy as np
import matplotlib.pyplot as plt

heart_img = np.array([[255,0,0,255,0,0,255],
              [0,255/2,255/2,0,255/2,255/2,0],
          [0,255/2,255/2,255/2,255/2,255/2,0],
          [0,255/2,255/2,255/2,255/2,255/2,0],
              [255,0,255/2,255/2,255/2,0,255],
                  [255,255,0,255/2,0,255,255],
                  [255,255,255,0,255,255,255]])
  • visualize the array using the helper function
def show_image(image, img_title):
  plt.imshow(image, cmap="gray")
  plt.title(img_title)
  plt.show()

Can you figure out how the NumPy array heart_img creates that image you see? Take some time to think.

  • subtract 255 from each value and observe what happens to the image when you plot it
  • save the modified array as a new numpy array broken_heart
  • display the modified array along with the original array.

4. Pixel art & NFTs!

Create your own 10X10 greyscale pixel art using numpy arrays. For an additional challenged, try to create an animation.

An animation (gif) is just a sequence of image frames moving in rapid succesion.

If you are now curious as to how a computer represents colour images, please watch the following video

We will cover this topic in more detail on Monday! Have a nice weekend :)

Bonus assignment:

If you finish early, please watch the following video: