3. Data Lab 2: Python data structures exercises
If you successfully completed yesterday's workshop assignments, you should now have:
- Your Python IDE set up (Anaconda and Jupyter Notebooks)
- Familiarity with Python syntax
- A basic understanding of variable assignment, conditionals, loops and functions in Python.
- Familiarity with Python data types and data structures
- Familiarity with Python pandas
- Familiarity with Python numpy
In today's data lab, you will combine these concepts to solve the following use cases using Python.
1. RVIM Vaccine Registration
The following table shows the vaccination schedule for people who are not in medically at-risk or high-risk groups, who live at home and can reach the vaccination location on their own, and who do not work in healthcare.
| Year of Birth | Vaccine registration | Vaccine | Location |
|---|---|---|---|
| 1931 or earlier | From 25 January 2021 | BioNTech | Groningen |
| 1932 - 1936 | From 29 January 2021 | Pfizer | Arnhem |
| 1937 - 1941 | From 5 February 2021 | Pfizer | Breda |
| 1942 - 1946 | From 6 March 2021 | Moderna | Harlingen |
| 1947 - 1951 | From 6 April 2021 | Moderna | Edam |
| 1952 - 1955 | From 15 April 2021 | AstraZeneca | Amsterdam |
| 1956 - 1957 | From 15 February 2021 | AstraZeneca | Sittard |
| 1958 - 1960 | From 15 April 2021 | Moderna | Rotterdam |
| 1961 - 1971 | From 27 April 2021 | Janssen | Groningen |
| 1972 - 1981 | From early June 2021 | Janssen | Arnhem |
| 1982 - 1991 | From mid-June 2021 | Moderna | Breda |
| 1992 or later | From mid-June 2021 | BioNTech | Maastricht |
Using a Jupyter notebook, write a Python function that prints the vaccine registration date, vaccine and location for a user-supplied year of birth. Use a pandas DataFrame to store and retrieve the data.
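One possible way to structure the lookup is sketched below; it is a starting point under stated assumptions, not the expected solution. It loads only three rows of the table, and the column names `first_year` and `last_year`, the sentinel years `0` and `9999` for the open-ended ranges, and the function name `lookup_registration` are all hypothetical choices made for illustration.

```python
import pandas as pd

# Three rows copied from the table above; the other rows follow the same pattern.
# first_year / last_year are hypothetical helper columns. The open-ended ranges
# ("1931 or earlier", "1992 or later") are encoded with sentinel years (assumption).
schedule = pd.DataFrame(
    {
        "first_year": [0, 1932, 1992],
        "last_year": [1931, 1936, 9999],
        "registration": [
            "From 25 January 2021",
            "From 29 January 2021",
            "From mid-June 2021",
        ],
        "vaccine": ["BioNTech", "Pfizer", "BioNTech"],
        "location": ["Groningen", "Arnhem", "Maastricht"],
    }
)


def lookup_registration(year_of_birth, table=schedule):
    """Return the schedule entry (as a dict) whose year range contains year_of_birth."""
    # Boolean mask: True for rows where first_year <= year_of_birth <= last_year.
    mask = (table["first_year"] <= year_of_birth) & (year_of_birth <= table["last_year"])
    matches = table[mask]
    if matches.empty:
        return None  # year not covered by the rows loaded so far
    return matches.iloc[0][["registration", "vaccine", "location"]].to_dict()


print(lookup_registration(1934))
```

Returning a dict rather than printing inside the function keeps the lookup easy to check; a thin wrapper can then print the three values for the user.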
2. Titanic data investigation using Pandas.
In this exercise, you will use Python pandas to investigate the Titanic dataset to answer the following questions:
- Which class had a higher chance of surviving the disaster?
- Which gender had a higher chance of surviving the disaster?
- Group the data into age groups (young, adult, old). Which age group had the highest chance of surviving?
Note: Use https://www.kaggle.com/c/titanic/data to download and explore the dataset.
Only read in train.csv
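The questions above all have the same shape: split the passengers into groups and compare the fraction that survived in each. A minimal sketch of that pattern follows. It assumes the column names `Survived`, `Pclass`, `Sex` and `Age` used in the Kaggle file. The helper names, the age cut-offs (under 18, 18 to 59, 60 and over) and the four toy rows are made up for illustration and are not taken from the dataset.

```python
import pandas as pd


def survival_rate_by(df, column):
    """Mean of the 0/1 Survived column per group, i.e. the fraction that survived."""
    return df.groupby(column, observed=True)["Survived"].mean()


def add_age_group(df):
    """Return a copy with an AgeGroup column; the cut-offs are an arbitrary choice."""
    out = df.copy()
    out["AgeGroup"] = pd.cut(
        out["Age"],
        bins=[0, 18, 60, 200],       # [0, 18), [18, 60), [60, 200)
        labels=["young", "adult", "old"],
        right=False,
    )
    return out


# Toy rows for illustration only (NOT taken from the Titanic data).
toy = pd.DataFrame(
    {
        "Survived": [1, 0, 1, 0],
        "Pclass": [1, 3, 1, 3],
        "Sex": ["female", "male", "female", "male"],
        "Age": [10.0, 30.0, 40.0, 70.0],
    }
)

print(survival_rate_by(toy, "Pclass"))
print(survival_rate_by(add_age_group(toy), "AgeGroup"))
```

With the real data, `toy` would be replaced by the result of `pd.read_csv("train.csv")`. Rows with a missing `Age` get no age group and are left out of the age-group comparison.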
3. Pythoshop:
- Create a numpy array heart_img with the values shown below:
import numpy as np
import matplotlib.pyplot as plt

# Each inner list is one row of pixels; each number is one pixel's grey value.
heart_img = np.array([[255, 0, 0, 255, 0, 0, 255],
                      [0, 255/2, 255/2, 0, 255/2, 255/2, 0],
                      [0, 255/2, 255/2, 255/2, 255/2, 255/2, 0],
                      [0, 255/2, 255/2, 255/2, 255/2, 255/2, 0],
                      [255, 0, 255/2, 255/2, 255/2, 0, 255],
                      [255, 255, 0, 255/2, 0, 255, 255],
                      [255, 255, 255, 0, 255, 255, 255]])
- Visualize the array using the helper function:
def show_image(image, img_title):
    # Draw the 2D array as a greyscale image with a title.
    plt.imshow(image, cmap="gray")
    plt.title(img_title)
    plt.show()
Can you figure out how the NumPy array heart_img creates that image you see? Take some time to think.
- Subtract 255 from each value and observe what happens to the image when you plot it.
- Save the modified array as a new numpy array broken_heart.
- Display the modified array along with the original array.
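Before trying the steps above on heart_img, it can help to see how scalar arithmetic behaves on a tiny array. The sketch below uses a hypothetical 2x2 array `toy_img` rather than the heart. The helper `rescale` is also an assumption for illustration: it mimics the way `plt.imshow` maps the smallest value in the data to one end of the colour map and the largest to the other when no `vmin`/`vmax` is given.

```python
import numpy as np

# Toy 2x2 "image" (hypothetical values, not the heart from the exercise).
toy_img = np.array([[0.0, 127.5], [127.5, 255.0]])

# Subtracting a scalar is applied to every element (broadcasting).
# The result is a new array; toy_img itself is unchanged.
shifted = toy_img - 255


def rescale(image):
    """Min-max rescale to the range [0, 1]."""
    return (image - image.min()) / (image.max() - image.min())


print(shifted)
# Compare what a min-max based display would receive for each array.
print(np.array_equal(rescale(toy_img), rescale(shifted)))
```

Run it and compare the two printed results with what you see when you plot your own arrays with show_image.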
4. Pixel art & NFTs!
Create your own 10x10 greyscale pixel art using numpy arrays. For an additional challenge, try to create an animation.
An animation (GIF) is just a sequence of image frames shown in rapid succession.
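The "sequence of frames" idea can be sketched with numpy alone: keep every frame as a 2D array and stack them into one 3D array. The starting image (a white bar on a black background) and the names `first_frame` and `frames` are hypothetical choices for illustration.

```python
import numpy as np

# Hypothetical 10x10 starting frame: a white (255) vertical bar on black (0).
first_frame = np.zeros((10, 10))
first_frame[:, 0] = 255

# Each later frame is the first one shifted k columns to the right.
frames = np.stack([np.roll(first_frame, shift=k, axis=1) for k in range(10)])

print(frames.shape)  # (number of frames, rows, columns)
```

Each `frames[k]` is an ordinary 10x10 array, so it can be drawn like any other greyscale image. Writing the frames out as a GIF file needs an additional library, for example the `matplotlib.animation` module.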
If you are now curious as to how a computer represents colour images, please watch the following video
We will cover this topic in more detail on Monday! Have a nice weekend :)
Bonus assignment:
If you finish early, please watch the following video: