Final Assignment: Machine Learning Analysis
Machine learning in this course is assessed through your machine learning analysis of the Oosterhout dataset. During the machine learning modules we will cover several analytical techniques, and you are tasked with applying the most suitable technique to answer your research question.
Pre-requisites
By the time we start exploring machine learning you should have:
- created a research question based on the problem statement;
- a preprocessed version of the Oosterhout dataset in Power BI;
- a .csv file extracted from the Oosterhout dataset relevant to your research question.
And finally, by the time we finish, you should have completed modules 1 to 10 of the Codecademy course "Basics of Machine Learning".
Modeling
- For every machine learning algorithm we cover, we will build a suitable model on the Oosterhout dataset.
- The structure of the model and its outcome and predictor variables will be informed by your problem statement and/or research question.
- You will then fit your full model, that is, the model containing all the predictor variables you selected.
- Subsequently, test, re-fit and validate your model. Create a new model on a new line for every re-fit. Keep track of any predictor variables you exclude from the full model when re-fitting. Use in-line comments to motivate why you are excluding or including variables.
- Choose the best model with regard to your research problem and research question, motivate your answer, and submit your evidence (the filled-in template script below) to GitHub. Refer to that script in your learning and work log.
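The workflow above (fit a full model, re-fit with fewer predictors, motivate each change in a comment, compare on held-out data) could look roughly like the sketch below. It is an illustration only: it assumes scikit-learn and NumPy are installed, it uses linear regression purely as an example algorithm, and it runs on synthetic stand-in data. The predictor names `pred_a`, `pred_b` and `pred_c` and the helper `fit_and_score` are hypothetical and are not part of the Oosterhout dataset or the course template.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# ASSUMPTION: synthetic stand-in data. Replace this with the .csv you
# exported for your own research question.
rng = np.random.default_rng(seed=0)
n_rows = 200
X = rng.normal(size=(n_rows, 3))  # columns: pred_a, pred_b, pred_c (hypothetical)
# By construction the outcome depends on pred_a and pred_b only; pred_c is noise.
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.5, size=n_rows)

# Hold out a test set so every model is validated on data it has not seen.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)


def fit_and_score(columns):
    """Fit a linear regression on the given column indices; return (model, test R^2)."""
    model = LinearRegression().fit(X_train[:, columns], y_train)
    r2 = model.score(X_test[:, columns], y_test)
    return model, r2


# Model 1: full model with all three predictors.
model_full, r2_full = fit_and_score([0, 1, 2])

# Model 2: re-fit without pred_c.
# Motivation: its coefficient in the full model is close to zero.
model_reduced, r2_reduced = fit_and_score([0, 1])

print("full model coefficients:", model_full.coef_)
print("test R^2, full model:   ", round(r2_full, 3))
print("test R^2, reduced model:", round(r2_reduced, 3))
```

Each re-fit gets its own line and its own comment, so the notebook records which predictors were dropped and why. Which score decides the "best" model depends on your research question and the algorithm you use; R^2 is only one option.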
Deliverable
Download the Jupyter Notebook template for delivering your models here, if you haven't done so already. To download the script, open the raw file, right-click and choose 'save as' to save it to a location of your choice. Make sure to rename the file appropriately.
The Jupyter notebooks or .py scripts are to be uploaded to GitHub no later than 5pm on the last data lab day.
Assessment Criteria
See the self-assessment in Microsoft Teams for the assessment criteria.