Figure 1.
The example script I provided in Data Science 1 is also a good example of how to document your code, although that one was written in R.
- Open the Python file (MachineLearning_OosterhoutModels_…) you use for the final delivery of your model.
- Load in the youthcare dataset you created in Business Intelligence if you haven't done so already. Load in any other data you might need. Then save your file to your GitHub repository.
- Open your research design and use in-line comments to formulate a classification analysis using decision trees based on your research question (or, if your research question cannot be answered with this type of analysis, an analysis related to it). Start by listing the variables that you think could predict the outcome variable you're interested in, and motivate why you think they might predict it.
- Create your fully fitted model (the model containing all variables you wrote down in the previous step) below the Python code you just wrote.
- Test, re-fit and validate your model. Create a new model on a new line for every re-fit. Keep track of any predictor variables you exclude from the full model when re-fitting. Use in-line comments to motivate why you are excluding or including variables.
- Continue until 16:00, or stop when you feel you can no longer improve the model. Then save your file to your GitHub repository.
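
If you are unsure how the fit, re-fit and validate steps above could look in code, here is a minimal sketch using scikit-learn. It is only an illustration, not the required solution: the data is synthetic (generated with `make_classification`), the predictor names are hypothetical placeholders, and dropping the least important predictor is just one possible reason to re-fit. In your own file you would use the youthcare dataset and the variables you motivated in your comments.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

# ASSUMPTION: synthetic stand-in data; replace with your own dataset.
# The predictor names are hypothetical placeholders.
feature_names = ["predictor_a", "predictor_b", "predictor_c", "predictor_d"]
X, y = make_classification(
    n_samples=500, n_features=4, n_informative=2, n_redundant=0, random_state=42
)

# Hold out a test set that is only used for the final validation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Model 1: full model containing all listed predictors.
full_model = DecisionTreeClassifier(max_depth=4, random_state=42)
full_model.fit(X_train, y_train)
full_cv = cross_val_score(full_model, X_train, y_train, cv=5).mean()

# Inspect how much each predictor contributes to the full model.
importances = dict(zip(feature_names, full_model.feature_importances_))

# Model 2: re-fit without the least important predictor.
# Motivation (illustrative): it contributes least to the splits in model 1.
dropped = min(importances, key=importances.get)
keep = [i for i, name in enumerate(feature_names) if name != dropped]
reduced_model = DecisionTreeClassifier(max_depth=4, random_state=42)
reduced_model.fit(X_train[:, keep], y_train)
reduced_cv = cross_val_score(
    reduced_model, X_train[:, keep], y_train, cv=5
).mean()

# Validate the model you decide to keep on the held-out test set.
test_accuracy = reduced_model.score(X_test[:, keep], y_test)

print(f"Full model, mean CV accuracy:    {full_cv:.3f}")
print(f"Reduced model, mean CV accuracy: {reduced_cv:.3f} (dropped {dropped})")
print(f"Reduced model, test accuracy:    {test_accuracy:.3f}")
```

Comparing the cross-validated accuracy of each re-fit on the training data, and touching the test set only once at the end, keeps the final validation honest. Note how each model gets its own line and a comment explaining what was excluded and why.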
3) Random Forests
When you have completed your analyses on the Oosterhout data, please open the Basics of Machine Learning course on Codecademy and complete the module Random Forests.
4) Day-Reflection
At 16:00, there's a meeting you're encouraged to join to ask questions, discuss our progress and reflect on today's activities.
Resources
2023, Applied Data Science and Artificial Intelligence @ Breda University of Applied Sciences