Datalab 03: Discussion
This Datalab day focuses on describing and understanding the data on a deeper level: interpreting the data distribution and the summary statistics in the context of our research question. We will then propose an analysis to answer that research question.
Learning objectives:
- understand and interpret real-world descriptive statistics;
- judge which analysis is best suited to address a research challenge.
Table of contents:
- Mock Assessment Recap: 1 hour
- Project: 7 hours (you will probably need much less; the total allocated time for the day is 8 hours)
Good luck!
0) Mock Assessment recap
Now, let's all find a partner, get our mock assessments out and discuss them. Specifically, analyse and evaluate one another's answers to the questions. If you both have the same answers, you're done. If not, try to work out who is right by explaining your reasoning to one another. If neither of you has an answer to a particular question, ask your instructor(s) in Datalab.
1) Feedback on poster
Now look at the feedback you have received on your poster by navigating to your GitHub repository and clicking on the 'Pull requests' header. There should be a feedback thread there. Process the feedback and then continue as described below.
2) Considerations in interpreting the data
Now, it's time to put all the knowledge we gained into practice by writing the discussion section of our Conference Poster!
Open your conference poster and inspect the Discussion section. Next, open RStudio and open your script.
We start by considering the distribution of the data, the summary statistics and the data visualisations. What do they tell us about the data? How do they relate to the research question?
- Give an unbiased report of the possible considerations (e.g. missing data, distribution, collection concerns, domain problems, etc.) in interpreting the findings of the EDA.
- Check if the provided considerations need to be supplemented with domain knowledge.
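The checks above can be sketched in R as follows. This is a minimal illustration, not a required workflow: it assumes the built-in `airquality` dataset as a stand-in for your own project data, and the variable names (`df`, `missing_per_column`, `ozone`) are made up for the example. Swap in your own data frame and columns.

```r
# Illustrative only: `airquality` ships with R and stands in for your project data.
df <- airquality

# Missing data: count the NA values in each column.
missing_per_column <- colSums(is.na(df))
print(missing_per_column)

# Summary statistics for one variable, ignoring missing values.
ozone <- df$Ozone
ozone_mean <- mean(ozone, na.rm = TRUE)
ozone_median <- median(ozone, na.rm = TRUE)
print(summary(ozone))

# Distribution: a mean clearly above the median hints at a right-skewed variable.
cat("Mean:", ozone_mean, " Median:", ozone_median, "\n")
hist(ozone, main = "Distribution of Ozone", xlab = "Ozone")
```

Columns with many missing values, or a strongly skewed distribution, are the kind of consideration to report in the Discussion section, because they affect how far the summary statistics can be trusted.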
3) Recommendations for future analyses & data-driven decisions
Now that we have interpreted the data distribution and summary statistics, it's time to consider how we could use our data to answer our research question.
- Which analysis do you propose to run on the data? Form an unbiased judgement based on the considerations above. You can suggest multiple analyses, as long as you can support each recommendation.
4) In-Class discussion
At 16:00, we'll all get together in Datalab to discuss our progress and reflect on today's activities.
Next week, we will start diving into reporting and data visualisation practices.
Questions or issues?
If you have any questions, please ask your peers first, or send us a message on Teams!