Datalab 04: Conclusion and Referencing

We've now learned everything we need to finish our first data science project. All that's left is writing the conclusion and referencing any sources you used, such as the webpage of the SDG indicators where you got your data. Once you're finished, you can use the remainder of your time to brush up your conference poster or revisit part of the analysis, such as the data visualisation or interpretation.

Learning Objectives

  1. write a conclusion;
  2. reference your sources;
  3. understand the CRISP-DM.

Please follow the links below to continue the class:

  1. Mock Assessment Recap: 0.5 hours
  2. Feedback on Poster: 0.5 hours
  3. Conclusion: 2 hours
  4. Referencing: 1 hour
  5. CRISP-DM: 1 hour; potentially more depending on your project

Feel free to ask for personal feedback during Datalab: from 9:00 to 16:00, in person or online.

Good luck!

0) Mock Assessment Recap

Now, find a partner and open both your pre-mock assessment poster and your post-mock assessment poster. Show them to your partner and explain the improvements you made, why you made them and how you made them. Then analyse and evaluate one another's work: specifically the choices, the execution and the story told by the data visualisations.

1) Feedback on Poster

Now look at the feedback you have received on your Learning Log by navigating to the week 3/4/5 tab in Microsoft Teams. There should be a feedback thread there. Process any feedback you have, then watch the video below and see if you can iterate on your poster further before continuing with the conclusion!

2) Write the Conclusion

Now, let's write the conclusion of your poster. You have gone through all the steps, so you know what to do next and why. The conclusion essentially communicates:

  • A summary of the problem and research question: the problem makes a case for the research question;
  • What the data implies based on your Exploratory Data Analysis;
  • How to proceed or what to consider.

3) Referencing

You already learned how to refer to sources using APA citation in Digital Transformation. If you need a refresher, look at Scientific Writing 2. There are some good APA referencing tools available: Google Scholar is one, and the Scribbr Citation Tool is another. Other tools are available as well. We're going to cite any sources used for our data science project in the conference poster.
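To make the shape of a web page reference concrete, here is a minimal Python sketch that assembles a simplified APA-style entry and the matching in-text citation. The `WebSource` class, the helper functions and the example organisation, title and URL are all hypothetical and only serve as an illustration. Plain text cannot show the italic page title that APA asks for, so always check the result against the APA guidelines or a citation tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WebSource:
    """A web page to cite (hypothetical helper class for illustration)."""
    author: str                       # person or organisation
    title: str                        # title of the page
    url: str
    year: Optional[int] = None        # None when the page shows no date
    site_name: Optional[str] = None   # omit when it equals the author


def _date(src: WebSource) -> str:
    # APA uses "n.d." (no date) when a page has no publication date.
    return str(src.year) if src.year is not None else "n.d."


def apa_reference(src: WebSource) -> str:
    """Build a simplified APA-style reference list entry for a web page."""
    author = src.author if src.author.endswith(".") else src.author + "."
    parts = [f"{author} ({_date(src)}).", f"{src.title}."]
    # The site name is only listed when it differs from the author.
    if src.site_name and src.site_name != src.author:
        parts.append(f"{src.site_name}.")
    parts.append(src.url)
    return " ".join(parts)


def apa_in_text(src: WebSource) -> str:
    """Build the parenthetical in-text citation for the same source."""
    return f"({src.author}, {_date(src)})"


# Hypothetical example source; replace with the page you actually used.
example = WebSource(
    author="Example Organisation",
    title="Example indicator page",
    url="https://example.org/indicators",
)
print(apa_reference(example))
print(apa_in_text(example))
```

The in-text citation goes next to the claim or figure on the poster, and the full entry goes in the reference list.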

4) Iterating using the CRISP-DM

The CRoss Industry Standard Process for Data Mining (CRISP-DM) is a process model with six phases that naturally describes the data science life cycle. It's like a set of guardrails to help you plan, organize, and implement your data science (or machine learning) project.

  1. Business understanding – What does the business need?
  2. Data understanding – What data do we have/need? Is it clean?
  3. Data preparation – How do we organise the data for modelling?
  4. Modelling – What modelling techniques should we apply?
  5. Evaluation – Which model best meets the business objectives?
  6. Deployment – How do stakeholders access the results?


In our project, we only go as far as step 4, asking ourselves: What modelling technique should we apply? We don't actually model our data, evaluate a model or deploy a data science solution. We'll do these steps in other projects throughout the year.
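As a small illustration, the six phases and the scope of this project can be written down in Python. This is only a sketch: the `Phase` enum, the `project_status` function and the status labels are names made up for this example and are not part of CRISP-DM itself.

```python
from enum import IntEnum


class Phase(IntEnum):
    """The six CRISP-DM phases, numbered as in the list above."""
    BUSINESS_UNDERSTANDING = 1
    DATA_UNDERSTANDING = 2
    DATA_PREPARATION = 3
    MODELLING = 4
    EVALUATION = 5
    DEPLOYMENT = 6


# In this project we stop at phase 4 and only ask which technique fits.
LAST_PHASE_IN_PROJECT = Phase.MODELLING


def project_status(phase: Phase) -> str:
    """Return how far this project goes for the given phase (hypothetical labels)."""
    if phase < LAST_PHASE_IN_PROJECT:
        return "carried out"
    if phase == LAST_PHASE_IN_PROJECT:
        return "considered only"
    return "later projects"


for phase in Phase:
    print(f"{phase.value}. {phase.name}: {project_status(phase)}")
```

Walking through the phases like this can help you check that your poster addresses each of the first three phases before it names a possible modelling technique.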

Watch the following video:

CRISP-DM: The data science cycle from Data Science for Java Developers by Shaun Wassell

Read through your analysis script, run it again and re-evaluate the results. Look at your conference poster and visualisations to see if there's anything that needs to be updated.

Done? Maybe you can help your peers by proofreading their work? They may return the favour, and you both get a better learning experience!

5) In-Class discussion

At 16:00, we'll all get together in Datalab to discuss our progress and reflect on today's activities.

Tomorrow, we will introduce you to the differences between explanatory and predictive modelling!

Questions or issues?

If you have any questions, please ask your peers first or send us a message on Teams!

Resources