For the second half of the day, you will dive into the foundational concepts of database design, and Structured Query Language (SQL). You will start with the Design Databases With PostgreSQL course in Codeacademy, and work your way through some additional SQL exercises, which focus on Data Definition Language (DDL) and Data Manipulation Language (DML) specific clauses.
Figure 1. Types of SQL.
1 Complete the What is a Database?
module in Codeacademy, which you can find, here.
A database is an organized collection of structured information, or data, typically stored in a computer system, such as your laptop. It is usually controlled by a database management system, also known as DBMS. Together, the data and the DBMS, along with the applications that are associated with them (e.g. PostgreSQL client), are referred to as a database system, often shortened to database. In this section, you will explore the different types of databases, and their corresponding strengths and weaknesses.
2a a) What is a SQL or relational database, and b) how does it differ from a so-called NoSQL database? Explain your answer
2b List three advantages/disadvantages of using a relational database. Write your answer down.
2c The start-up HomeRobot is looking for a lightweight relational database management system (RDBMS) that they can embed in their Internet-of-Things (IoT) application. They have asked you, as a seasoned data engineer/analyst, for advice; which RDBMS would you recommend to them? Explain your answer.
For this section of the Project Brief, you are going to install the PostgreSQL database server, and a so-called database client (e.g. PgAdmin).
3a Watch the video Introduction to PostgreSQL by Adam Wilbert on LinkedIn Learning, and read the article What is PostgreSQL?.
Video 1. Introduction to PostgreSQL by Adam Wilbert.
3b Install the PostgreSQL database server:
3c Install a PostgreSQL compatible client:
3d Connect PostgreSQL client to server. For PgAdmin, see online tutorial.
4a What happens if you try to create a table with an existing name? Write your answer down.
4b Identify, describe, and subsequently correct the error(s) in the following SQL statements.
Example 1a:
CREATE TABLE youth (
ClientKey INTEGER PRIMARY KEY,
ToewijzingKey INTEGER,
ClientVoornamen TEXT,
NotaRegelBedrag NUMERIC,
GeboorteJaar INTEGER NOT NULL
);
INSERT INTO youth
VALUES
(
1, 'Jan', 24, 4657, 2014
),
(
1, 'Jan', 51, 45, 2014
),
(
2, 'Anne', 205, 6778, 2008
);
Example 1b:
CREATE TABLE youth (
ClientKey INTEGER PRIMARY KEY,
ToewijzingKey INTEGER,
ClientVoornamen TEXT,
NotaRegelBedrag NUMERIC,
GeboorteJaar INTEGER NOT NULL
);
INSERT INTO youth VALUES (1, 'Jan', 24, 4657, 2014);
INSERT INTO youth VALUES (1, 'Jan', 51, 45, 2014);
INSERT INTO youth VALUES (2, 'Anne', 205, 6778, 2008);
Example 2:
INSERT INTO youth VALUES (11, Bert, 106, 2234.50, 2005)
Example 3:
DELETE ClientKey FROM youth
WHERE youth = 11;
4c What are constraints? List three of them, and explain why it would be beneficial to apply such as constraint?
4d After placing a UNIQUE constraint on one of the variables (named ‘unique_constraint') in the table above, the data analyst of the municipality of Oosterhout, receives the following error message in his PostgreSQL client: ERROR: could not create unique index "unique_constraint". Can you explain the error? What could have possibly gone wrong? Multiple answers possible.
Do you have time left, and have the desire to become a data wrangling SQL wizard, explore the following additional resources:
Interactive exercises:
Instructional videos:
Books/articles/webpages etc.:
Batra, R. (2018). SQL primer: An accelerated introduction to SQL basics. (Freely available through BUAS' MetaSearch).