← Back to Lessons
INTERMEDIATE πŸ“Š DATABASE DESIGN ⏱ 50 minutes

Designing a Normalized Schema

You've learned the theoryβ€”relational databases, anomalies, and normal forms. Now it's time to put it all together! In this lesson, you'll learn the practical process of designing a database schema from scratch. We'll walk through a complete real-world example, step by step, from gathering requirements to creating a fully normalized database structure. By the end, you'll be able to design professional database schemas on your own!


πŸ“š What You'll Learn

1. The 6-Step Database Design Process

The Process:

  1. Gather Requirements - Understand what the system needs to do
  2. Identify Entities - Find the "things" (nouns) in the system
  3. Identify Attributes - Determine what information each entity needs
  4. Identify Relationships - How entities connect to each other
  5. Choose Keys - Select primary keys and foreign keys
  6. Normalize - Apply 1NF, 2NF, 3NF to eliminate anomalies

Let's work through a complete example: A Library Management System

2. Step 1: Gather Requirements

Library System Requirements:

  • The library has many books
  • Each book has a title, ISBN, publication year, and publisher
  • Books are written by one or more authors
  • Authors have names and countries
  • Members can borrow books
  • Members have names, email addresses, and phone numbers
  • Each member can borrow multiple books at once
  • We need to track checkout date and return date for each loan
  • A book can only be checked out by one member at a time
  • We need to see borrowing history (who borrowed what and when)

πŸ’‘ Pro Tip: Write down requirements in plain English first. Look for nouns (entities), adjectives (attributes), and verbs (relationships). Don't think about tables yetβ€”just understand the business domain!

3. Step 2: Identify Entities

Entities are the "things" your system manages. They become tables in your database.

Finding Entities: Look for Nouns

Read through requirements and highlight important nouns:

The library has many books. Books are written by one or more authors. Members can borrow books. We need to track each loan.

Our Entities:

πŸ“š books

✍️ authors

πŸ‘€ members

πŸ“‹ loans

πŸ’‘ Entity Test: Can you say "I have many _____"? If yes, it's probably an entity. Can you say "I have many books"? Yes! "I have many loans"? Yes! That's your clue.

4. Step 3: Identify Attributes

Attributes are the pieces of information (properties) that describe each entity. They become columns.

books Table Attributes:

Attribute Data Type Notes
book_id INT Primary key, auto-increment
title VARCHAR(255) Book title
isbn VARCHAR(20) Unique identifier
publication_year INT Year published
publisher VARCHAR(255) Publisher name

authors Table Attributes:

Attribute Data Type Notes
author_id INT Primary key, auto-increment
name VARCHAR(255) Author's full name
country VARCHAR(100) Author's country

members Table Attributes:

Attribute Data Type Notes
member_id INT Primary key, auto-increment
name VARCHAR(255) Member's full name
email VARCHAR(255) Contact email
phone VARCHAR(20) Contact phone
join_date DATE When they joined

loans Table Attributes:

Attribute Data Type Notes
loan_id INT Primary key, auto-increment
checkout_date DATE When book was borrowed
due_date DATE When book should be returned
return_date DATE When book was actually returned (NULL if still out)

5. Step 4: Identify Relationships

Now we connect entities with relationships. Ask: "How does Entity A relate to Entity B?"

Relationship 1: Books ↔ Authors

Question: Can one book have multiple authors? YES (many textbooks do)
Question: Can one author write multiple books? YES (most authors do)

Relationship Type: MANY-TO-MANY

Solution: Create a junction table called book_authors

Relationship 2: Members ↔ Loans ↔ Books

Question: Can one member have multiple loans? YES (borrow multiple books)
Question: Does each loan belong to one member? YES

Relationship: ONE member β†’ MANY loans

Question: Can one book be in multiple loans (over time)? YES (loan history)
Question: Does each loan involve one book? YES

Relationship: ONE book β†’ MANY loans

members (1) ←→ (many) loans (many) ←→ (1) books

6. Step 5: Choose Primary and Foreign Keys

Primary Keys (Already Decided):

Foreign Keys (Connect Tables):

book_authors Table (Junction for Many-to-Many)

πŸ”— book_id (FK) πŸ”— author_id (FK)
1 101
1 102
2 101

Composite primary key: (book_id, author_id)

loans Table (Updated with Foreign Keys)

πŸ”‘ loan_id πŸ”— member_id (FK) πŸ”— book_id (FK) checkout_date due_date return_date
1001 5001 1 2024-01-15 2024-01-29 NULL

7. Step 6: Apply Normalization

Let's verify our design is in 3NF:

βœ… 1NF Check: Atomic Values

βœ… 2NF Check: No Partial Dependencies

βœ… 3NF Check: No Transitive Dependencies

πŸŽ‰ Our design is in 3NF! No anomalies, no redundancy, clean structure!

8. The Complete Schema

Visual Schema Diagram:

πŸ“š books
πŸ”‘ book_id
title
isbn
publication_year
publisher
↔
πŸ”— book_authors
πŸ”— book_id (FK)
πŸ”— author_id (FK)
↔
✍️ authors
πŸ”‘ author_id
name
country
πŸ‘€ members
πŸ”‘ member_id
name
email
phone
join_date
β†’
πŸ“‹ loans
πŸ”‘ loan_id
πŸ”— member_id (FK)
πŸ”— book_id (FK)
checkout_date
due_date
return_date
←
πŸ“š books
(connected via FK)

9. SQL Implementation

Here's the actual SQL to create this schema:

-- Books table
CREATE TABLE books (
    book_id INT PRIMARY KEY AUTO_INCREMENT,
    title VARCHAR(255) NOT NULL,
    isbn VARCHAR(20) UNIQUE,
    publication_year INT,
    publisher VARCHAR(255)
);

-- Authors table
CREATE TABLE authors (
    author_id INT PRIMARY KEY AUTO_INCREMENT,
    name VARCHAR(255) NOT NULL,
    country VARCHAR(100)
);

-- Junction table for many-to-many relationship
CREATE TABLE book_authors (
    book_id INT,
    author_id INT,
    PRIMARY KEY (book_id, author_id),
    FOREIGN KEY (book_id) REFERENCES books(book_id),
    FOREIGN KEY (author_id) REFERENCES authors(author_id)
);

-- Members table
CREATE TABLE members (
    member_id INT PRIMARY KEY AUTO_INCREMENT,
    name VARCHAR(255) NOT NULL,
    email VARCHAR(255) UNIQUE NOT NULL,
    phone VARCHAR(20),
    join_date DATE DEFAULT CURRENT_DATE
);

-- Loans table
CREATE TABLE loans (
    loan_id INT PRIMARY KEY AUTO_INCREMENT,
    member_id INT NOT NULL,
    book_id INT NOT NULL,
    checkout_date DATE NOT NULL,
    due_date DATE NOT NULL,
    return_date DATE,
    FOREIGN KEY (member_id) REFERENCES members(member_id),
    FOREIGN KEY (book_id) REFERENCES books(book_id)
);

10. Validating Your Design

Test your schema against requirements:

Requirement How Schema Supports It
Track books with multiple authors βœ… book_authors junction table
Members borrow multiple books βœ… One member β†’ many loans
Track checkout and return dates βœ… loans table has all date fields
See borrowing history βœ… loans table keeps all historical records
One book, one borrower at a time βœ… Application logic: check return_date is NULL

11. Common Design Patterns

Pattern 1: Many-to-Many β†’ Junction Table

Whenever you have a many-to-many relationship, create a junction table with foreign keys to both sides.

Pattern 2: One-to-Many β†’ Foreign Key

Put the foreign key on the "many" side. (Example: loans has member_id FK)

Pattern 3: Lookup Tables

If you have repeated values (like "category"), create a separate categories table and reference it with FK.

Pattern 4: Timestamp Tracking

Add created_at and updated_at columns to most tables for auditing.

12. 🎯 Hands-On Practice Exercises

Now it's your turn! Apply what you've learned to design database schemas for real-world scenarios.

Exercise 1: Student Course Registration System

Requirements:

  • Students can enroll in multiple courses
  • Each course can have multiple students
  • Track enrollment date and grade for each student-course pair
  • Students have: student_id, name, email, major, enrollment_year
  • Courses have: course_id, course_code, title, credits, department

Your Task: Design the schema. How many tables do you need? What are the primary and foreign keys?

Show Solution

Tables Needed: 3

1. students
CREATE TABLE students (
    student_id INT PRIMARY KEY AUTO_INCREMENT,
    name VARCHAR(255) NOT NULL,
    email VARCHAR(255) UNIQUE NOT NULL,
    major VARCHAR(100),
    enrollment_year INT
);
2. courses
CREATE TABLE courses (
    course_id INT PRIMARY KEY AUTO_INCREMENT,
    course_code VARCHAR(20) UNIQUE NOT NULL,
    title VARCHAR(255) NOT NULL,
    credits INT NOT NULL,
    department VARCHAR(100)
);
3. enrollments (Junction Table)
CREATE TABLE enrollments (
    enrollment_id INT PRIMARY KEY AUTO_INCREMENT,
    student_id INT NOT NULL,
    course_id INT NOT NULL,
    enrollment_date DATE NOT NULL,
    grade VARCHAR(2),
    FOREIGN KEY (student_id) REFERENCES students(student_id),
    FOREIGN KEY (course_id) REFERENCES courses(course_id),
    UNIQUE(student_id, course_id)  -- Prevent duplicate enrollments
);

Key Points:

  • βœ… Many-to-many relationship requires junction table
  • βœ… enrollments stores the relationship-specific data (date, grade)
  • βœ… UNIQUE constraint prevents a student from enrolling in the same course twice
  • βœ… Foreign keys enforce referential integrity
  • βœ… Design is in 3NF (no redundancy, no anomalies)

Exercise 2: E-Commerce Order System

Requirements:

  • Customers place orders
  • Each order contains multiple products
  • Track quantity and price-at-time-of-purchase for each product in an order
  • Customers have: customer_id, name, email, address
  • Products have: product_id, name, description, current_price, stock_quantity
  • Orders have: order_id, order_date, total_amount, status

Your Task: Design the complete schema. Pay attention to how you handle price (products have current_price, but orders need to remember the price at purchase time).

Show Solution

Tables Needed: 4

1. customers
CREATE TABLE customers (
    customer_id INT PRIMARY KEY AUTO_INCREMENT,
    name VARCHAR(255) NOT NULL,
    email VARCHAR(255) UNIQUE NOT NULL,
    address TEXT
);
2. products
CREATE TABLE products (
    product_id INT PRIMARY KEY AUTO_INCREMENT,
    name VARCHAR(255) NOT NULL,
    description TEXT,
    current_price DECIMAL(10, 2) NOT NULL,
    stock_quantity INT NOT NULL DEFAULT 0
);
3. orders
CREATE TABLE orders (
    order_id INT PRIMARY KEY AUTO_INCREMENT,
    customer_id INT NOT NULL,
    order_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    total_amount DECIMAL(10, 2) NOT NULL,
    status VARCHAR(50) NOT NULL DEFAULT 'pending',
    FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);
4. order_items (Junction Table)
CREATE TABLE order_items (
    order_item_id INT PRIMARY KEY AUTO_INCREMENT,
    order_id INT NOT NULL,
    product_id INT NOT NULL,
    quantity INT NOT NULL,
    price_at_purchase DECIMAL(10, 2) NOT NULL,  -- Critical!
    FOREIGN KEY (order_id) REFERENCES orders(order_id),
    FOREIGN KEY (product_id) REFERENCES products(product_id)
);

Key Design Decisions:

  • βœ… price_at_purchase: Stores historical price (critical for order integrity!)
  • βœ… Products have current_price (can change), but orders remember purchase price
  • βœ… One-to-many: customer β†’ orders
  • βœ… Many-to-many: orders ↔ products (through order_items)
  • βœ… order_items is a junction table with additional data (quantity, price_at_purchase)
  • βœ… No redundancy: customer info stored once, product info stored once

⚠️ Common Mistake: DON'T reference products.current_price in reports! The price may have changed since the order was placed. Always use order_items.price_at_purchase for historical accuracy.

Exercise 3: Hospital Patient Management

Requirements:

  • Patients have appointments with doctors
  • Each appointment is on a specific date/time and has notes
  • Patients can have multiple appointments (with same or different doctors)
  • Doctors can see multiple patients
  • Track appointment status (scheduled, completed, cancelled)
  • Patients have: patient_id, name, date_of_birth, phone, address
  • Doctors have: doctor_id, name, specialty, license_number

Your Task: Design the schema. What type of relationship exists between patients and doctors?

Show Solution

Tables Needed: 3

1. patients
CREATE TABLE patients (
    patient_id INT PRIMARY KEY AUTO_INCREMENT,
    name VARCHAR(255) NOT NULL,
    date_of_birth DATE NOT NULL,
    phone VARCHAR(20),
    address TEXT
);
2. doctors
CREATE TABLE doctors (
    doctor_id INT PRIMARY KEY AUTO_INCREMENT,
    name VARCHAR(255) NOT NULL,
    specialty VARCHAR(100),
    license_number VARCHAR(50) UNIQUE NOT NULL
);
3. appointments (Junction Table)
CREATE TABLE appointments (
    appointment_id INT PRIMARY KEY AUTO_INCREMENT,
    patient_id INT NOT NULL,
    doctor_id INT NOT NULL,
    appointment_datetime DATETIME NOT NULL,
    status VARCHAR(20) NOT NULL DEFAULT 'scheduled',
    notes TEXT,
    FOREIGN KEY (patient_id) REFERENCES patients(patient_id),
    FOREIGN KEY (doctor_id) REFERENCES doctors(doctor_id)
);

Analysis:

  • βœ… Many-to-many relationship: patients ↔ doctors
  • βœ… appointments is the junction table connecting them
  • βœ… Junction table stores relationship-specific data (datetime, status, notes)
  • βœ… No UNIQUE constraint on (patient_id, doctor_id) because same patient can have multiple appointments with same doctor
  • βœ… Design supports appointment history

πŸ’‘ Real-World Extension: In production, you might add a departments table (doctors belong to departments) and a medications table (appointments prescribe medications). This shows how schemas grow organically!

πŸ”₯ Challenge Exercise: Social Media Platform

Requirements:

  • Users can create posts
  • Users can follow other users
  • Users can like posts
  • Users can comment on posts
  • Comments can have replies (comments on comments)
  • Track timestamps for all actions

Your Task: Design the complete schema. This is complexβ€”think carefully about relationships!

Show Solution

Tables Needed: 6

1. users
CREATE TABLE users (
    user_id INT PRIMARY KEY AUTO_INCREMENT,
    username VARCHAR(50) UNIQUE NOT NULL,
    email VARCHAR(255) UNIQUE NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
2. posts
CREATE TABLE posts (
    post_id INT PRIMARY KEY AUTO_INCREMENT,
    user_id INT NOT NULL,
    content TEXT NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (user_id) REFERENCES users(user_id)
);
3. follows (Self-referencing Many-to-Many)
CREATE TABLE follows (
    follower_id INT NOT NULL,
    following_id INT NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (follower_id, following_id),
    FOREIGN KEY (follower_id) REFERENCES users(user_id),
    FOREIGN KEY (following_id) REFERENCES users(user_id),
    CHECK (follower_id != following_id)  -- Can't follow yourself
);
4. likes
CREATE TABLE likes (
    user_id INT NOT NULL,
    post_id INT NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (user_id, post_id),
    FOREIGN KEY (user_id) REFERENCES users(user_id),
    FOREIGN KEY (post_id) REFERENCES posts(post_id)
);
5. comments
CREATE TABLE comments (
    comment_id INT PRIMARY KEY AUTO_INCREMENT,
    user_id INT NOT NULL,
    post_id INT NOT NULL,
    parent_comment_id INT,  -- NULL = top-level, otherwise reply
    content TEXT NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (user_id) REFERENCES users(user_id),
    FOREIGN KEY (post_id) REFERENCES posts(post_id),
    FOREIGN KEY (parent_comment_id) REFERENCES comments(comment_id)
);

Advanced Concepts in This Design:

  • βœ… Self-referencing table: follows references users twice (follower & following)
  • βœ… Self-referencing foreign key: comments.parent_comment_id references comments.comment_id
  • βœ… CHECK constraint: Prevents users from following themselves
  • βœ… Composite primary keys: follows and likes use two columns as PK
  • βœ… Nullable FK: parent_comment_id can be NULL (top-level comments)
  • βœ… All relationships properly normalized (3NF)

πŸŽ“ If you understood this schema, you're ready for professional database design! This pattern appears in many real-world applications: social media, forums, organizational hierarchies, file systems, and more.

πŸ“ Key Takeaways

πŸ’‘ Final Tip: Start simple and add complexity only when needed. A well-normalized 3NF database that's easy to understand beats an over-engineered complex schema every time!

πŸš€ What's Next?

You now understand how to design normalized database schemas from scratch. But how do you enforce data integrity? How do you ensure that invalid data never enters your database in the first place? That's where constraints come in.

Up next: Database Constraints (Primary Keys, Foreign Keys, NOT NULL, UNIQUE, and CHECK)

← Previous: Normal Forms
Practice Now β†’