Designing a Normalized Schema
You've learned the theoryβrelational databases, anomalies, and normal forms. Now it's time to put it all together! In this lesson, you'll learn the practical process of designing a database schema from scratch. We'll walk through a complete real-world example, step by step, from gathering requirements to creating a fully normalized database structure. By the end, you'll be able to design professional database schemas on your own!
π What You'll Learn
- The 6-step process for designing a normalized database
- How to identify entities from business requirements
- How to identify attributes (columns) for each entity
- How to determine relationships between entities
- How to choose primary and foreign keys
- How to apply normalization rules systematically
- How to handle many-to-many relationships
- Common design patterns and best practices
- How to validate your design against requirements
- Complete worked example: Library Management System
1. The 6-Step Database Design Process
The Process:
- Gather Requirements - Understand what the system needs to do
- Identify Entities - Find the "things" (nouns) in the system
- Identify Attributes - Determine what information each entity needs
- Identify Relationships - How entities connect to each other
- Choose Keys - Select primary keys and foreign keys
- Normalize - Apply 1NF, 2NF, 3NF to eliminate anomalies
Let's work through a complete example: A Library Management System
2. Step 1: Gather Requirements
Library System Requirements:
- The library has many books
- Each book has a title, ISBN, publication year, and publisher
- Books are written by one or more authors
- Authors have names and countries
- Members can borrow books
- Members have names, email addresses, and phone numbers
- Each member can borrow multiple books at once
- We need to track checkout date and return date for each loan
- A book can only be checked out by one member at a time
- We need to see borrowing history (who borrowed what and when)
π‘ Pro Tip: Write down requirements in plain English first. Look for nouns (entities), adjectives (attributes), and verbs (relationships). Don't think about tables yetβjust understand the business domain!
3. Step 2: Identify Entities
Entities are the "things" your system manages. They become tables in your database.
Finding Entities: Look for Nouns
Read through requirements and highlight important nouns:
The library has many books. Books are written by one or more authors. Members can borrow books. We need to track each loan.
Our Entities:
π books
βοΈ authors
π€ members
π loans
π‘ Entity Test: Can you say "I have many _____"? If yes, it's probably an entity. Can you say "I have many books"? Yes! "I have many loans"? Yes! That's your clue.
4. Step 3: Identify Attributes
Attributes are the pieces of information (properties) that describe each entity. They become columns.
books Table Attributes:
| Attribute | Data Type | Notes |
|---|---|---|
| book_id | INT | Primary key, auto-increment |
| title | VARCHAR(255) | Book title |
| isbn | VARCHAR(20) | Unique identifier |
| publication_year | INT | Year published |
| publisher | VARCHAR(255) | Publisher name |
authors Table Attributes:
| Attribute | Data Type | Notes |
|---|---|---|
| author_id | INT | Primary key, auto-increment |
| name | VARCHAR(255) | Author's full name |
| country | VARCHAR(100) | Author's country |
members Table Attributes:
| Attribute | Data Type | Notes |
|---|---|---|
| member_id | INT | Primary key, auto-increment |
| name | VARCHAR(255) | Member's full name |
| VARCHAR(255) | Contact email | |
| phone | VARCHAR(20) | Contact phone |
| join_date | DATE | When they joined |
loans Table Attributes:
| Attribute | Data Type | Notes |
|---|---|---|
| loan_id | INT | Primary key, auto-increment |
| checkout_date | DATE | When book was borrowed |
| due_date | DATE | When book should be returned |
| return_date | DATE | When book was actually returned (NULL if still out) |
5. Step 4: Identify Relationships
Now we connect entities with relationships. Ask: "How does Entity A relate to Entity B?"
Relationship 1: Books β Authors
Question: Can one book have multiple authors? YES (many textbooks do)
Question: Can one author write multiple books? YES (most authors do)
Relationship Type: MANY-TO-MANY
Solution: Create a junction table called book_authors
Relationship 2: Members β Loans β Books
Question: Can one member have multiple loans? YES (borrow multiple books)
Question: Does each loan belong to one member? YES
Relationship: ONE member β MANY loans
Question: Can one book be in multiple loans (over time)? YES (loan history)
Question: Does each loan involve one book? YES
Relationship: ONE book β MANY loans
members (1) ββ (many) loans (many) ββ (1) books
6. Step 5: Choose Primary and Foreign Keys
Primary Keys (Already Decided):
- books: book_id
- authors: author_id
- members: member_id
- loans: loan_id
Foreign Keys (Connect Tables):
book_authors Table (Junction for Many-to-Many)
| π book_id (FK) | π author_id (FK) |
|---|---|
| 1 | 101 |
| 1 | 102 |
| 2 | 101 |
Composite primary key: (book_id, author_id)
loans Table (Updated with Foreign Keys)
| π loan_id | π member_id (FK) | π book_id (FK) | checkout_date | due_date | return_date |
|---|---|---|---|---|---|
| 1001 | 5001 | 1 | 2024-01-15 | 2024-01-29 | NULL |
7. Step 6: Apply Normalization
Let's verify our design is in 3NF:
β 1NF Check: Atomic Values
- β No multi-valued cells (like "author1, author2")
- β No repeating columns (like author1, author2, author3)
- β We created book_authors junction table instead
β 2NF Check: No Partial Dependencies
- β books: All columns depend on book_id (single key)
- β authors: All columns depend on author_id (single key)
- β members: All columns depend on member_id (single key)
- β loans: All columns depend on loan_id (single key)
- β book_authors: Composite key, but no extra columns to have partial dependencies
β 3NF Check: No Transitive Dependencies
- β No column depends on another non-key column
- β All relationships go through foreign keys, not embedded data
- β Example: loan doesn't store member_name, just member_id (FK)
π Our design is in 3NF! No anomalies, no redundancy, clean structure!
8. The Complete Schema
Visual Schema Diagram:
π book_id
title
isbn
publication_year
publisher
π book_id (FK)
π author_id (FK)
π author_id
name
country
π member_id
name
phone
join_date
π loan_id
π member_id (FK)
π book_id (FK)
checkout_date
due_date
return_date
(connected via FK)
9. SQL Implementation
Here's the actual SQL to create this schema:
-- Books table
CREATE TABLE books (
book_id INT PRIMARY KEY AUTO_INCREMENT,
title VARCHAR(255) NOT NULL,
isbn VARCHAR(20) UNIQUE,
publication_year INT,
publisher VARCHAR(255)
);
-- Authors table
CREATE TABLE authors (
author_id INT PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(255) NOT NULL,
country VARCHAR(100)
);
-- Junction table for many-to-many relationship
CREATE TABLE book_authors (
book_id INT,
author_id INT,
PRIMARY KEY (book_id, author_id),
FOREIGN KEY (book_id) REFERENCES books(book_id),
FOREIGN KEY (author_id) REFERENCES authors(author_id)
);
-- Members table
CREATE TABLE members (
member_id INT PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(255) NOT NULL,
email VARCHAR(255) UNIQUE NOT NULL,
phone VARCHAR(20),
join_date DATE DEFAULT CURRENT_DATE
);
-- Loans table
CREATE TABLE loans (
loan_id INT PRIMARY KEY AUTO_INCREMENT,
member_id INT NOT NULL,
book_id INT NOT NULL,
checkout_date DATE NOT NULL,
due_date DATE NOT NULL,
return_date DATE,
FOREIGN KEY (member_id) REFERENCES members(member_id),
FOREIGN KEY (book_id) REFERENCES books(book_id)
);
10. Validating Your Design
Test your schema against requirements:
| Requirement | How Schema Supports It |
|---|---|
| Track books with multiple authors | β book_authors junction table |
| Members borrow multiple books | β One member β many loans |
| Track checkout and return dates | β loans table has all date fields |
| See borrowing history | β loans table keeps all historical records |
| One book, one borrower at a time | β Application logic: check return_date is NULL |
11. Common Design Patterns
Pattern 1: Many-to-Many β Junction Table
Whenever you have a many-to-many relationship, create a junction table with foreign keys to both sides.
Pattern 2: One-to-Many β Foreign Key
Put the foreign key on the "many" side. (Example: loans has member_id FK)
Pattern 3: Lookup Tables
If you have repeated values (like "category"), create a separate categories table and reference it with FK.
Pattern 4: Timestamp Tracking
Add created_at and updated_at columns to most tables for auditing.
12. π― Hands-On Practice Exercises
Now it's your turn! Apply what you've learned to design database schemas for real-world scenarios.
Exercise 1: Student Course Registration System
Requirements:
- Students can enroll in multiple courses
- Each course can have multiple students
- Track enrollment date and grade for each student-course pair
- Students have: student_id, name, email, major, enrollment_year
- Courses have: course_id, course_code, title, credits, department
Your Task: Design the schema. How many tables do you need? What are the primary and foreign keys?
Show Solution
Tables Needed: 3
1. students
CREATE TABLE students (
student_id INT PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(255) NOT NULL,
email VARCHAR(255) UNIQUE NOT NULL,
major VARCHAR(100),
enrollment_year INT
);
2. courses
CREATE TABLE courses (
course_id INT PRIMARY KEY AUTO_INCREMENT,
course_code VARCHAR(20) UNIQUE NOT NULL,
title VARCHAR(255) NOT NULL,
credits INT NOT NULL,
department VARCHAR(100)
);
3. enrollments (Junction Table)
CREATE TABLE enrollments (
enrollment_id INT PRIMARY KEY AUTO_INCREMENT,
student_id INT NOT NULL,
course_id INT NOT NULL,
enrollment_date DATE NOT NULL,
grade VARCHAR(2),
FOREIGN KEY (student_id) REFERENCES students(student_id),
FOREIGN KEY (course_id) REFERENCES courses(course_id),
UNIQUE(student_id, course_id) -- Prevent duplicate enrollments
);
Key Points:
- β Many-to-many relationship requires junction table
- β enrollments stores the relationship-specific data (date, grade)
- β UNIQUE constraint prevents a student from enrolling in the same course twice
- β Foreign keys enforce referential integrity
- β Design is in 3NF (no redundancy, no anomalies)
Exercise 2: E-Commerce Order System
Requirements:
- Customers place orders
- Each order contains multiple products
- Track quantity and price-at-time-of-purchase for each product in an order
- Customers have: customer_id, name, email, address
- Products have: product_id, name, description, current_price, stock_quantity
- Orders have: order_id, order_date, total_amount, status
Your Task: Design the complete schema. Pay attention to how you handle price (products have current_price, but orders need to remember the price at purchase time).
Show Solution
Tables Needed: 4
1. customers
CREATE TABLE customers (
customer_id INT PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(255) NOT NULL,
email VARCHAR(255) UNIQUE NOT NULL,
address TEXT
);
2. products
CREATE TABLE products (
product_id INT PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(255) NOT NULL,
description TEXT,
current_price DECIMAL(10, 2) NOT NULL,
stock_quantity INT NOT NULL DEFAULT 0
);
3. orders
CREATE TABLE orders (
order_id INT PRIMARY KEY AUTO_INCREMENT,
customer_id INT NOT NULL,
order_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
total_amount DECIMAL(10, 2) NOT NULL,
status VARCHAR(50) NOT NULL DEFAULT 'pending',
FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);
4. order_items (Junction Table)
CREATE TABLE order_items (
order_item_id INT PRIMARY KEY AUTO_INCREMENT,
order_id INT NOT NULL,
product_id INT NOT NULL,
quantity INT NOT NULL,
price_at_purchase DECIMAL(10, 2) NOT NULL, -- Critical!
FOREIGN KEY (order_id) REFERENCES orders(order_id),
FOREIGN KEY (product_id) REFERENCES products(product_id)
);
Key Design Decisions:
- β price_at_purchase: Stores historical price (critical for order integrity!)
- β Products have current_price (can change), but orders remember purchase price
- β One-to-many: customer β orders
- β Many-to-many: orders β products (through order_items)
- β order_items is a junction table with additional data (quantity, price_at_purchase)
- β No redundancy: customer info stored once, product info stored once
β οΈ Common Mistake: DON'T reference products.current_price in reports! The price may have changed since the order was placed. Always use order_items.price_at_purchase for historical accuracy.
Exercise 3: Hospital Patient Management
Requirements:
- Patients have appointments with doctors
- Each appointment is on a specific date/time and has notes
- Patients can have multiple appointments (with same or different doctors)
- Doctors can see multiple patients
- Track appointment status (scheduled, completed, cancelled)
- Patients have: patient_id, name, date_of_birth, phone, address
- Doctors have: doctor_id, name, specialty, license_number
Your Task: Design the schema. What type of relationship exists between patients and doctors?
Show Solution
Tables Needed: 3
1. patients
CREATE TABLE patients (
patient_id INT PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(255) NOT NULL,
date_of_birth DATE NOT NULL,
phone VARCHAR(20),
address TEXT
);
2. doctors
CREATE TABLE doctors (
doctor_id INT PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(255) NOT NULL,
specialty VARCHAR(100),
license_number VARCHAR(50) UNIQUE NOT NULL
);
3. appointments (Junction Table)
CREATE TABLE appointments (
appointment_id INT PRIMARY KEY AUTO_INCREMENT,
patient_id INT NOT NULL,
doctor_id INT NOT NULL,
appointment_datetime DATETIME NOT NULL,
status VARCHAR(20) NOT NULL DEFAULT 'scheduled',
notes TEXT,
FOREIGN KEY (patient_id) REFERENCES patients(patient_id),
FOREIGN KEY (doctor_id) REFERENCES doctors(doctor_id)
);
Analysis:
- β Many-to-many relationship: patients β doctors
- β appointments is the junction table connecting them
- β Junction table stores relationship-specific data (datetime, status, notes)
- β No UNIQUE constraint on (patient_id, doctor_id) because same patient can have multiple appointments with same doctor
- β Design supports appointment history
π‘ Real-World Extension: In production, you might add a departments table (doctors belong to departments) and a medications table (appointments prescribe medications). This shows how schemas grow organically!
π₯ Challenge Exercise: Social Media Platform
Requirements:
- Users can create posts
- Users can follow other users
- Users can like posts
- Users can comment on posts
- Comments can have replies (comments on comments)
- Track timestamps for all actions
Your Task: Design the complete schema. This is complexβthink carefully about relationships!
Show Solution
Tables Needed: 6
1. users
CREATE TABLE users (
user_id INT PRIMARY KEY AUTO_INCREMENT,
username VARCHAR(50) UNIQUE NOT NULL,
email VARCHAR(255) UNIQUE NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
2. posts
CREATE TABLE posts (
post_id INT PRIMARY KEY AUTO_INCREMENT,
user_id INT NOT NULL,
content TEXT NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (user_id) REFERENCES users(user_id)
);
3. follows (Self-referencing Many-to-Many)
CREATE TABLE follows (
follower_id INT NOT NULL,
following_id INT NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (follower_id, following_id),
FOREIGN KEY (follower_id) REFERENCES users(user_id),
FOREIGN KEY (following_id) REFERENCES users(user_id),
CHECK (follower_id != following_id) -- Can't follow yourself
);
4. likes
CREATE TABLE likes (
user_id INT NOT NULL,
post_id INT NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (user_id, post_id),
FOREIGN KEY (user_id) REFERENCES users(user_id),
FOREIGN KEY (post_id) REFERENCES posts(post_id)
);
5. comments
CREATE TABLE comments (
comment_id INT PRIMARY KEY AUTO_INCREMENT,
user_id INT NOT NULL,
post_id INT NOT NULL,
parent_comment_id INT, -- NULL = top-level, otherwise reply
content TEXT NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (user_id) REFERENCES users(user_id),
FOREIGN KEY (post_id) REFERENCES posts(post_id),
FOREIGN KEY (parent_comment_id) REFERENCES comments(comment_id)
);
Advanced Concepts in This Design:
- β Self-referencing table: follows references users twice (follower & following)
- β Self-referencing foreign key: comments.parent_comment_id references comments.comment_id
- β CHECK constraint: Prevents users from following themselves
- β Composite primary keys: follows and likes use two columns as PK
- β Nullable FK: parent_comment_id can be NULL (top-level comments)
- β All relationships properly normalized (3NF)
π If you understood this schema, you're ready for professional database design! This pattern appears in many real-world applications: social media, forums, organizational hierarchies, file systems, and more.
π Key Takeaways
- Follow the 6-step process: Requirements β Entities β Attributes β Relationships β Keys β Normalize
- Look for nouns to find entities (things you manage)
- Every entity needs a primary key (usually auto-increment ID)
- Foreign keys create relationships between tables
- Many-to-many relationships require a junction table
- Apply 1NF, 2NF, 3NF to eliminate anomalies
- Validate your design against original requirements
- Use standard naming conventions (table_name, column_name)
- Design for data integrity, not just current features
- Good design is iterativeβrefine as you learn more
π‘ Final Tip: Start simple and add complexity only when needed. A well-normalized 3NF database that's easy to understand beats an over-engineered complex schema every time!
π What's Next?
You now understand how to design normalized database schemas from scratch. But how do you enforce data integrity? How do you ensure that invalid data never enters your database in the first place? That's where constraints come in.
Up next: Database Constraints (Primary Keys, Foreign Keys, NOT NULL, UNIQUE, and CHECK)