ERD (Entity-Relationship Diagram) is a visual representation of the DB structure and gives us a visual representation on how our different entities (tables) relate to each other.
We will be using https://www.lucidchart.com/ to create our RDBMS ERD design.
How does ERD help us?
It helps us visualize the DB structure and understand relationships between tables. We can plan our queries & data access patterns and makes it convenient to communicate the DB design with our team members and identify potential issues before implementation.
When we think about reviews, the core are creators and their content. This will lead us to 2 of our most fundamental entities: USERS and REVIEWS.
USERS and REVIEWS have a direct relationship, in which 1 user can create MANY posts.
Now, if we put in more thought and think about how readers interact with reviews. Readers don’t passively consume content - they engage in it. This brings us to our first layer of interaction features:
Next, we need to think about content organization. When we visit any review, we notice that content is organized in multiple ways to help readers find what they’re looking for.
We made these as a one-to-one relationship with REVIEWS because each review is for 1 specific model
Tags is many-to-many relationship as 1 review can belong to multiple categories (a post could be both “Fuel Efficient” and “Family Friendly”). Also, 1 category can contain MANY posts. Similarly, 1 post can have multiple tags, and 1 tag can be used on many posts.
For enhanced user engagement, we added social features:
Finally, we considered content management with the MEDIA table. Modern reviews aren’t just text- they include images as well. The MEDIA table connects to USERS (who uploaded the files) and can be referenced in POSTS
Each table has carefully chosen fields:
The BMW_MODELS table structure allows us to have sophisticated querying. For instance, to find all reviews for BMW i5 from 2024. we can apply the following SQP query
SELECT r.*, b.*
FROM REVIEWS r
JOIN BMW_MODELS b ON r.model_id = b.model_id
WHERE b.series = 'i5'
AND b.year = 2024;
With this structure we can gain insights about models and user engagement
-- Average rating by model and year
SELECT
b.model_name,
b.year,
AVG(r.rating) as avg_rating,
COUNT(r.review_id) as review_count
FROM BMW_MODELS b
LEFT JOIN REVIEWS r ON b.model_id = r.model_id
GROUP BY b.model_name, b.year;
ALTER TABLE REVIEWS
ADD CONSTRAINT fk_review_model
FOREIGN KEY (model_id)
REFERENCES BMW_MODELS(model_id)
ON DELETE RESTRICT;
We can add new features which can be easily accomodated.
A. Adding New BMW Models
ALTER TABLE BMW_MODELS
ADD COLUMN body_type VARCHAR(50),
ADD COLUMN electric BOOLEAN;
B. Enhanced Review Features:
CREATE TABLE RATING_CATEGORIES (
category_id INT PRIMARY KEY,
name VARCHAR(50) -- e.g., 'Performance', 'Comfort', 'Value'
);
CREATE TABLE REVIEW_DETAILED_RATINGS (
review_id INT,
category_id INT,
rating INT,
PRIMARY KEY (review_id, category_id),
FOREIGN KEY (review_id) REFERENCES REVIEWS(review_id),
FOREIGN KEY (category_id) REFERENCES RATING_CATEGORIES(category_id)
);
C. Social Features Expansion:
CREATE TABLE REVIEWER_EXPERTISE (
user_id INT,
expertise_level VARCHAR(50),
verified_date TIMESTAMP,
verification_proof TEXT,
PRIMARY KEY (user_id),
FOREIGN KEY (user_id) REFERENCES USERS(user_id)
);
The structure allows for efficient indexing:
-- For faster model searches
CREATE INDEX idx_bmw_models_year_series
ON BMW_MODELS(year, series);
-- For quick rating lookups
CREATE INDEX idx_reviews_rating
ON REVIEWS(rating);
Think of our DB like a massive library of BMW reviews. Without any organization system (indexes), finding a specific review would require looking through every single book (full table scan) - now imagine walking through every aisle, checking each book one by one. This is incredibly inefficient, especially as the library grows larger.
We can create indexes on frequently accessed columns.
CREATE INDEX idx_bmw_models_year_series ON BMW_MODELS(year, series);
This is like creating a card catalog in our library, organized by year and series. Now, when someone wants to find all 2024 i5 reviews, instead of checking every review, the DB can:
Without Index
EXPLAIN SELECT * FROM REVIEWS r
JOIN BMW_MODELS b ON r.model_id = b.model_id
WHERE b.year = 2024 AND b.series = 'i5';
-- This might scan all rows in both tables
With Index
CREATE INDEX idx_bmw_models_year_series ON BMW_MODELS(year, series);
-- After creating the index
EXPLAIN SELECT * FROM REVIEWS r
JOIN BMW_MODELS b ON r.model_id = b.model_id
WHERE b.year = 2024 AND b.series = 'i5';
-- This uses the index to find matching rows immediately
Our one-to-one relationships between REVIEWS and BMW_MODELS is particularly efficient. When a user views a review, we need to fetch both the review content and the car details. Because each review is linked to exactly one model through model_id, the DB can quickly locate the matching records without scanning multiple possible matches.
-- This join operation is efficient because:
-- 1. model_id is the primary key in BMW_MODELS
-- 2. It's a foreign key in REVIEWS
-- 3. Each review has exactly one model
SELECT r.title, r.content, r.rating,
b.model_name, b.year, b.specifications
FROM REVIEWS r
JOIN BMW_MODELS b ON r.model_id = b.model_id
WHERE r.review_id = 123;
Thus,
The ‘enum’ status in REVIEWS helps maintain content quality:
ALTER TABLE REVIEWS
ADD CONSTRAINT valid_status
CHECK (status IN ('draft', 'pending', 'published', 'archived'));
This structure is efficient in that