Something went wrong!
Hang in there while we get back on track
Social Media App Schema Evaluation
Evaluates a user-submitted social media schema with users, posts, follows, and likes tables. Demonstrates comprehensive feedback including missing fields, feed generation strategies, relationship cardinality, and PM-relevant trade-offs.
Input
What you provide to the skill
Evaluate my schema for a social media app: users, posts, follows, likes tables with user_id FKs
Output
What the skill generates for you
SCHEMA DESIGN FEEDBACK
What You Did Well
✅ Core entity separation: You correctly identified the four main entities (users, posts, follows, likes) needed for a basic social media platform
✅ Relationship structure awareness: Using user_id foreign keys shows you understand that entities need to reference each other
✅ Standard social media patterns: Following the conventional approach of separate tables for follows and likes rather than embedding in posts
Areas to Consider
1. Missing Critical Fields
What you did: Mentioned tables and user_id FKs, but didn’t specify other key fields
The issue:
- Users table needs authentication fields (email, password_hash, username)
- Posts need content fields (text, media_urls, created_at)
- Follows needs both follower_id and followed_id (not just user_id)
- Likes needs both user_id AND post_id to represent the relationship
Better approach:
users
├─ id (PK)
├─ username (unique)
├─ email (unique)
├─ password_hash
├─ bio, profile_pic_url
├─ created_at
posts
├─ id (PK)
├─ user_id (FK → users)
├─ content (text)
├─ media_urls (array or separate media table)
├─ like_count (denormalized for performance)
├─ created_at
follows
├─ id (PK)
├─ follower_id (FK → users)
├─ followed_id (FK → users)
├─ created_at
└─ UNIQUE(follower_id, followed_id)
likes
├─ id (PK)
├─ user_id (FK → users)
├─ post_id (FK → posts)
├─ created_at
└─ UNIQUE(user_id, post_id)
PM Trade-off:
Denormalizing like_count on posts table trades storage/consistency for read speed—feed queries don’t need to COUNT() every time.
2. Feed Generation Strategy
What you did: Basic schema structure
Missing consideration:
- How do you generate a user’s feed of posts from people they follow?
- Naive approach: JOIN follows → posts → users (slow at scale)
- Feed generation is the #1 performance bottleneck in social media apps
Better approach - Fan-out on write:
feed_items
├─ id (PK)
├─ user_id (FK → users) // whose feed this appears in
├─ post_id (FK → posts)
├─ created_at
└─ INDEX(user_id, created_at DESC)
When user A posts, write to feed_items for all followers. Reading feed = simple query on user’s feed_items.
PM Trade-off:
- Fan-out on write: Fast reads (critical for user experience), slower writes (acceptable delay)
- Fan-out on read: Fast writes, slow reads (bad UX when opening feed)
- Hybrid: Fan-out on write for users with <100k followers, on-demand aggregation for celebrities
3. Relationship Cardinality Clarity
What you did: “user_id FKs” in follows and likes
The issue:
- Follows is a many-to-many self-referencing relationship on users
- Requires TWO foreign keys: follower_id and followed_id, both pointing to users
- Likes is a many-to-many between users and posts
- Requires user_id and post_id
Better approach:
Be explicit about relationship types:
- users → posts: one-to-many (one user creates many posts)
- users → follows (as follower): many-to-many (many users follow many users)
- users → likes: many-to-many (many users like many posts)
- posts → likes: one-to-many (one post has many likes)
PM insight: In interviews, naming both sides of many-to-many relationships shows you understand data flows in both directions (e.g., “who follows me” vs “who do I follow”).
4. Comments and Nested Interactions
What you did: Posts, follows, likes only
Missing consideration:
Most social media apps need comments, which have their own complexity:
- Comments on posts (potentially nested/threaded)
- Likes on comments
- Replies to comments
Better approach:
comments
├─ id (PK)
├─ post_id (FK → posts)
├─ user_id (FK → users)
├─ parent_comment_id (FK → comments, nullable for top-level)
├─ content (text)
├─ like_count (denormalized)
├─ created_at
PM Trade-off:
Threaded comments (parent_comment_id) increase engagement but complicate feed rendering and notification logic.
5. Indexing Strategy
What you did: Defined tables
Missing consideration:
Schema isn’t complete without indexes for common queries:
Critical indexes:
follows(follower_id)- “Who do I follow?”follows(followed_id)- “Who follows me?”posts(user_id, created_at DESC)- “User’s posts, newest first”likes(post_id)- “All likes on a post”likes(user_id, post_id)- “Has this user liked this post?” (for UI state)
PM insight: Indexes are a product decision—they speed up specific queries but slow writes. Index the queries your users run most often.
Improved Schema + Trade-offs
CORE DATA (PostgreSQL)
======================
users
├─ id (PK)
├─ username (UNIQUE)
├─ email (UNIQUE)
├─ password_hash
├─ bio, profile_pic_url
├─ follower_count, following_count (denormalized)
└─ created_at
posts
├─ id (PK)
├─ user_id (FK → users)
├─ content (text, max 280 chars for Twitter-style)
├─ media_urls (JSON array or separate media table)
├─ like_count, comment_count (denormalized)
├─ created_at
└─ INDEX(user_id, created_at DESC)
follows (many-to-many self-join)
├─ id (PK)
├─ follower_id (FK → users)
├─ followed_id (FK → users)
├─ created_at
├─ UNIQUE(follower_id, followed_id)
├─ INDEX(follower_id)
└─ INDEX(followed_id)
likes (many-to-many junction)
├─ id (PK)
├─ user_id (FK → users)
├─ post_id (FK → posts)
├─ created_at
├─ UNIQUE(user_id, post_id)
├─ INDEX(post_id)
└─ INDEX(user_id, post_id)
comments
├─ id (PK)
├─ post_id (FK → posts)
├─ user_id (FK → users)
├─ parent_comment_id (FK → comments, nullable)
├─ content (text)
├─ like_count (denormalized)
├─ created_at
└─ INDEX(post_id, created_at)
FEED OPTIMIZATION (Redis or PostgreSQL)
========================================
feed_items
├─ id (PK)
├─ user_id (FK → users) // whose feed
├─ post_id (FK → posts)
├─ created_at
└─ INDEX(user_id, created_at DESC)
Trade-off Summary:
- Denormalized counts (follower_count, like_count): Fast reads, eventual consistency on writes
- Feed_items fan-out: Instant feed loads, write amplification for popular users
- Composite indexes: Speed up common queries, increase storage by ~15-30%
- Self-referencing follows table: Clean model, requires two indexes (follower, followed)
Key PM Insights
-
Social media is read-heavy: Users read feeds 100x more than they post. Optimize for reads (denormalize counts, pre-compute feeds) even if writes get slower.
-
Many-to-many requires junction tables: Follows and likes are both many-to-many relationships. Each needs a dedicated table with two foreign keys and a composite unique constraint.
-
Feed generation is the hardest problem: The naive JOIN approach doesn’t scale. Fan-out on write (pre-compute feeds) is the industry standard for good UX.
-
Indexes are product decisions: Every index speeds up specific queries but slows writes. Index what users do most (view feed, check likes) not admin operations (view all posts ever).
How to Discuss in Interviews
Good answer structure:
- Start with core entities and relationships (show structured thinking)
- Choose database type with reasoning (show trade-off awareness)
- Identify scaling challenges and solutions (show production mindset)
- Discuss one normalization trade-off (show nuance)
Example: “I’d start with four core tables: users, posts, follows, and likes. Follows is a many-to-many self-join on users with follower_id and followed_id. Likes is a many-to-many junction between users and posts. I’d use PostgreSQL for ACID guarantees and strong relationship modeling. For scaling, the biggest challenge is feed generation—joining follows to posts gets slow at scale. I’d implement fan-out on write by maintaining a feed_items table that pre-computes each user’s feed when someone they follow posts. This trades write speed for read speed, which is the right trade-off since users read feeds 100x more than they post. I’d also denormalize like_count onto posts to avoid expensive COUNTs, accepting eventual consistency for better UX.”
About This Skill
Practice database schema design for PM interviews with realistic product scenarios and detailed feedback.
View Skill DetailsMore Examples
E-Commerce Marketplace Schema Practice
Generates a realistic multi-seller e-commerce marketplace scenario with inventory, orders, payments, and reviews. Tests entity modeling for complex marketplace relationships including multi-seller orders and commission-based payments.
Advanced Event Sourcing and CQRS Scenario
Generates an advanced banking platform scenario requiring event sourcing with CQRS architecture. Tests understanding of write-side events vs read-side projections, event versioning, schema evolution, and regulatory audit requirements.