You're scrolling through Facebook when you see it—someone types "@" and instantly gets a dropdown of friends to mention. Click a name, they get notified. Simple, elegant, instant.
Your company's CEO sees the same thing. "We need this," she says. "Our users should be able to mention each other too."
You think: "Yeah, this looks straightforward. Just detect @username and send a notification, right?"
But here's what you don't see when you use Facebook's mentions:
The real-time search index scanning 3 billion users in under 50ms
The privacy engine checking 47 different permission rules before showing a name
The notification system handling 500 million mentions per day
The spam filter blocking 50 million fake mention attempts
The distributed architecture spanning 15 data centers
What looks simple on the surface is actually one of the most complex features in modern social platforms.
Let me walk you through building this feature—step by step—so you understand why that "simple" @mention costs Facebook millions of dollars a year to run.
Week 1: The Naive Approach
You start coding. The logic seems obvious:
What You Think It Takes
graph TD
A[User types @john] --> B[Find 'john' in database]
B --> C[Show john's profile link]
C --> D[Send notification to john]
D --> E[Done! Ship it!]
style E fill:#90EE90
Your Mental Model:
Step 1: Parse the text for anything starting with @
Step 2: Look up that username in the users table
Step 3: Create a clickable link
Step 4: Insert a notification: "Alice mentioned you"
You estimate: 2 days of work.
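The naive mental model fits in a few lines. Here is a minimal sketch of it, with in-memory dicts standing in for the users and notifications tables (all names are illustrative):

```python
import re

# Naive first pass: pull out @mentions and record a notification per match.
MENTION_RE = re.compile(r"@(\w+)")

# Hypothetical stand-ins for the users and notifications tables.
USERS = {"john": 12345, "sarah": 67890}
NOTIFICATIONS = []

def create_post(author, text):
    """Parse @mentions and notify each mentioned user that exists."""
    for username in MENTION_RE.findall(text):
        user_id = USERS.get(username)
        if user_id is not None:
            NOTIFICATIONS.append({"user_id": user_id,
                                  "message": f"{author} mentioned you"})
    return text

create_post("alice", "Hey @john, check this out")
```

It really is about two days of work, which is exactly why the estimate feels safe.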
The First Implementation
You build a simple flow:
sequenceDiagram
participant User
participant Frontend
participant API
participant Database
User->>Frontend: Types: "Hey @john, check this out"
Frontend->>API: POST /create-post
API->>API: Scan text for @mentions
API->>Database: SELECT * FROM users WHERE username = 'john'
Database-->>API: User found: john (id: 12345)
API->>Database: INSERT INTO notifications
API->>Database: INSERT INTO posts (with mention link)
API-->>Frontend: Success!
Frontend-->>User: Post published
You deploy it on Friday afternoon. It works! Users love it.
Then Monday happens.
Monday Morning: Reality Hits
Your phone explodes with alerts at 6 AM.
The Problems:
Problem 1: The Autocomplete Problem
Users are complaining: "Why do I have to type the exact username?"
On Facebook, when you type "@", you see suggestions. When you type "@jo", you see everyone whose name starts with "jo". It's instant. It's intuitive.
Your system makes users type the complete, exact username. If someone's username is "john_smith_1990", good luck remembering that.
graph TD
A[What Users Expect] --> B[Type: @jo]
B --> C[Instant dropdown with:]
C --> D[John Smith]
C --> E[Joanna Lee]
C --> F[Joseph Wang]
G[What You Built] --> H[Type: @john_smith_1990]
H --> I[Hope you remember the exact username]
style A fill:#90EE90
style G fill:#ffcccc
Why This Is Hard:
You need to search users in real-time as they type. Let's break down what "real-time" actually means:
User types "@j" → Show results in under 100ms
User types "@jo" → Update results in under 100ms
User types "@joh" → Update results in under 100ms
Each keystroke triggers a search. If 10,000 users are typing mentions simultaneously, that's potentially 10,000 database queries per second.
Your current database query:
SELECT * FROM users WHERE username LIKE 'jo%'
Without an index suited to prefix matching, this runs a full table scan on 10 million users and takes 3 seconds. Completely unusable.
Problem 2: The Database Is Melting
Your mentions feature launched. Users love it. They're mentioning each other everywhere—in posts, comments, replies.
Your database CPU: 95%. Response time: 4 seconds. Support tickets: 47.
graph TD
A[10,000 concurrent users] --> B[All typing mentions]
B --> C[Each keystroke = DB query]
C --> D[10,000 queries per second]
D --> E[Database: On fire 🔥]
E --> F[Everything slows down]
F --> G[Users can't load pages]
G --> H[CEO asks: What happened?]
style E fill:#ff6666
style H fill:#ff6666
What's Happening:
Every "@" keystroke hits your main database. Your users table has:
10 million users
No proper indexes for prefix search
No caching layer
No query optimization
You're doing a full table scan on every keystroke. This doesn't scale beyond 100 concurrent users.
Problem 3: You're Mentioning People Who Shouldn't Be Mentioned
Bug report: "I just got mentioned in a private group I'm not part of."
Another: "Someone mentioned me in a comment, but I blocked them."
Another: "I got mentioned in an internal company post, but I'm a customer, not an employee."
graph TD
A[Alice mentions @bob] --> B[In a private group]
C[Bob is NOT in that group] --> D[Bob gets notification anyway]
E[Charlie mentions @david] --> F[But David blocked Charlie]
G[David gets notification anyway] --> H[Privacy violation]
style D fill:#ffcccc
style H fill:#ffcccc
The Privacy Nightmare:
You built a feature without considering:
Privacy settings (who can mention whom?)
Block lists (blocked users shouldn't get notifications)
Group membership (can't mention people outside the group)
Account visibility (private accounts, deactivated accounts)
Role-based access (employees vs customers vs partners)
This isn't just a bug. This is a privacy violation. In some jurisdictions, this could be illegal.
Week 2: Rebuilding the Foundation
You realize this needs a proper architecture. Let's solve these problems one by one.
Solution 1: Real-Time Autocomplete Architecture
You need to search millions of users in milliseconds. Here's how.
The New Search Architecture
graph TB
subgraph "User's Browser"
A[User types @jo]
end
subgraph "API Layer"
B[API Gateway]
C[Rate Limiter]
end
subgraph "Search Layer"
D[Elasticsearch Cluster]
E[Search Index]
F[Fuzzy Matching]
end
subgraph "Cache Layer"
G[Redis Cache]
H[Recent Searches]
end
subgraph "Data Layer"
I[(User Database)]
end
A --> B
B --> C
C --> G
G --> |Cache miss| D
D --> E
E --> F
F --> |Index rebuild| I
style D fill:#ffffcc
style G fill:#ccffff
How It Works:
Step 1: Elasticsearch for Fast Search
Instead of querying your main database, you build a dedicated search index:
graph LR
A[User Database] --> B[Change Data Capture]
B --> C[Kafka Stream]
C --> D[Elasticsearch Indexer]
D --> E[Search Index]
F[Search Query: @jo] --> E
E --> G[Results in 20ms]
style E fill:#90EE90
style G fill:#90EE90
What Gets Indexed:
Username
Display name
Email (for finding coworkers)
Alternate names/nicknames
Search keywords
Why Elasticsearch?
Built for full-text search
Handles prefix matching efficiently
Supports fuzzy matching (handles typos)
Can scale to billions of documents
Returns results in milliseconds
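To make this concrete, here is a sketch of the kind of request you might send to Elasticsearch for autocomplete. The field names ("username", "display_name") and the exact query shape are assumptions, not a prescribed schema; the point is that prefix matching is a first-class query type rather than a `LIKE` scan:

```python
def autocomplete_query(prefix, size=10):
    """Build an Elasticsearch-style prefix query for mention autocomplete.

    Field names are illustrative; a real index would likely use an
    edge-ngram analyzer or completion suggester instead.
    """
    return {
        "size": size,
        "query": {
            "bool": {
                "should": [
                    # Exact-prefix match on the (lowercased) username field.
                    {"prefix": {"username": {"value": prefix.lower()}}},
                    # Prefix match on the last token of the display name.
                    {"match_bool_prefix": {"display_name": prefix}},
                ]
            }
        },
    }

query_body = autocomplete_query("Jo")
```

The query body would be POSTed to the cluster's `_search` endpoint; what matters here is that the index, not your primary database, absorbs the keystroke traffic.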
Step 2: Multi-Tier Caching
Even Elasticsearch can't handle 100,000 queries per second from the same popular search terms.
graph TD
A[User types @john] --> B{Check Browser Cache}
B --> |Hit| C[Return cached results - 0ms]
B --> |Miss| D{Check Redis}
D --> |Hit| E[Return cached results - 2ms]
D --> |Miss| F{Check Elasticsearch}
F --> G[Return fresh results - 20ms]
G --> H[Update caches]
style C fill:#90EE90
style E fill:#ffffcc
style G fill:#ffd699
Three Cache Layers:
Browser Cache (0ms latency)
Stores recent mentions for 5 minutes
Perfect for when users repeatedly mention the same people
Redis Cache (2ms latency)
Stores popular search results
Key: search term, Value: list of matching users
TTL: 30 seconds
Elasticsearch (20ms latency)
Full search capability
Updated in real-time via Kafka
Step 3: Smart Ranking
When you type "@jo", there might be 10,000 matching users. Which ones do you show first?
graph TD
A[Search: @jo] --> B[10,000 matches found]
B --> C[Ranking Algorithm]
C --> D[Score: Frequency<br/>How often do you mention them?]
C --> E[Score: Recency<br/>When did you last mention them?]
C --> F[Score: Relationship<br/>Are they your friend/follower?]
C --> G[Score: Activity<br/>Are they active users?]
C --> H[Score: Context<br/>Are they in this group/thread?]
D --> I[Weighted Score]
E --> I
F --> I
G --> I
H --> I
I --> J[Top 10 Results]
style J fill:#90EE90
Ranking Formula:
Each user gets a score:
Score = (Frequency × 3) + (Recency × 5) + (Relationship × 2) + (Activity × 1) + (Context × 10)
Why Context Weighs Most:
If you're in a group chat, members of that group appear first
If you're commenting on Alice's post, Alice appears first
If you're in a work channel, coworkers appear first
This is why Facebook's mentions feel "smart"—they're predicting who you want to mention.
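Applying the weighted formula above is straightforward once each signal is normalized. A sketch, assuming signal values in the 0 to 1 range (the candidate structure is illustrative):

```python
# Weights from the formula above; signal values assumed normalized to 0..1.
WEIGHTS = {"frequency": 3, "recency": 5, "relationship": 2,
           "activity": 1, "context": 10}

def mention_score(signals):
    """Weighted sum of whichever signals are present; missing ones count 0."""
    return sum(WEIGHTS[name] * signals.get(name, 0) for name in WEIGHTS)

def rank_candidates(candidates, top_n=10):
    """Order autocomplete candidates by weighted score, best first."""
    return sorted(candidates, key=lambda c: mention_score(c["signals"]),
                  reverse=True)[:top_n]

candidates = [
    {"name": "John Smith", "signals": {"frequency": 0.9}},   # score 2.7
    {"name": "Joanna Lee", "signals": {"context": 0.5}},     # score 5.0
]
ranked = rank_candidates(candidates)
```

Note how a modest context signal beats a strong frequency signal, which is exactly the "group members first" behavior described above.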
The Performance Numbers
graph LR
A[Before: Database Query] --> B[3000ms per search]
C[After: Elasticsearch + Cache] --> D[20ms per search]
E[150x faster]
style B fill:#ffcccc
style D fill:#90EE90
style E fill:#90EE90But now you have a new problem: keeping the search index in sync with your database.
The Synchronization Challenge
sequenceDiagram
participant User
participant Database
participant CDC
participant Kafka
participant Elasticsearch
User->>Database: Update profile name
Database->>Database: Write to users table
Database->>CDC: Detect change
CDC->>Kafka: Publish user.updated event
Kafka->>Elasticsearch: Update search index
Note over Database,Elasticsearch: Propagation delay: 100-500ms
User->>Elasticsearch: Search for new name
Elasticsearch-->>User: Old name (not synced yet!)
The Problem: There's a delay between updating your profile and that change appearing in search results. This is called "eventual consistency."
The Trade-off: You could make the update synchronous (wait for Elasticsearch to update before confirming), but then profile updates take 10x longer.
Most platforms choose: Fast updates + eventual consistency > Slow updates + immediate consistency
Users don't notice a 500ms delay in search results. They definitely notice waiting 5 seconds to update their profile.
Solution 2: The Privacy Engine
Now let's tackle the privacy nightmare. Who can mention whom?
The Privacy Rules Matrix
Before showing someone in autocomplete or sending them a notification, you need to check dozens of rules:
graph TD
A[Can Alice mention Bob?] --> B{Is Bob's account active?}
B --> |No| C[❌ Don't show Bob]
B --> |Yes| D{Has Bob blocked Alice?}
D --> |Yes| C
D --> |No| E{Has Alice blocked Bob?}
E --> |Yes| C
E --> |No| F{Privacy Setting: Who can mention me?}
F --> G{Everyone}
F --> H{Friends Only}
F --> I{Nobody}
G --> J{Context Check}
H --> K{Are they friends?}
I --> C
K --> |No| C
K --> |Yes| J
J --> L{Is this a private group?}
L --> |Yes| M{Is Bob a member?}
L --> |No| N[✅ Show Bob]
M --> |No| C
M --> |Yes| N
style C fill:#ffcccc
style N fill:#90EE90
The Rule Categories:
Account Status Rules
Is the account active?
Is it deactivated?
Is it banned/suspended?
Is it a deleted account?
Block Rules
Has either user blocked the other?
Are they on each other's restricted lists?
Privacy Settings
"Who can mention me?" (Everyone / Friends / Nobody)
"Who can see my posts?" (affects visibility)
"Who can look me up?" (affects search)
Context Rules
Private group: Must be a member
Company workspace: Must be an employee
Direct message: Must have permission to DM
Age-restricted content: Age verification required
Relationship Rules
Are they friends/followers?
Do they follow each other?
Have they ever interacted?
The Architecture
You can't check these rules on every keystroke. It would be way too slow.
graph TB
subgraph "Real-time Path"
A[User types @jo] --> B[Query Elasticsearch]
B --> C[Get 100 candidate users]
end
subgraph "Privacy Filter"
C --> D[Load privacy rules from cache]
D --> E[Filter candidates in parallel]
E --> F[Apply account status rules]
E --> G[Apply block rules]
E --> H[Apply privacy settings]
E --> I[Apply context rules]
F --> J[Filtered results]
G --> J
H --> J
I --> J
end
subgraph "Result"
J --> K[Return 10 valid users]
end
style K fill:#90EE90
How to Make This Fast:
Strategy 1: Pre-compute What You Can
graph LR
A[User updates privacy settings] --> B[Background Job]
B --> C[Update privacy cache]
C --> D[Redis: user:123:privacy]
E[Mention search happens] --> F[Read from cache]
F --> D
D --> G[Apply rules in memory]
style G fill:#90EE90
Store privacy settings in Redis with keys like:
user:123:mention_privacy → "friends_only"
user:123:blocked_users → [456, 789, 1011]
group:999:members → [123, 456, 789]
Strategy 2: Fail Fast
Check the simplest rules first:
graph TD
A[100 candidates] --> B[Check account status]
B --> C[20 accounts inactive ❌]
D[80 remaining] --> E[Check blocks]
E --> F[5 blocked users ❌]
G[75 remaining] --> H[Check privacy settings]
H --> I[30 have mentions disabled ❌]
J[45 remaining] --> K[Check context rules]
K --> L[35 not in this group ❌]
M[10 remaining] --> N[✅ Show these users]
style C fill:#ffcccc
style F fill:#ffcccc
style I fill:#ffcccc
style L fill:#ffcccc
style N fill:#90EE90
By checking cheap rules first (cached data), you eliminate most candidates before running expensive checks (database queries).
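The fail-fast filter can be sketched as a single pass that checks the cheapest cached rule first and only reaches the context check for survivors. The `ctx` structure bundling the cached data is a stand-in, not a real API:

```python
def filter_candidates(candidates, ctx):
    """Fail-fast privacy filter: cheapest cached checks first, so the
    more expensive context check only runs on survivors.
    `ctx` is a stand-in for the cached rule data described above."""
    survivors = []
    for user in candidates:
        if user not in ctx["active"]:                 # 1. account status
            continue
        if user in ctx["blocked"]:                    # 2. block lists
            continue
        if ctx["privacy"].get(user) == "nobody":      # 3. privacy setting
            continue
        members = ctx.get("group_members")
        if members is not None and user not in members:
            continue                                  # 4. context rules
        survivors.append(user)
    return survivors

ctx = {
    "active": {1, 2, 3, 4},       # users 5 and 6 are deactivated
    "blocked": {2},               # user 2 blocked the author
    "privacy": {3: "nobody"},     # user 3 disabled mentions
    "group_members": {1, 3, 4, 5},
}
result = filter_candidates([1, 2, 3, 4, 5, 6], ctx)   # only 1 and 4 survive
```

Each rule stage prunes the list, so by the time you reach the context check there is very little left to check.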
Strategy 3: Batch Check Context Rules
Instead of checking "Is Bob in this group?" for each candidate individually:
graph LR
A[10 candidate users] --> B[Single query:<br/>Get all group members]
B --> C[Filter candidates<br/>against member list]
D[Much faster than<br/>10 individual queries]
style C fill:#90EE90
One query to get all group members, then filter in memory. Way faster than 10 separate queries.
The Caching Strategy
graph TB
subgraph "Hot Cache - Redis"
A[User privacy settings<br/>TTL: 5 minutes]
B[Block lists<br/>TTL: 10 minutes]
C[Group memberships<br/>TTL: 1 minute]
end
subgraph "Warm Cache - Application Memory"
D[Recently checked permissions<br/>TTL: 30 seconds]
end
subgraph "Cold Storage - Database"
E[Source of truth<br/>Always consistent]
end
F[Permission Check] --> D
D --> |Miss| A
A --> |Miss| E
style D fill:#90EE90
style A fill:#ffffcc
style E fill:#ffd699
Cache Invalidation Strategy:
When someone blocks a user or changes privacy settings:
Update database (source of truth)
Invalidate cache immediately
Next request fetches fresh data
This means there might be a 1-30 second window where stale data exists. For privacy, that's acceptable—the worst case is someone appears in autocomplete briefly but the notification won't send (because that checks fresh data).
Solution 3: The Notification System
Now the hardest part: actually sending notifications when someone gets mentioned.
The Scale Problem
Let's say you're building Twitter-scale mentions:
500 million mentions per day
That's 5,787 mentions per second
Peak hours: 15,000 mentions per second
You can't just INSERT INTO notifications for each mention. Your database would collapse.
The Architecture Evolution
Phase 1: Synchronous (Doesn't Scale)
sequenceDiagram
participant User
participant API
participant Database
User->>API: Post with @john @sarah @mike
API->>Database: Create post
API->>Database: INSERT notification for john
API->>Database: INSERT notification for sarah
API->>Database: INSERT notification for mike
API-->>User: Success (took 300ms)
Problems:
Every mention = database write
User waits for all notifications to be created
What if one notification fails? Roll back the post?
Database can't handle 15,000 writes per second
Phase 2: Async Queue (Better)
sequenceDiagram
participant User
participant API
participant Queue
participant Worker
participant Database
User->>API: Post with @john @sarah @mike
API->>Database: Create post
API->>Queue: Publish mention event
API-->>User: Success (took 50ms)
Note over Queue: Events queued
Worker->>Queue: Pull events
Worker->>Database: Batch create notifications
Benefits:
User gets instant response
Failures don't affect user experience
Can batch notifications for efficiency
Can scale workers independently
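The async pattern above can be sketched with an in-process queue standing in for Kafka and a list standing in for the notification table (all names are illustrative):

```python
from collections import deque

# In-process stand-ins for the Kafka topic and the notification table.
event_queue = deque()
notifications_db = []

def handle_post(text, mentioned_ids):
    """API path: enqueue one event and return immediately.
    The user never waits on notification writes."""
    event_queue.append({"type": "mention.created", "user_ids": mentioned_ids})
    return {"status": "ok"}

def drain_queue(batch_size=100):
    """Worker path: pull queued events and create notifications
    in one batched write instead of one write per mention."""
    batch = []
    while event_queue and len(batch) < batch_size:
        event = event_queue.popleft()
        batch.extend({"user_id": uid, "type": "mention"}
                     for uid in event["user_ids"])
    notifications_db.extend(batch)
    return len(batch)

handle_post("Hey @john @sarah", mentioned_ids=[12345, 67890])
drain_queue()
```

The API call returns after one cheap enqueue; the worker absorbs the write load on its own schedule and can be scaled independently.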
Phase 3: The Real Architecture
graph TB
subgraph "Ingestion Layer"
A[API Servers] --> B[Kafka Topic:<br/>mention.created]
end
subgraph "Processing Layer"
B --> C[Validation Worker]
C --> D[Privacy Check Worker]
D --> E[Notification Worker]
end
subgraph "Delivery Layer"
E --> F[Notification Service]
F --> G[Push Notification]
F --> H[Email]
F --> I[In-App Bell]
F --> J[SMS - for VIPs]
end
subgraph "Storage Layer"
E --> K[(Notification DB)]
K --> L[Read API]
end
style F fill:#ffffcc
The Event Flow
Step 1: Mention Detection
When a post is created, extract all mentions:
graph LR
A[Text: Hey @john and @sarah,<br/>check out @mike's work] --> B[Regex Parser]
B --> C[Found: john, sarah, mike]
C --> D[Validate usernames exist]
D --> E[Publish event for each valid mention]
style E fill:#90EE90
Event Structure:
{
"event_id": "evt_123",
"type": "mention.created",
"timestamp": "2024-11-20T10:30:00Z",
"data": {
"mentioned_user_id": "12345",
"mentioned_by_user_id": "67890",
"content_type": "post",
"content_id": "post_999",
"context": {
"group_id": "grp_111",
"thread_id": null
}
}
}
Step 2: Privacy Validation
Before sending notification, validate permissions:
graph TD
A[Mention Event] --> B{Privacy Check}
B --> C{User active?}
C --> |No| D[❌ Drop event]
C --> |Yes| E{Blocked?}
E --> |Yes| D
E --> |No| F{Has permission?}
F --> |No| D
F --> |Yes| G{Context valid?}
G --> |No| D
G --> |Yes| H[✅ Create notification]
style D fill:#ffcccc
style H fill:#90EE90
This is where 40% of mention events get filtered out.
Step 3: Notification Creation
sequenceDiagram
participant Worker
participant NotificationDB
participant UserPrefsCache
participant DeliveryQueue
Worker->>UserPrefsCache: Get notification preferences
UserPrefsCache-->>Worker: Email: ON, Push: ON, SMS: OFF
Worker->>NotificationDB: Create notification record
Worker->>DeliveryQueue: Queue email delivery
Worker->>DeliveryQueue: Queue push notification
Note over DeliveryQueue: Separate workers handle<br/>each delivery channel
Step 4: Delivery
Different channels have different priorities:
graph TD
A[Notification Created] --> B{Delivery Channels}
B --> C[In-App<br/>Immediate]
B --> D[Push Notification<br/>Within 1 second]
B --> E[Email<br/>Within 1 minute]
B --> F[SMS<br/>Only for VIPs]
C --> G[WebSocket to connected clients]
D --> H[FCM/APNS]
E --> I[Email queue - batched]
F --> J[Twilio API]
style C fill:#90EE90
style D fill:#ffffcc
style E fill:#ffd699
style F fill:#ffcccc
Handling the Celebrity Problem
What happens when @elonmusk tweets "Thanks @everyone who attended"?
If a user with 100 million followers mentions someone popular, you might need to fan that notification out to millions of people.
graph TD
A[@celebrity mentions @popular_person] --> B[100M followers need notification]
B --> C{Naive Approach}
C --> D[Create 100M notification records]
D --> E[💥 Database explodes]
B --> F{Smart Approach}
F --> G[Create 1 notification template]
G --> H[Lazy loading on read]
H --> I[Generate notification on demand]
style E fill:#ffcccc
style I fill:#90EE90
The Solution: Lazy Materialization
Instead of creating 100M records:
Store one "event" record
When user opens their notifications, check: "Are there any events for me?"
Generate notification UI on the fly
Mark as "seen" per user in a separate table
This is why Twitter can handle viral tweets without melting their infrastructure.
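A minimal sketch of lazy materialization, with lists and sets standing in for the event store and the per-user "seen" table (the audience-check callback is an assumption about how membership would be tested):

```python
# One event row instead of millions of notification rows; per-user
# notifications are generated only when the user opens their inbox.
events = []        # (event_id, audience_check, payload)
seen = set()       # (event_id, user_id) pairs already read

def publish(event_id, audience_check, payload):
    """Store a single event record for the whole audience."""
    events.append((event_id, audience_check, payload))

def inbox(user_id):
    """Materialize this user's unread notifications on demand."""
    return [payload for eid, check, payload in events
            if check(user_id) and (eid, user_id) not in seen]

def mark_seen(event_id, user_id):
    """Per-user read state lives in its own small table."""
    seen.add((event_id, user_id))

followers = {101, 102, 103}   # hypothetical audience of the mention
publish("evt_1", followers.__contains__,
        "@celebrity mentioned @popular_person")
```

Write cost is constant regardless of audience size; the read path pays a small per-user cost only for users who actually open their notifications.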
Deduplication
Users hate duplicate notifications. But with distributed systems, duplicates happen:
sequenceDiagram
participant Worker1
participant Worker2
participant Database
Note over Worker1,Worker2: Both workers process same event<br/>(network retry caused duplicate)
Worker1->>Database: Create notification for user 123
Worker2->>Database: Create notification for user 123
Note over Database: Now user has 2 identical notifications!
Solution: Idempotency Keys
graph LR
A[Mention Event ID:<br/>evt_abc123] --> B[Generate idempotency key:<br/>hash event_id + user_id]
B --> C[Try to INSERT with unique key]
C --> D{Already exists?}
D --> |Yes| E[Skip - already processed]
D --> |No| F[Insert notification]
style E fill:#ffffcc
style F fill:#90EE90
The database enforces uniqueness on the idempotency key, so duplicate processing is automatically prevented.
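In code, the key derivation and the duplicate check look roughly like this; a set stands in for the database's unique index:

```python
import hashlib

processed_keys = set()   # stand-in for a UNIQUE index on idempotency_key

def idempotency_key(event_id, user_id):
    """Deterministic key: the same event for the same user hashes to the
    same value no matter how many times it is delivered."""
    return hashlib.sha256(f"{event_id}:{user_id}".encode()).hexdigest()

def create_notification(event_id, user_id):
    key = idempotency_key(event_id, user_id)
    if key in processed_keys:
        return False          # duplicate delivery: skip
    processed_keys.add(key)   # in production: INSERT with the unique key
    return True

first = create_notification("evt_abc123", "123")
retry = create_notification("evt_abc123", "123")   # network-retry duplicate
```

In a real system the `if key in ...` check and the insert happen atomically via the unique constraint, so two workers racing on the same event cannot both succeed.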
Solution 4: The Cross-Platform Challenge
Here's where it gets really complex. Mentions need to work everywhere:
graph TD
A[Mentions Must Work In:] --> B[Posts]
A --> C[Comments]
A --> D[Replies to comments]
A --> E[Direct Messages]
A --> F[Group Chats]
A --> G[Live Streams]
A --> H[Stories]
A --> I[Video Descriptions]
A --> J[Photo Captions]
B --> K[Each has different:<br/>- Privacy rules<br/>- Context<br/>- Notification style]
C --> K
D --> K
E --> K
F --> K
The Context Problem
The same mention has different meanings in different contexts:
graph TD
A[@john mentioned in...] --> B{Context}
B --> C[Public Post]
C --> D[Anyone can mention anyone]
B --> E[Private Group]
E --> F[Only group members can mention each other]
B --> G[Direct Message]
G --> H[Must have DM permission]
B --> I[Company Workspace]
I --> J[Only employees can mention employees]
B --> K[Comment Thread]
K --> L[Inherit permissions from parent post]
style D fill:#90EE90
style F fill:#ffffcc
style H fill:#ffffcc
style J fill:#ffffcc
style L fill:#ffd699
The Unified Architecture
Instead of building mention logic separately for each feature, you need a unified system:
graph TB
subgraph "Content Sources"
A[Posts API]
B[Comments API]
C[Messages API]
D[Stories API]
end
subgraph "Mention Service - Core"
E[Mention Parser]
F[Context Analyzer]
G[Privacy Engine]
H[Notification Engine]
end
subgraph "Support Services"
I[User Search Service]
J[Permission Service]
K[Notification Service]
end
A --> E
B --> E
C --> E
D --> E
E --> F
F --> G
G --> H
E --> I
G --> J
H --> K
The Interface Contract:
Every content type that wants to support mentions must provide:
Content context (where is this? public/private? group ID?)
Author ID (who is mentioning?)
Mentioned user IDs (who got mentioned?)
Permission callback (how to check if mention is allowed in this context?)
This way, the core mention logic stays the same, but context-specific rules are pluggable.
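The contract can be sketched as a small dataclass plus one shared processing function; every name here is illustrative, and the permission callback is the pluggable part:

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class MentionContext:
    """The contract each content type supplies to the core mention service."""
    content_type: str
    author_id: int
    mentioned_ids: List[int]
    permission_check: Callable[[int], bool]   # pluggable per content type

def process_mentions(ctx: MentionContext) -> List[int]:
    """Core logic is shared; only the permission callback differs
    between posts, comments, DMs, and so on."""
    return [uid for uid in ctx.mentioned_ids if ctx.permission_check(uid)]

# A private-group post: only members may be mentioned.
group_members = {1, 2}
post_ctx = MentionContext("group_post", author_id=1,
                          mentioned_ids=[2, 3],
                          permission_check=group_members.__contains__)
allowed = process_mentions(post_ctx)   # user 3 is filtered out
```

A DM would pass a different callback (DM permission), a public post an always-true one; the core never changes.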
Cross-Platform Notification Rendering
The same mention notification looks different depending on where you see it:
graph TD
A[Notification:<br/>Alice mentioned you] --> B{Platform}
B --> C[Mobile App]
C --> D[Push: Alice mentioned you in Gaming Group]
B --> E[Email]
E --> F[Subject: Alice mentioned you<br/>Body: Full context + reply button]
B --> G[Web]
G --> H[Bell icon counter<br/>+ dropdown with preview]
B --> I[Desktop App]
I --> J[Native notification banner]
B --> K[SMS - VIP only]
K --> L[Alice mentioned you: link.com/p/123]
The Challenge: Each platform has:
Different character limits
Different rich content support
Different interaction patterns
Different notification APIs
The Solution: Template system with platform-specific renderers
graph LR
A[Base Event Data] --> B[Template Engine]
B --> C[Platform Adapter]
C --> D[iOS Notification]
C --> E[Android Notification]
C --> F[Email Template]
C --> G[Web Notification]
style B fill:#ffffcc
Month 3: The Scale Reality
Three months in, your "simple" mention feature has evolved into this:
The Final Architecture
graph TB
subgraph "Frontend Layer"
A[Web App]
B[iOS App]
C[Android App]
end
subgraph "API Gateway Layer"
D[Load Balancer]
E[Rate Limiter]
F[API Gateway]
end
subgraph "Application Services"
G[Mention Parser Service]
H[Search Service]
I[Privacy Service]
J[Notification Service]
end
subgraph "Data Services"
K[Elasticsearch Cluster]
L[Redis Cache Cluster]
M[Kafka Event Stream]
end
subgraph "Worker Layer"
N[Search Index Workers]
O[Notification Workers]
P[Privacy Check Workers]
Q[Delivery Workers]
end
subgraph "Storage Layer"
R[(User Database)]
S[(Notification Database)]
T[(Analytics Database)]
end
subgraph "External Services"
U[Push Notification<br/>FCM/APNS]
V[Email Service<br/>SendGrid]
W[SMS Service<br/>Twilio]
end
A --> D
B --> D
C --> D
D --> E
E --> F
F --> G
F --> H
F --> I
F --> J
H --> K
H --> L
G --> M
I --> L
M --> N
M --> O
M --> P
O --> Q
Q --> U
Q --> V
Q --> W
N --> K
O --> S
P --> R
style M fill:#ffffcc
style L fill:#ccffff
style K fill:#ffffcc
The Numbers
After three months of optimization:
| Metric | Value |
|---|---|
| Mentions per day | 50 million |
| Peak mentions per second | 15,000 |
| Autocomplete searches per second | 100,000 |
| Average autocomplete latency | 35ms |
| Notification delivery time (p95) | 800ms |
| Privacy checks per second | 200,000 |
| Elasticsearch cluster size | 30 nodes |
| Redis cache size | 500GB |
| Kafka daily events | 500 million |
| Infrastructure cost | $45,000/month |
What You've Built
12 distinct systems:
Real-time text parser
Autocomplete search engine
Multi-tier caching layer
Privacy rule engine
Context analyzer
Event streaming pipeline
Notification creation system
Multi-channel delivery system
Deduplication system
Cross-platform rendering
Analytics and monitoring
Abuse prevention system
Your team:
4 backend engineers
1 infrastructure engineer
1 data engineer
1 on-call rotation
All for a feature that "looks simple" when you use it on Facebook.
The Hidden Challenges Nobody Talks About
Challenge 1: Mention Bombing
Attackers discover they can spam-mention popular accounts:
graph TD
A[Attacker creates 1000 bots] --> B[Each bot posts:<br/>@celebrity check this out!]
B --> C[Celebrity gets 1000 notifications]
C --> D[Notifications become useless]
style D fill:#ffcccc
Solution: Rate limiting + Pattern detection
graph LR
A[Mention created] --> B{Rate limit check}
B --> C{From same user?<br/>Max 10 mentions/hour to same person}
C --> D{From similar accounts?<br/>Detect bot patterns}
D --> E{Content similar?<br/>Detect spam templates}
E --> F{Allow or block}
style F fill:#ffffcc
Challenge 2: Ghost Mentions
User deletes their post, but mentioned users already got notified:
sequenceDiagram
participant Alice
participant System
participant Bob
Alice->>System: Post with @bob
System->>Bob: Notification sent
Note over Bob: Gets notification<br/>Click to view
Alice->>System: Delete post
Bob->>System: Click notification
System->>Bob: 404 - Post not found
Note over Bob: Confused and frustrated
Solution: Cascade deletion
When content is deleted:
Mark notification as "content_deleted"
Show user: "This post was deleted"
Don't just 404
Challenge 3: Edit Wars
User edits their post to add/remove mentions:
graph TD
A[Original post: Check out @john's work] --> B[John gets notification]
C[Edit 1: Check out @sarah's work] --> D[Remove john's mention<br/>Add sarah's mention]
D --> E{What to do with John's notification?}
E --> F[Option 1: Keep it - inconsistent]
E --> G[Option 2: Delete it - confusing]
E --> H[Option 3: Update it - complex]
style F fill:#ffcccc
style G fill:#ffcccc
style H fill:#ffffcc
Most platforms choose: Once notified, notification stays. Edits don't trigger new notifications.
Challenge 4: International Names
Your regex breaks on:
Names with accents: José, François, Björk
Non-Latin scripts: 张伟, محمد, Владимир
Special characters: O'Brien, Smith-Jones
Single names: Madonna, Cher
Very long names: Adolph Blaine Charles David Earl Frederick Gerald Hubert Irvin John Kenneth Lloyd Martin Nero Oliver Paul Quincy Randolph Sherman Thomas Uncas Victor William Xerxes Yancy Zeus Wolfeschlegelsteinhausenbergerdorff
graph TD
A[Username Detection] --> B{Character set}
B --> C[ASCII only]
C --> D[Breaks for 70% of world]
B --> E[Unicode support]
E --> F[Handles all languages]
B --> G[Whitelist approach]
G --> H[Breaks on edge cases]
style D fill:#ffcccc
style F fill:#90EE90
style H fill:#ffcccc
Solution: Unicode-aware regex + normalization
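A sketch of what that looks like in practice. In Python 3, `\w` is already Unicode-aware, so it covers José, 张伟, and Владимир; the pattern below additionally allows internal apostrophes and hyphens, and normalizes to NFC so composed and decomposed accent forms compare equal:

```python
import re
import unicodedata

# \w matches Unicode word characters in Python 3; the optional group
# permits internal apostrophes and hyphens (O'Brien, Smith-Jones).
MENTION_RE = re.compile(r"@(\w+(?:['\-]\w+)*)")

def extract_mentions(text):
    """Return NFC-normalized handles so 'é' typed as one codepoint and
    as 'e' + combining accent produce the same string."""
    text = unicodedata.normalize("NFC", text)
    return MENTION_RE.findall(text)

handles = extract_mentions("cc @José, @张伟 and @O'Brien")
```

An ASCII-only `[a-zA-Z0-9_]` pattern would silently drop two of those three handles; this version keeps all of them.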
Challenge 5: The Database That Never Stops Growing
Every mention creates a notification record. Those records pile up:
graph LR
A[Day 1:<br/>10,000 mentions] --> B[1 year:<br/>3.6 million mentions]
B --> C[5 years:<br/>18 million mentions]
C --> D[Database size:<br/>500GB]
D --> E[Query performance:<br/>Degrading]
style E fill:#ffcccc
Solution: Data lifecycle management
graph TD
A[New notifications] --> B[Hot storage<br/>PostgreSQL]
B --> C{Age > 30 days?}
C --> |Yes| D[Move to warm storage<br/>Compressed tables]
D --> E{Age > 1 year?}
E --> |Yes| F[Move to cold storage<br/>S3/Archive]
F --> G{Age > 5 years?}
G --> |Yes| H[Delete]
style B fill:#90EE90
style D fill:#ffffcc
style F fill:#ffd699
The Uncomfortable Truth
Here's what keeps engineering teams up at night:
Most mention notifications are never read.
graph TD
A[100 million mentions sent daily] --> B[60 million opened]
B --> C[30 million clicked through]
C --> D[10 million actually engaged]
E[40% ignored completely]
F[50% opened but not clicked]
G[66% didn't lead to engagement]
style E fill:#ffcccc
style F fill:#ffffcc
style G fill:#ffd699You've built a massively complex system, and most of its output is ignored.
So why build it?
Because those 10 million engagements per day are what make your platform social. They're the connections that keep users coming back.
The feature isn't valuable because of scale. It's valuable because of impact on the 10% who actually engage.
What We Learned
Lesson 1: Simple Features Don't Exist at Scale
What looks like:
Parse text
Send notification
Actually is:
Real-time search across billions of users
Distributed privacy engine
Multi-channel notification delivery
Cross-platform synchronization
Abuse prevention
Data lifecycle management
Lesson 2: Privacy is the Hardest Part
Not the technical implementation—the business logic.
Every company has different rules about:
Who can mention whom
In what contexts
With what restrictions
With what exceptions
Your privacy engine will have hundreds of edge cases.
Lesson 3: The First 90% Takes 10% of the Time
graph LR
A[Week 1:<br/>Basic mentions working] --> B[Month 1:<br/>Adding autocomplete]
B --> C[Month 2:<br/>Adding privacy]
C --> D[Month 3:<br/>Handling scale]
D --> E[Month 6:<br/>Still fixing edge cases]
style A fill:#90EE90
style E fill:#ffccccThe demo works in a week. Production-ready takes six months.
Lesson 4: Async Everything
The only way to scale is to decouple:
User action from processing
Processing from delivery
Delivery from confirmation
graph TD
A[User action] -.instant response.-> B[User happy]
A --> C[Queue event]
C -.async.-> D[Process later]
D -.async.-> E[Deliver eventually]
style B fill:#90EE90Users don't need instant notifications. They need instant feedback that their action worked.
Lesson 5: Monitor Everything
You can't debug what you can't see:
graph TD
A[Monitoring Needed]
A --> B[Autocomplete latency<br/>p50, p95, p99]
A --> C[Search index lag<br/>How stale is the index?]
A --> D[Privacy check failures<br/>How many blocks?]
A --> E[Notification delivery rate<br/>% delivered successfully]
A --> F[Queue depth<br/>Are we falling behind?]
A --> G[Error rates by type<br/>What's failing?]
A --> H[Cost per mention<br/>Infrastructure efficiency]
The Modern Approach: Learning from the Giants
Here's how the big platforms actually do it:
Facebook/Meta's Approach
graph TB
A[Mentions] --> B[TAO Graph Database]
B --> C[Real-time graph queries]
C --> D[Privacy via graph edges]
E[Advantages:]
E --> F[Fast relationship checks]
E --> G[Natural privacy model]
E --> H[Handles 1B+ users]
They model everything as a graph. A mention is just an edge. Privacy is checking if an edge exists.
Twitter's Approach
graph TB
A[Mentions] --> B[Fanout on write]
B --> C[Pre-compute timelines]
D[Advantages:]
D --> E[Fast reads]
D --> F[Handles viral content]
G[Disadvantages:]
G --> H[Expensive writes]
G --> I[Celebrity problem]
They pre-compute who sees what. Faster reads, but writes are expensive.
LinkedIn's Approach
graph TB
A[Mentions] --> B[Hybrid: Fanout for small audiences]
B --> C[Lazy load for large audiences]
D[Advantages:]
D --> E[Best of both worlds]
F[Disadvantages:]
F --> G[Complex implementation]
Different strategies for different content types.
If You're Building This Today
Start with the minimum:
Phase 1: MVP (Week 1-2)
graph LR
A[Basic mention parsing] --> B[Simple notification]
B --> C[Ship it]
style C fill:#90EE90
Phase 2: Usability (Month 1)
graph LR
A[Add autocomplete] --> B[Add caching]
B --> C[Optimize search]
Phase 3: Privacy (Month 2)
graph LR
A[Add privacy rules] --> B[Add context checking]
B --> C[Add block support]
Phase 4: Scale (Month 3+)
graph LR
A[Add async processing] --> B[Add proper queues]
B --> C[Add monitoring]
C --> D[Add abuse prevention]
Don't build everything at once. Build what you need, when you need it.
The Future: AI-Powered Mentions
The next evolution is already here:
Smart Mention Suggestions
graph TD
A[User types: Check out this design] --> B[AI analyzes content]
B --> C[Suggest relevant people]
C --> D[Sarah - designer on your team]
C --> E[Mike - requested this feature]
C --> F[Lisa - stakeholder]
style D fill:#90EE90
style E fill:#90EE90
style F fill:#90EE90
AI predicts who you SHOULD mention based on:
Content topic
Project involvement
Past interactions
Current context
Intent-Based Notifications
graph TD
A[Mention created] --> B[AI predicts importance]
B --> C[High importance<br/>Urgent work mention]
C --> D[Deliver immediately<br/>Push notification]
B --> E[Medium importance<br/>FYI mention]
E --> F[Deliver soon<br/>Batched email]
B --> G[Low importance<br/>Social mention]
G --> H[Deliver later<br/>Daily digest]
Not all mentions are equal. AI learns what matters to each user.
Final Thoughts
When you see @mentions on Facebook, Instagram, Twitter, or LinkedIn, you're seeing the tip of an iceberg.
Below the surface:
Dozens of microservices
Millions of lines of code
Terabytes of cache
Billions of events per day
Hundreds of engineers' years of work
All to make something feel "simple."
That's the magic of good system design. Complexity that disappears into simplicity.
The next time your CEO says "let's add mentions, it's simple," show them this blog post.
Then buckle up. You've got three interesting months ahead.