You're scrolling through Facebook when you see it—someone types "@" and instantly gets a dropdown of friends to mention. Click a name, they get notified. Simple, elegant, instant.

Your company's CEO sees the same thing. "We need this," she says. "Our users should be able to mention each other too."

You think: "Yeah, this looks straightforward. Just detect @username and send a notification, right?"

But here's what you don't see when you use Facebook's mentions:

  • The real-time search index scanning 3 billion users in under 50ms

  • The privacy engine checking 47 different permission rules before showing a name

  • The notification system handling 500 million mentions per day

  • The spam filter blocking 50 million fake mention attempts

  • The distributed architecture spanning 15 data centers

What looks simple on the surface is actually one of the most complex features in modern social platforms.

Let me walk you through building this feature—step by step—so you understand why that "simple" @mention costs Facebook millions of dollars a year to run.

Week 1: The Naive Approach

You start coding. The logic seems obvious:

What You Think It Takes

graph TD
    A[User types @john] --> B[Find 'john' in database]
    B --> C[Show john's profile link]
    C --> D[Send notification to john]
    D --> E[Done! Ship it!]
    
    style E fill:#90EE90

Your Mental Model:

  • Step 1: Parse the text for anything starting with @

  • Step 2: Look up that username in the users table

  • Step 3: Create a clickable link

  • Step 4: Insert a notification: "Alice mentioned you"

You estimate: 2 days of work.

The First Implementation

You build a simple flow:

sequenceDiagram
    participant User
    participant Frontend
    participant API
    participant Database
    
    User->>Frontend: Types: "Hey @john, check this out"
    Frontend->>API: POST /create-post
    API->>API: Scan text for @mentions
    API->>Database: SELECT * FROM users WHERE username = 'john'
    Database-->>API: User found: john (id: 12345)
    API->>Database: INSERT INTO notifications
    API->>Database: INSERT INTO posts (with mention link)
    API-->>Frontend: Success!
    Frontend-->>User: Post published
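
Here's roughly what that first version looks like in code. This is a sketch, not anyone's production code: plain SQLite, a regex for detection, and every table and column name is made up.

import re
import sqlite3

MENTION_RE = re.compile(r"@(\w+)")  # naive: word characters only

def create_post(author_id: int, text: str, db: sqlite3.Connection) -> None:
    cur = db.cursor()
    cur.execute("INSERT INTO posts (author_id, body) VALUES (?, ?)", (author_id, text))
    post_id = cur.lastrowid

    # Scan the text for @mentions and notify each user we can find
    for username in set(MENTION_RE.findall(text)):
        row = cur.execute(
            "SELECT id FROM users WHERE username = ?", (username,)
        ).fetchone()
        if row:
            cur.execute(
                "INSERT INTO notifications (user_id, kind, post_id) VALUES (?, 'mention', ?)",
                (row[0], post_id),
            )
    db.commit()

Two days of work, give or take.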

You deploy it on Friday afternoon. It works! Users love it.

Then Monday happens.

Monday Morning: Reality Hits

Your phone explodes with alerts at 6 AM.

The Problems:

Problem 1: The Autocomplete Problem

Users are complaining: "Why do I have to type the exact username?"

On Facebook, when you type "@", you see suggestions. When you type "@jo", you see everyone whose name starts with "jo". It's instant. It's intuitive.

Your system makes users type the complete, exact username. If someone's username is "john_smith_1990", good luck remembering that.

graph TD
    A[What Users Expect] --> B[Type: @jo]
    B --> C[Instant dropdown with:]
    C --> D[John Smith]
    C --> E[Joanna Lee]
    C --> F[Joseph Wang]
    
    G[What You Built] --> H[Type: @john_smith_1990]
    H --> I[Hope you remember the exact username]
    
    style A fill:#90EE90
    style G fill:#ffcccc

Why This Is Hard:

You need to search users in real-time as they type. Let's break down what "real-time" actually means:

  • User types "@j" → Show results in under 100ms

  • User types "@jo" → Update results in under 100ms

  • User types "@joh" → Update results in under 100ms

Each keystroke triggers a search. If 10,000 users are typing mentions simultaneously, that's potentially 10,000 database queries per second.

Your current database query:

SELECT * FROM users WHERE username LIKE 'jo%'

This full table scan on 10 million users takes 3 seconds. Completely unusable.

Problem 2: The Database Is Melting

Your mentions feature launched. Users love it. They're mentioning each other everywhere—in posts, comments, replies.

Your database CPU: 95%. Response time: 4 seconds. Support tickets: 47.

graph TD
    A[10,000 concurrent users] --> B[All typing mentions]
    B --> C[Each keystroke = DB query]
    C --> D[10,000 queries per second]
    D --> E[Database: On fire 🔥]
    
    E --> F[Everything slows down]
    F --> G[Users can't load pages]
    G --> H[CEO asks: What happened?]
    
    style E fill:#ff6666
    style H fill:#ff6666

What's Happening:

Every "@" keystroke hits your main database. Your users table has:

  • 10 million users

  • No proper indexes for prefix search

  • No caching layer

  • No query optimization

You're doing a full table scan on every keystroke. This doesn't scale beyond 100 concurrent users.

Problem 3: You're Mentioning People Who Shouldn't Be Mentioned

Bug report: "I just got mentioned in a private group I'm not part of."

Another: "Someone mentioned me in a comment, but I blocked them."

Another: "I got mentioned in an internal company post, but I'm a customer, not an employee."

graph TD
    A[Alice mentions @bob] --> B[In a private group]
    C[Bob is NOT in that group] --> D[Bob gets notification anyway]
    
    E[Charlie mentions @david] --> F[But David blocked Charlie]
    G[David gets notification anyway] --> H[Privacy violation]
    
    style D fill:#ffcccc
    style H fill:#ffcccc

The Privacy Nightmare:

You built a feature without considering:

  • Privacy settings (who can mention whom?)

  • Block lists (blocked users shouldn't get notifications)

  • Group membership (can't mention people outside the group)

  • Account visibility (private accounts, deactivated accounts)

  • Role-based access (employees vs customers vs partners)

This isn't just a bug. This is a privacy violation. In some jurisdictions, this could be illegal.

Week 2: Rebuilding the Foundation

You realize this needs a proper architecture. Let's solve these problems one by one.

Solution 1: Real-Time Autocomplete Architecture

You need to search millions of users in milliseconds. Here's how.

The New Search Architecture

graph TB
    subgraph "User's Browser"
        A[User types @jo]
    end
    
    subgraph "API Layer"
        B[API Gateway]
        C[Rate Limiter]
    end
    
    subgraph "Search Layer"
        D[Elasticsearch Cluster]
        E[Search Index]
        F[Fuzzy Matching]
    end
    
    subgraph "Cache Layer"
        G[Redis Cache]
        H[Recent Searches]
    end
    
    subgraph "Data Layer"
        I[(User Database)]
    end
    
    A --> B
    B --> C
    C --> G
    G --> |Cache miss| D
    D --> E
    E --> F
    F --> |Index rebuild| I
    
    style D fill:#ffffcc
    style G fill:#ccffff

How It Works:

Step 1: Elasticsearch for Fast Search

Instead of querying your main database, you build a dedicated search index:

graph LR
    A[User Database] --> B[Change Data Capture]
    B --> C[Kafka Stream]
    C --> D[Elasticsearch Indexer]
    D --> E[Search Index]
    
    F[Search Query: @jo] --> E
    E --> G[Results in 20ms]
    
    style E fill:#90EE90
    style G fill:#90EE90

What Gets Indexed:

  • Username

  • Display name

  • Email (for finding coworkers)

  • Alternate names/nicknames

  • Search keywords

Why Elasticsearch?

  • Built for full-text search

  • Handles prefix matching efficiently

  • Supports fuzzy matching (handles typos)

  • Can scale to billions of documents

  • Returns results in milliseconds
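
A prefix query against that index is a few lines with the official Python client. A sketch assuming the 8.x elasticsearch package; the cluster address, index name, and field names are all illustrative:

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # cluster address is an assumption

def autocomplete(prefix: str, limit: int = 10) -> list[dict]:
    # Prefix match across the indexed name fields
    resp = es.search(
        index="users",
        size=limit,
        query={
            "multi_match": {
                "query": prefix,
                "type": "bool_prefix",
                "fields": ["username", "display_name", "nicknames"],
            }
        },
    )
    return [hit["_source"] for hit in resp["hits"]["hits"]]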

Step 2: Multi-Tier Caching

Even Elasticsearch shouldn't have to absorb 100,000 queries per second for the same handful of popular search terms.

graph TD
    A[User types @john] --> B{Check Browser Cache}
    B --> |Hit| C[Return cached results - 0ms]
    B --> |Miss| D{Check Redis}
    D --> |Hit| E[Return cached results - 2ms]
    D --> |Miss| F{Check Elasticsearch}
    F --> G[Return fresh results - 20ms]
    G --> H[Update caches]
    
    style C fill:#90EE90
    style E fill:#ffffcc
    style G fill:#ffd699

Three Cache Layers:

  1. Browser Cache (0ms latency)

    • Stores recent mentions for 5 minutes

    • Perfect for when users repeatedly mention the same people

  2. Redis Cache (2ms latency)

    • Stores popular search results

    • Key: search term, Value: list of matching users

    • TTL: 30 seconds

  3. Elasticsearch (20ms latency)

    • Full search capability

    • Updated in real-time via Kafka
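
On the server, the Redis and Elasticsearch tiers combine into a classic cache-aside lookup. A sketch that reuses the autocomplete() function from the previous snippet; the key format and TTL are illustrative:

import json
import redis

r = redis.Redis()  # localhost by default

def cached_autocomplete(prefix: str) -> list[dict]:
    key = f"mention_search:{prefix.lower()}"
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)            # ~2ms path (Redis hit)

    results = autocomplete(prefix)           # ~20ms path (Elasticsearch)
    r.setex(key, 30, json.dumps(results))    # TTL: 30 seconds
    return results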

Step 3: Smart Ranking

When you type "@jo", there might be 10,000 matching users. Which ones do you show first?

graph TD
    A[Search: @jo] --> B[10,000 matches found]
    
    B --> C[Ranking Algorithm]
    
    C --> D[Score: Frequency<br/>How often do you mention them?]
    C --> E[Score: Recency<br/>When did you last mention them?]
    C --> F[Score: Relationship<br/>Are they your friend/follower?]
    C --> G[Score: Activity<br/>Are they active users?]
    C --> H[Score: Context<br/>Are they in this group/thread?]
    
    D --> I[Weighted Score]
    E --> I
    F --> I
    G --> I
    H --> I
    
    I --> J[Top 10 Results]
    
    style J fill:#90EE90

Ranking Formula:

Each user gets a score:

Score = (Frequency × 3) + (Recency × 5) + (Relationship × 2) + (Activity × 1) + (Context × 10)

Why Context Weighs Most:

  • If you're in a group chat, members of that group appear first

  • If you're commenting on Alice's post, Alice appears first

  • If you're in a work channel, coworkers appear first

This is why Facebook's mentions feel "smart"—they're predicting who you want to mention.
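
In code, the ranking itself is nothing exotic. A sketch, assuming each candidate carries pre-computed signals already normalized to a 0–1 range:

def mention_score(frequency: float, recency: float, relationship: float,
                  activity: float, context: float) -> float:
    # Score = (Frequency × 3) + (Recency × 5) + (Relationship × 2)
    #       + (Activity × 1) + (Context × 10)
    return frequency * 3 + recency * 5 + relationship * 2 + activity * 1 + context * 10

def rank(candidates: list[dict], limit: int = 10) -> list[dict]:
    # candidates look like {"id": 123, "signals": {"frequency": 0.4, ...}}
    return sorted(candidates,
                  key=lambda c: mention_score(**c["signals"]),
                  reverse=True)[:limit]

The hard part isn't this function; it's computing and caching those five signals cheaply enough to run it on every keystroke.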

The Performance Numbers

graph LR
    A[Before: Database Query] --> B[3000ms per search]
    C[After: Elasticsearch + Cache] --> D[20ms per search]
    
    E[150x faster]
    
    style B fill:#ffcccc
    style D fill:#90EE90
    style E fill:#90EE90

But now you have a new problem: keeping the search index in sync with your database.

The Synchronization Challenge

sequenceDiagram
    participant User
    participant Database
    participant CDC
    participant Kafka
    participant Elasticsearch
    
    User->>Database: Update profile name
    Database->>Database: Write to users table
    Database->>CDC: Detect change
    CDC->>Kafka: Publish user.updated event
    Kafka->>Elasticsearch: Update search index
    
    Note over Database,Elasticsearch: Propagation delay: 100-500ms
    
    User->>Elasticsearch: Search for new name
    Elasticsearch-->>User: Old name (not synced yet!)

The Problem: There's a delay between updating your profile and that change appearing in search results. This is called "eventual consistency."

The Trade-off: You could make the update synchronous (wait for Elasticsearch to update before confirming), but then profile updates take 10x longer.

Most platforms choose: Fast updates + eventual consistency > Slow updates + immediate consistency

Users don't notice a 500ms delay in search results. They definitely notice waiting 5 seconds to update their profile.

Solution 2: The Privacy Engine

Now let's tackle the privacy nightmare. Who can mention whom?

The Privacy Rules Matrix

Before showing someone in autocomplete or sending them a notification, you need to check dozens of rules:

graph TD
    A[Can Alice mention Bob?] --> B{Is Bob's account active?}
    B --> |No| C[❌ Don't show Bob]
    B --> |Yes| D{Has Bob blocked Alice?}
    D --> |Yes| C
    D --> |No| E{Has Alice blocked Bob?}
    E --> |Yes| C
    E --> |No| F{Privacy Setting: Who can mention me?}
    
    F --> G{Everyone}
    F --> H{Friends Only}
    F --> I{Nobody}
    
    G --> J{Context Check}
    H --> K{Are they friends?}
    I --> C
    
    K --> |No| C
    K --> |Yes| J
    
    J --> L{Is this a private group?}
    L --> |Yes| M{Is Bob a member?}
    L --> |No| N[✅ Show Bob]
    
    M --> |No| C
    M --> |Yes| N
    
    style C fill:#ffcccc
    style N fill:#90EE90

The Rule Categories:

  1. Account Status Rules

    • Is the account active?

    • Is it deactivated?

    • Is it banned/suspended?

    • Is it a deleted account?

  2. Block Rules

    • Has either user blocked the other?

    • Are they on each other's restricted lists?

  3. Privacy Settings

    • "Who can mention me?" (Everyone / Friends / Nobody)

    • "Who can see my posts?" (affects visibility)

    • "Who can look me up?" (affects search)

  4. Context Rules

    • Private group: Must be a member

    • Company workspace: Must be an employee

    • Direct message: Must have permission to DM

    • Age-restricted content: Age verification required

  5. Relationship Rules

    • Are they friends/followers?

    • Do they follow each other?

    • Have they ever interacted?

The Architecture

You can't run all of these checks against the database on every keystroke; that would be far too slow. The checks have to be cheap.

graph TB
    subgraph "Real-time Path"
        A[User types @jo] --> B[Query Elasticsearch]
        B --> C[Get 100 candidate users]
    end
    
    subgraph "Privacy Filter"
        C --> D[Load privacy rules from cache]
        D --> E[Filter candidates in parallel]
        E --> F[Apply account status rules]
        E --> G[Apply block rules]
        E --> H[Apply privacy settings]
        E --> I[Apply context rules]
        
        F --> J[Filtered results]
        G --> J
        H --> J
        I --> J
    end
    
    subgraph "Result"
        J --> K[Return 10 valid users]
    end
    
    style K fill:#90EE90

How to Make This Fast:

Strategy 1: Pre-compute What You Can

graph LR
    A[User updates privacy settings] --> B[Background Job]
    B --> C[Update privacy cache]
    C --> D[Redis: user:123:privacy]
    
    E[Mention search happens] --> F[Read from cache]
    F --> D
    D --> G[Apply rules in memory]
    
    style G fill:#90EE90

Store privacy settings in Redis with keys like:

  • user:123:mention_privacy → "friends_only"

  • user:123:blocked_users → [456, 789, 1011]

  • group:999:members → [123, 456, 789]

Strategy 2: Fail Fast

Check the simplest rules first:

graph TD
    A[100 candidates] --> B[Check account status]
    B --> C[20 accounts inactive ❌]
    
    D[80 remaining] --> E[Check blocks]
    E --> F[5 blocked users ❌]
    
    G[75 remaining] --> H[Check privacy settings]
    H --> I[30 have mentions disabled ❌]
    
    J[45 remaining] --> K[Check context rules]
    K --> L[35 not in this group ❌]
    
    M[10 remaining] --> N[✅ Show these users]
    
    style C fill:#ffcccc
    style F fill:#ffcccc
    style I fill:#ffcccc
    style L fill:#ffcccc
    style N fill:#90EE90

By checking cheap rules first (cached data), you eliminate most candidates before running expensive checks (database queries).
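
Put together, the filter is a series of cheap in-memory lookups, cheapest first. A sketch; the sets and dicts are assumed to come from the Redis caches described above, and every name is illustrative:

def filter_candidates(author_id: int, candidates: list[dict],
                      blocked_pairs: set[tuple[int, int]],
                      mention_privacy: dict[int, str],
                      author_friends: set[int],
                      group_members: set[int] | None) -> list[dict]:
    allowed = []
    for user in candidates:
        uid = user["id"]
        if user["status"] != "active":                  # account status rules
            continue
        if (author_id, uid) in blocked_pairs or (uid, author_id) in blocked_pairs:
            continue                                    # block rules
        privacy = mention_privacy.get(uid, "everyone")  # privacy settings
        if privacy == "nobody":
            continue
        if privacy == "friends_only" and uid not in author_friends:
            continue
        if group_members is not None and uid not in group_members:
            continue                                    # context rules
        allowed.append(user)
    return allowed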

Strategy 3: Batch Check Context Rules

Instead of checking "Is Bob in this group?" for each candidate individually:

graph LR
    A[10 candidate users] --> B[Single query:<br/>Get all group members]
    B --> C[Filter candidates<br/>against member list]
    
    D[Much faster than<br/>10 individual queries]
    
    style C fill:#90EE90

One query to get all group members, then filter in memory. Way faster than 10 separate queries.

The Caching Strategy

graph TB
    subgraph "Hot Cache - Redis"
        A[User privacy settings<br/>TTL: 5 minutes]
        B[Block lists<br/>TTL: 10 minutes]
        C[Group memberships<br/>TTL: 1 minute]
    end
    
    subgraph "Warm Cache - Application Memory"
        D[Recently checked permissions<br/>TTL: 30 seconds]
    end
    
    subgraph "Cold Storage - Database"
        E[Source of truth<br/>Always consistent]
    end
    
    F[Permission Check] --> D
    D --> |Miss| A
    A --> |Miss| E
    
    style D fill:#90EE90
    style A fill:#ffffcc
    style E fill:#ffd699

Cache Invalidation Strategy:

When someone blocks a user or changes privacy settings:

  1. Update database (source of truth)

  2. Invalidate cache immediately

  3. Next request fetches fresh data

This means there might be a 1-30 second window where stale data exists. For privacy, that's acceptable—the worst case is someone appears in autocomplete briefly but the notification won't send (because that checks fresh data).
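
The invalidation itself is a two-step write: database first, then drop the cached key so the next read repopulates it. A sketch; the key format matches the examples above and is illustrative:

def block_user(db, r, blocker_id: int, blocked_id: int) -> None:
    # 1. Update the source of truth
    db.execute("INSERT INTO blocks (blocker_id, blocked_id) VALUES (?, ?)",
               (blocker_id, blocked_id))
    db.commit()
    # 2. Invalidate the cached block list immediately
    r.delete(f"user:{blocker_id}:blocked_users")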

Solution 3: The Notification System

Now the hardest part: actually sending notifications when someone gets mentioned.

The Scale Problem

Let's say you're building Twitter-scale mentions:

  • 500 million mentions per day

  • That's 5,787 mentions per second

  • Peak hours: 15,000 mentions per second

You can't just INSERT INTO notifications for each mention. Your database would collapse.

The Architecture Evolution

Phase 1: Synchronous (Doesn't Scale)

sequenceDiagram
    participant User
    participant API
    participant Database
    
    User->>API: Post with @john @sarah @mike
    API->>Database: Create post
    API->>Database: INSERT notification for john
    API->>Database: INSERT notification for sarah
    API->>Database: INSERT notification for mike
    API-->>User: Success (took 300ms)

Problems:

  • Every mention = database write

  • User waits for all notifications to be created

  • What if one notification fails? Roll back the post?

  • Database can't handle 15,000 writes per second

Phase 2: Async Queue (Better)

sequenceDiagram
    participant User
    participant API
    participant Queue
    participant Worker
    participant Database
    
    User->>API: Post with @john @sarah @mike
    API->>Database: Create post
    API->>Queue: Publish mention event
    API-->>User: Success (took 50ms)
    
    Note over Queue: Events queued
    
    Worker->>Queue: Pull events
    Worker->>Database: Batch create notifications

Benefits:

  • User gets instant response

  • Failures don't affect user experience

  • Can batch notifications for efficiency

  • Can scale workers independently
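
With a queue in the middle, the API only has to publish an event and return. A sketch using the kafka-python client; the broker address, topic name, and event shape are assumptions (the full event structure is covered below):

import json
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

def publish_mention(mentioned_user_id: str, author_id: str, post_id: str) -> None:
    # Fire-and-forget from the API's point of view; workers consume later
    producer.send("mention.created", {
        "mentioned_user_id": mentioned_user_id,
        "mentioned_by_user_id": author_id,
        "content_type": "post",
        "content_id": post_id,
    })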

Phase 3: The Real Architecture

graph TB
    subgraph "Ingestion Layer"
        A[API Servers] --> B[Kafka Topic:<br/>mention.created]
    end
    
    subgraph "Processing Layer"
        B --> C[Validation Worker]
        C --> D[Privacy Check Worker]
        D --> E[Notification Worker]
    end
    
    subgraph "Delivery Layer"
        E --> F[Notification Service]
        F --> G[Push Notification]
        F --> H[Email]
        F --> I[In-App Bell]
        F --> J[SMS - for VIPs]
    end
    
    subgraph "Storage Layer"
        E --> K[(Notification DB)]
        K --> L[Read API]
    end
    
    style F fill:#ffffcc

The Event Flow

Step 1: Mention Detection

When a post is created, extract all mentions:

graph LR
    A[Text: Hey @john and @sarah,<br/>check out @mike's work] --> B[Regex Parser]
    B --> C[Found: john, sarah, mike]
    C --> D[Validate usernames exist]
    D --> E[Publish event for each valid mention]
    
    style E fill:#90EE90
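
The detection pass is a small function: extract candidate usernames, confirm they exist, and emit one event per valid mention. A sketch that reuses publish_mention() from earlier; user_store is an illustrative lookup helper, and the ASCII-only regex is exactly the kind that breaks on international names (more on that later):

import re

MENTION_RE = re.compile(r"@([A-Za-z0-9_]+)")

def detect_mentions(author_id: str, post_id: str, text: str, user_store) -> None:
    for username in set(MENTION_RE.findall(text)):    # {"john", "sarah", "mike"}
        user = user_store.lookup(username)             # validate the username exists
        if user is not None:
            publish_mention(user.id, author_id, post_id)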

Event Structure:

{
  "event_id": "evt_123",
  "type": "mention.created",
  "timestamp": "2024-11-20T10:30:00Z",
  "data": {
    "mentioned_user_id": "12345",
    "mentioned_by_user_id": "67890",
    "content_type": "post",
    "content_id": "post_999",
    "context": {
      "group_id": "grp_111",
      "thread_id": null
    }
  }
}

Step 2: Privacy Validation

Before sending notification, validate permissions:

graph TD
    A[Mention Event] --> B{Privacy Check}
    
    B --> C{User active?}
    C --> |No| D[❌ Drop event]
    C --> |Yes| E{Blocked?}
    E --> |Yes| D
    E --> |No| F{Has permission?}
    F --> |No| D
    F --> |Yes| G{Context valid?}
    G --> |No| D
    G --> |Yes| H[✅ Create notification]
    
    style D fill:#ffcccc
    style H fill:#90EE90

This is where 40% of mention events get filtered out.

Step 3: Notification Creation

sequenceDiagram
    participant Worker
    participant NotificationDB
    participant UserPrefsCache
    participant DeliveryQueue
    
    Worker->>UserPrefsCache: Get notification preferences
    UserPrefsCache-->>Worker: Email: ON, Push: ON, SMS: OFF
    
    Worker->>NotificationDB: Create notification record
    Worker->>DeliveryQueue: Queue email delivery
    Worker->>DeliveryQueue: Queue push notification
    
    Note over DeliveryQueue: Separate workers handle<br/>each delivery channel

Step 4: Delivery

Different channels have different priorities:

graph TD
    A[Notification Created] --> B{Delivery Channels}
    
    B --> C[In-App<br/>Immediate]
    B --> D[Push Notification<br/>Within 1 second]
    B --> E[Email<br/>Within 1 minute]
    B --> F[SMS<br/>Only for VIPs]
    
    C --> G[WebSocket to connected clients]
    D --> H[FCM/APNS]
    E --> I[Email queue - batched]
    F --> J[Twilio API]
    
    style C fill:#90EE90
    style D fill:#ffffcc
    style E fill:#ffd699
    style F fill:#ffcccc

Handling the Celebrity Problem

What happens when @elonmusk tweets "Thanks @everyone who attended"?

If a user with 100 million followers mentions someone popular, you might need to fan that notification activity out to millions of people.

graph TD
    A[@celebrity mentions @popular_person] --> B[100M followers need notification]
    
    B --> C{Naive Approach}
    C --> D[Create 100M notification records]
    D --> E[💥 Database explodes]
    
    B --> F{Smart Approach}
    F --> G[Create 1 notification template]
    G --> H[Lazy loading on read]
    H --> I[Generate notification on demand]
    
    style E fill:#ffcccc
    style I fill:#90EE90

The Solution: Lazy Materialization

Instead of creating 100M records:

  1. Store one "event" record

  2. When user opens their notifications, check: "Are there any events for me?"

  3. Generate notification UI on the fly

  4. Mark as "seen" per user in a separate table

This is why Twitter can handle viral tweets without melting their infrastructure.
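
A sketch of the read path: individually written notifications plus broadcast events that are stored once and resolved per user only when they open their notifications. The table names, the "followers of the author" audience rule, and the seen_events table are all illustrative:

def load_notifications(db, user_id: int, limit: int = 20) -> list[dict]:
    # 1. Individually materialized notifications for this user
    direct = db.execute(
        "SELECT id, payload FROM notifications WHERE user_id = ? "
        "ORDER BY created_at DESC LIMIT ?", (user_id, limit)).fetchall()

    # 2. Broadcast events: one row per event, never fanned out per recipient.
    #    The audience (here: followers of the author) is resolved at read time,
    #    and seen_events tracks which ones this user has already viewed.
    broadcast = db.execute(
        "SELECT e.id, e.payload FROM broadcast_events e "
        "JOIN follows f ON f.followed_id = e.author_id AND f.follower_id = ? "
        "LEFT JOIN seen_events s ON s.event_id = e.id AND s.user_id = ? "
        "WHERE s.event_id IS NULL ORDER BY e.created_at DESC LIMIT ?",
        (user_id, user_id, limit)).fetchall()

    # 3. Render both kinds into the same notification shape on the fly
    return [{"id": i, "payload": p} for i, p in direct + broadcast]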

Deduplication

Users hate duplicate notifications. But with distributed systems, duplicates happen:

sequenceDiagram
    participant Worker1
    participant Worker2
    participant Database
    
    Note over Worker1,Worker2: Both workers process same event<br/>(network retry caused duplicate)
    
    Worker1->>Database: Create notification for user 123
    Worker2->>Database: Create notification for user 123
    
    Note over Database: Now user has 2 identical notifications!

Solution: Idempotency Keys

graph LR
    A[Mention Event ID:<br/>evt_abc123] --> B[Generate idempotency key:<br/>hash event_id + user_id]
    B --> C[Try to INSERT with unique key]
    C --> D{Already exists?}
    D --> |Yes| E[Skip - already processed]
    D --> |No| F[Insert notification]
    
    style E fill:#ffffcc
    style F fill:#90EE90

The database enforces uniqueness on the idempotency key, so duplicate processing is automatically prevented.
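
With PostgreSQL, a unique index on the key plus ON CONFLICT DO NOTHING gives you this almost for free. A sketch using psycopg-style placeholders; the table layout is illustrative and assumes a unique constraint on idempotency_key:

import hashlib

def create_notification(db, event_id: str, user_id: str, payload: str) -> None:
    # Same event + same recipient always produces the same key
    idem_key = hashlib.sha256(f"{event_id}:{user_id}".encode()).hexdigest()
    db.execute(
        "INSERT INTO notifications (idempotency_key, user_id, payload) "
        "VALUES (%s, %s, %s) ON CONFLICT (idempotency_key) DO NOTHING",
        (idem_key, user_id, payload),
    )
    db.commit()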

Solution 4: The Cross-Platform Challenge

Here's where it gets really complex. Mentions need to work everywhere:

graph TD
    A[Mentions Must Work In:] --> B[Posts]
    A --> C[Comments]
    A --> D[Replies to comments]
    A --> E[Direct Messages]
    A --> F[Group Chats]
    A --> G[Live Streams]
    A --> H[Stories]
    A --> I[Video Descriptions]
    A --> J[Photo Captions]
    
    B --> K[Each has different:<br/>- Privacy rules<br/>- Context<br/>- Notification style]
    C --> K
    D --> K
    E --> K
    F --> K

The Context Problem

The same mention has different meanings in different contexts:

graph TD
    A[@john mentioned in...] --> B{Context}
    
    B --> C[Public Post]
    C --> D[Anyone can mention anyone]
    
    B --> E[Private Group]
    E --> F[Only group members can mention each other]
    
    B --> G[Direct Message]
    G --> H[Must have DM permission]
    
    B --> I[Company Workspace]
    I --> J[Only employees can mention employees]
    
    B --> K[Comment Thread]
    K --> L[Inherit permissions from parent post]
    
    style D fill:#90EE90
    style F fill:#ffffcc
    style H fill:#ffffcc
    style J fill:#ffffcc
    style L fill:#ffd699

The Unified Architecture

Instead of building mention logic separately for each feature, you need a unified system:

graph TB
    subgraph "Content Sources"
        A[Posts API]
        B[Comments API]
        C[Messages API]
        D[Stories API]
    end
    
    subgraph "Mention Service - Core"
        E[Mention Parser]
        F[Context Analyzer]
        G[Privacy Engine]
        H[Notification Engine]
    end
    
    subgraph "Support Services"
        I[User Search Service]
        J[Permission Service]
        K[Notification Service]
    end
    
    A --> E
    B --> E
    C --> E
    D --> E
    
    E --> F
    F --> G
    G --> H
    
    E --> I
    G --> J
    H --> K

The Interface Contract:

Every content type that wants to support mentions must provide:

  1. Content context (where is this? public/private? group ID?)

  2. Author ID (who is mentioning?)

  3. Mentioned user IDs (who got mentioned?)

  4. Permission callback (how to check if mention is allowed in this context?)

This way, the core mention logic stays the same, but context-specific rules are pluggable.
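
In practice the contract can be a small interface that each content service implements. A sketch using a Python Protocol; the attribute and method names are illustrative:

from typing import Optional, Protocol

class MentionContext(Protocol):
    content_type: str           # "post", "comment", "message", "story", ...
    content_id: str
    author_id: str              # who is doing the mentioning
    group_id: Optional[str]     # the surrounding group/workspace, if any

    def can_mention(self, author_id: str, target_id: str) -> bool:
        """Context-specific permission check: group membership, DM rules, etc."""
        ...

The mention service parses the text, then calls can_mention() for each candidate, so posts, comments, and messages can each apply their own rules without duplicating the core logic.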

Cross-Platform Notification Rendering

The same mention notification looks different depending on where you see it:

graph TD
    A[Notification:<br/>Alice mentioned you] --> B{Platform}
    
    B --> C[Mobile App]
    C --> D[Push: Alice mentioned you in Gaming Group]
    
    B --> E[Email]
    E --> F[Subject: Alice mentioned you<br/>Body: Full context + reply button]
    
    B --> G[Web]
    G --> H[Bell icon counter<br/>+ dropdown with preview]
    
    B --> I[Desktop App]
    I --> J[Native notification banner]
    
    B --> K[SMS - VIP only]
    K --> L[Alice mentioned you: link.com/p/123]

The Challenge: Each platform has:

  • Different character limits

  • Different rich content support

  • Different interaction patterns

  • Different notification APIs

The Solution: Template system with platform-specific renderers

graph LR
    A[Base Event Data] --> B[Template Engine]
    B --> C[Platform Adapter]
    C --> D[iOS Notification]
    C --> E[Android Notification]
    C --> F[Email Template]
    C --> G[Web Notification]
    
    style B fill:#ffffcc
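
A sketch of that adapter layer: one event payload in, a channel-specific payload out. The renderer registry and field names are illustrative:

def render_push(event: dict) -> dict:
    # Tight character budget, deep link into the app
    return {"title": f"{event['actor_name']} mentioned you",
            "body": event["context_name"],
            "deeplink": f"app://post/{event['content_id']}"}

def render_email(event: dict) -> dict:
    # Room for full context plus a call to action
    return {"subject": f"{event['actor_name']} mentioned you",
            "body": (f"{event['actor_name']} mentioned you in {event['context_name']}.\n"
                     f"View and reply: https://example.com/p/{event['content_id']}")}

RENDERERS = {"push": render_push, "email": render_email}

def render(channel: str, event: dict) -> dict:
    return RENDERERS[channel](event)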

Month 3: The Scale Reality

Three months in, your "simple" mention feature has evolved into this:

The Final Architecture

graph TB
    subgraph "Frontend Layer"
        A[Web App]
        B[iOS App]
        C[Android App]
    end
    
    subgraph "API Gateway Layer"
        D[Load Balancer]
        E[Rate Limiter]
        F[API Gateway]
    end
    
    subgraph "Application Services"
        G[Mention Parser Service]
        H[Search Service]
        I[Privacy Service]
        J[Notification Service]
    end
    
    subgraph "Data Services"
        K[Elasticsearch Cluster]
        L[Redis Cache Cluster]
        M[Kafka Event Stream]
    end
    
    subgraph "Worker Layer"
        N[Search Index Workers]
        O[Notification Workers]
        P[Privacy Check Workers]
        Q[Delivery Workers]
    end
    
    subgraph "Storage Layer"
        R[(User Database)]
        S[(Notification Database)]
        T[(Analytics Database)]
    end
    
    subgraph "External Services"
        U[Push Notification<br/>FCM/APNS]
        V[Email Service<br/>SendGrid]
        W[SMS Service<br/>Twilio]
    end
    
    A --> D
    B --> D
    C --> D
    
    D --> E
    E --> F
    
    F --> G
    F --> H
    F --> I
    F --> J
    
    H --> K
    H --> L
    G --> M
    I --> L
    
    M --> N
    M --> O
    M --> P
    
    O --> Q
    
    Q --> U
    Q --> V
    Q --> W
    
    N --> K
    O --> S
    P --> R
    
    style M fill:#ffffcc
    style L fill:#ccffff
    style K fill:#ffffcc

The Numbers

After three months of optimization:

| Metric | Value |
| --- | --- |
| Mentions per day | 50 million |
| Peak mentions per second | 15,000 |
| Autocomplete searches per second | 100,000 |
| Average autocomplete latency | 35ms |
| Notification delivery time (p95) | 800ms |
| Privacy checks per second | 200,000 |
| Elasticsearch cluster size | 30 nodes |
| Redis cache size | 500GB |
| Kafka daily events | 500 million |
| Infrastructure cost | $45,000/month |

What You've Built

12 distinct systems:

  1. Real-time text parser

  2. Autocomplete search engine

  3. Multi-tier caching layer

  4. Privacy rule engine

  5. Context analyzer

  6. Event streaming pipeline

  7. Notification creation system

  8. Multi-channel delivery system

  9. Deduplication system

  10. Cross-platform rendering

  11. Analytics and monitoring

  12. Abuse prevention system

Your team:

  • 4 backend engineers

  • 1 infrastructure engineer

  • 1 data engineer

  • 1 on-call rotation

All for a feature that "looks simple" when you use it on Facebook.

The Hidden Challenges Nobody Talks About

Challenge 1: Mention Bombing

Attackers discover they can spam-mention popular accounts:

graph TD
    A[Attacker creates 1000 bots] --> B[Each bot posts:<br/>@celebrity check this out!]
    B --> C[Celebrity gets 1000 notifications]
    C --> D[Notifications become useless]
    
    style D fill:#ffcccc

Solution: Rate limiting + Pattern detection

graph LR
    A[Mention created] --> B{Rate limit check}
    B --> C{From same user?<br/>Max 10 mentions/hour to same person}
    C --> D{From similar accounts?<br/>Detect bot patterns}
    D --> E{Content similar?<br/>Detect spam templates}
    E --> F{Allow or block}
    
    style F fill:#ffffcc
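
The per-pair limit comes down to one Redis counter per (sender, target) pair. A sketch of the "10 mentions per hour to the same person" rule; the key format is illustrative:

def allow_mention(r, author_id: int, target_id: int, limit: int = 10) -> bool:
    key = f"mention_rate:{author_id}:{target_id}"
    count = r.incr(key)        # atomic increment
    if count == 1:
        r.expire(key, 3600)    # the one-hour window starts on the first mention
    return count <= limit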

Challenge 2: Ghost Mentions

User deletes their post, but mentioned users already got notified:

sequenceDiagram
    participant Alice
    participant System
    participant Bob
    
    Alice->>System: Post with @bob
    System->>Bob: Notification sent
    
    Note over Bob: Gets notification<br/>Click to view
    
    Alice->>System: Delete post
    Bob->>System: Click notification
    System->>Bob: 404 - Post not found
    
    Note over Bob: Confused and frustrated

Solution: Cascade deletion

When content is deleted:

  1. Mark notification as "content_deleted"

  2. Show user: "This post was deleted"

  3. Don't just 404

Challenge 3: Edit Wars

User edits their post to add/remove mentions:

graph TD
    A[Original post: Check out @john's work] --> B[John gets notification]
    
    C[Edit 1: Check out @sarah's work] --> D[Remove john's mention<br/>Add sarah's mention]
    
    D --> E{What to do with John's notification?}
    E --> F[Option 1: Keep it - inconsistent]
    E --> G[Option 2: Delete it - confusing]
    E --> H[Option 3: Update it - complex]
    
    style F fill:#ffcccc
    style G fill:#ffcccc
    style H fill:#ffffcc

Most platforms choose: Once notified, notification stays. Edits don't trigger new notifications.

Challenge 4: International Names

Your regex breaks on:

  • Names with accents: José, François, Björk

  • Non-Latin scripts: 张伟, محمد, Владимир

  • Special characters: O'Brien, Smith-Jones

  • Single names: Madonna, Cher

  • Very long names: Adolph Blaine Charles David Earl Frederick Gerald Hubert Irvin John Kenneth Lloyd Martin Nero Oliver Paul Quincy Randolph Sherman Thomas Uncas Victor William Xerxes Yancy Zeus Wolfeschlegelsteinhausenbergerdorff

graph TD
    A[Username Detection] --> B{Character set}
    
    B --> C[ASCII only]
    C --> D[Breaks for 70% of world]
    
    B --> E[Unicode support]
    E --> F[Handles all languages]
    
    B --> G[Whitelist approach]
    G --> H[Breaks on edge cases]
    
    style D fill:#ffcccc
    style F fill:#90EE90
    style H fill:#ffcccc

Solution: Unicode-aware regex + normalization
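
With the third-party regex module you can match letters and digits in any script and normalize before lookup. A sketch; exactly which characters you allow in a mention is a product decision, not a given:

import unicodedata
import regex  # third-party package; supports \p{...} Unicode property classes

MENTION_RE = regex.compile(r"@([\p{L}\p{M}\p{N}_'\-]+)")

def extract_mentions(text: str) -> list[str]:
    # NFC normalization so "José" matches however it was typed
    return [unicodedata.normalize("NFC", m) for m in MENTION_RE.findall(text)]

extract_mentions("Thanks @José and @张伟!")  # ['José', '张伟']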

Challenge 5: The Database That Never Stops Growing

Every mention creates a notification record. Those records pile up:

graph LR
    A[Day 1:<br/>10,000 mentions] --> B[1 year:<br/>3.6 million mentions]
    B --> C[5 years:<br/>18 million mentions]
    C --> D[Database size:<br/>500GB]
    D --> E[Query performance:<br/>Degrading]
    
    style E fill:#ffcccc

Solution: Data lifecycle management

graph TD
    A[New notifications] --> B[Hot storage<br/>PostgreSQL]
    B --> C{Age > 30 days?}
    C --> |Yes| D[Move to warm storage<br/>Compressed tables]
    D --> E{Age > 1 year?}
    E --> |Yes| F[Move to cold storage<br/>S3/Archive]
    F --> G{Age > 5 years?}
    G --> |Yes| H[Delete]
    
    style B fill:#90EE90
    style D fill:#ffffcc
    style F fill:#ffd699

The Uncomfortable Truth

Here's what keeps engineering teams up at night:

Most mention notifications are never read.

graph TD
    A[100 million mentions sent daily] --> B[60 million opened]
    B --> C[30 million clicked through]
    C --> D[10 million actually engaged]
    
    E[40% ignored completely]
    F[Half of opens never click through]
    G[Two thirds of click-throughs don't engage]
    
    style E fill:#ffcccc
    style F fill:#ffffcc
    style G fill:#ffd699

You've built a massively complex system, and most of its output is ignored.

So why build it?

Because those 10 million engagements per day are what make your platform social. They're the connections that keep users coming back.

The feature isn't valuable because of scale. It's valuable because of impact on the 10% who actually engage.

What We Learned

Lesson 1: Simple Features Don't Exist at Scale

What looks like:

  • Parse text

  • Send notification

Actually is:

  • Real-time search across billions of users

  • Distributed privacy engine

  • Multi-channel notification delivery

  • Cross-platform synchronization

  • Abuse prevention

  • Data lifecycle management

Lesson 2: Privacy is the Hardest Part

Not the technical implementation—the business logic.

Every company has different rules about:

  • Who can mention whom

  • In what contexts

  • With what restrictions

  • With what exceptions

Your privacy engine will have hundreds of edge cases.

Lesson 3: The First 90% Takes 10% of the Time

graph LR
    A[Week 1:<br/>Basic mentions working] --> B[Month 1:<br/>Adding autocomplete]
    B --> C[Month 2:<br/>Adding privacy]
    C --> D[Month 3:<br/>Handling scale]
    D --> E[Month 6:<br/>Still fixing edge cases]
    
    style A fill:#90EE90
    style E fill:#ffcccc

The demo works in a week. Production-ready takes six months.

Lesson 4: Async Everything

The only way to scale is to decouple:

  • User action from processing

  • Processing from delivery

  • Delivery from confirmation

graph TD
    A[User action] -.instant response.-> B[User happy]
    A --> C[Queue event]
    C -.async.-> D[Process later]
    D -.async.-> E[Deliver eventually]
    
    style B fill:#90EE90

Users don't need instant notifications. They need instant feedback that their action worked.

Lesson 5: Monitor Everything

You can't debug what you can't see:

graph TD
    A[Monitoring Needed]
    
    A --> B[Autocomplete latency<br/>p50, p95, p99]
    A --> C[Search index lag<br/>How stale is the index?]
    A --> D[Privacy check failures<br/>How many blocks?]
    A --> E[Notification delivery rate<br/>% delivered successfully]
    A --> F[Queue depth<br/>Are we falling behind?]
    A --> G[Error rates by type<br/>What's failing?]
    A --> H[Cost per mention<br/>Infrastructure efficiency]

The Modern Approach: Learning from the Giants

Here's how the big platforms actually do it:

Facebook/Meta's Approach

graph TB
    A[Mentions] --> B[TAO Graph Database]
    B --> C[Real-time graph queries]
    C --> D[Privacy via graph edges]
    
    E[Advantages:]
    E --> F[Fast relationship checks]
    E --> G[Natural privacy model]
    E --> H[Handles 1B+ users]

They model everything as a graph. A mention is just an edge. Privacy is checking if an edge exists.

Twitter's Approach

graph TB
    A[Mentions] --> B[Fanout on write]
    B --> C[Pre-compute timelines]
    
    D[Advantages:]
    D --> E[Fast reads]
    D --> F[Handles viral content]
    
    G[Disadvantages:]
    G --> H[Expensive writes]
    G --> I[Celebrity problem]

They pre-compute who sees what. Faster reads, but writes are expensive.

LinkedIn's Approach

graph TB
    A[Mentions] --> B[Hybrid: Fanout for small audiences]
    B --> C[Lazy load for large audiences]
    
    D[Advantages:]
    D --> E[Best of both worlds]
    
    F[Disadvantages:]
    F --> G[Complex implementation]

Different strategies for different content types.

If You're Building This Today

Start with the minimum:

Phase 1: MVP (Week 1-2)

graph LR
    A[Basic mention parsing] --> B[Simple notification]
    B --> C[Ship it]
    
    style C fill:#90EE90

Phase 2: Usability (Month 1)

graph LR
    A[Add autocomplete] --> B[Add caching]
    B --> C[Optimize search]

Phase 3: Privacy (Month 2)

graph LR
    A[Add privacy rules] --> B[Add context checking]
    B --> C[Add block support]

Phase 4: Scale (Month 3+)

graph LR
    A[Add async processing] --> B[Add proper queues]
    B --> C[Add monitoring]
    C --> D[Add abuse prevention]

Don't build everything at once. Build what you need, when you need it.

The Future: AI-Powered Mentions

The next evolution is already here:

Smart Mention Suggestions

graph TD
    A[User types: Check out this design] --> B[AI analyzes content]
    B --> C[Suggest relevant people]
    C --> D[Sarah - designer on your team]
    C --> E[Mike - requested this feature]
    C --> F[Lisa - stakeholder]
    
    style D fill:#90EE90
    style E fill:#90EE90
    style F fill:#90EE90

AI predicts who you SHOULD mention based on:

  • Content topic

  • Project involvement

  • Past interactions

  • Current context

Intent-Based Notifications

graph TD
    A[Mention created] --> B[AI predicts importance]
    
    B --> C[High importance<br/>Urgent work mention]
    C --> D[Deliver immediately<br/>Push notification]
    
    B --> E[Medium importance<br/>FYI mention]
    E --> F[Deliver soon<br/>Batched email]
    
    B --> G[Low importance<br/>Social mention]
    G --> H[Deliver later<br/>Daily digest]

Not all mentions are equal. AI learns what matters to each user.

Final Thoughts

When you see @mentions on Facebook, Instagram, Twitter, or LinkedIn, you're seeing the tip of an iceberg.

Below the surface:

  • Dozens of microservices

  • Millions of lines of code

  • Terabytes of cache

  • Billions of events per day

  • Hundreds of engineers' years of work

All to make something feel "simple."

That's the magic of good system design. Complexity that disappears into simplicity.

The next time your CEO says "let's add mentions, it's simple," show them this blog post.

Then buckle up. You've got three interesting months ahead.
