Technologies Used in Building the Twitter Website and Application

Twitter (now X) operates one of the most sophisticated distributed systems in the world. As of 2026, its architecture has shifted toward a heavy focus on AI-driven ranking (via xAI’s Grok), real-time event streaming, and a highly optimized microservices model.

Below is the breakdown of the project structure and technologies powering the platform.

1. Frontend Architecture (Web & Mobile)

The frontend is designed to be a Progressive Web App (PWA), meaning the web experience is built to feel like a native app with high responsiveness.

  • Web Framework: React.js with TypeScript. It utilizes React Server Components (RSC) to reduce client-side bundle sizes.

  • State Management: Traditionally Redux, though they have moved toward more localized state and React Query for server-state synchronization.

  • Styling: A custom utility-first CSS framework (similar to Tailwind) and the Chirp font for brand identity.

  • Mobile (iOS): Swift with a heavy reliance on dynamic frameworks and Swift Package Manager (SPM).

  • Mobile (Android): Kotlin using Jetpack Compose for modern, declarative UI.

  • Package Managers:

    • Yarn/npm for the web.

    • Swift Package Manager (SPM) (migrated from CocoaPods).

    • Gradle for Android.

2. Backend & Microservices

X moved from its original monolithic Ruby on Rails application to a JVM-based microservices architecture years ago.

  • Primary Languages: Scala (core services) and Java.

  • Performance Framework: Finagle. This is an extensible RPC system for the JVM used to build high-concurrency servers.

  • API Layer: GraphQL and Apache Thrift. Thrift is used for efficient cross-language communication between services.

  • Real-time Processing: Apache Heron, which Twitter built as the successor to Apache Storm, handles the massive influx of tweets (event streams) in real time.

  • Orchestration: Kubernetes (migrated from Apache Mesos/Aurora) for managing containers across massive data centers.
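The spout-and-bolt model that Storm and Heron share can be illustrated in plain Python. This is a toy, single-process sketch and not the Heron API: the names `tweet_spout`, `extract_hashtags`, and `count_hashtags` are hypothetical, and the sample tweets are made up.

```python
from collections import Counter
from typing import Iterable, Iterator


def tweet_spout(tweets: Iterable[str]) -> Iterator[str]:
    """Spout: emits raw tweets one at a time (here, from an in-memory list)."""
    for tweet in tweets:
        yield tweet


def extract_hashtags(stream: Iterable[str]) -> Iterator[str]:
    """Bolt 1: splits each tweet and emits its hashtags in lowercase."""
    for tweet in stream:
        for word in tweet.split():
            if word.startswith("#") and len(word) > 1:
                yield word.lower()


def count_hashtags(stream: Iterable[str]) -> Counter:
    """Bolt 2: keeps a running count per hashtag."""
    counts: Counter = Counter()
    for tag in stream:
        counts[tag] += 1
    return counts


# Made-up sample data; a real topology would read from a live event stream.
sample = ["Launch day! #X #news", "More #news today", "no tags here"]
trending = count_hashtags(extract_hashtags(tweet_spout(sample)))
print(trending.most_common(1))  # [('#news', 2)]
```

In a real deployment each stage would run as many parallel tasks across machines; chaining generators only shows how data flows from one stage to the next.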

3. Data Storage & Caching

Handling 500 million+ tweets a day requires a multi-tiered storage strategy.

| Layer | Technology | Use Case |
|---|---|---|
| Distributed DB | Manhattan | A proprietary distributed key-value store (Twitter-built) for core tweet data. |
| Relational DB | MySQL | Used in massive clusters for user data and metadata. |
| NoSQL | Cassandra | High-availability storage for timelines and certain activity logs. |
| Caching | Redis & Pelikan | Pelikan Cache is a custom Twitter framework designed to handle petabytes of cached data. |
| Search | Earlybird | A real-time search engine based on Apache Lucene. |
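To make the caching tier concrete, here is a minimal sketch of a cache-aside read path, assuming a simple two-tier setup. `TweetStore` and `CachedTweetReader` are hypothetical in-memory stand-ins, not the Manhattan or Pelikan APIs.

```python
from typing import Dict, Optional


class TweetStore:
    """Stand-in for the durable key-value tier (hypothetical, in-memory)."""

    def __init__(self) -> None:
        self._rows: Dict[int, str] = {}
        self.reads = 0  # counts how often the slow tier is hit

    def put(self, tweet_id: int, text: str) -> None:
        self._rows[tweet_id] = text

    def get(self, tweet_id: int) -> Optional[str]:
        self.reads += 1
        return self._rows.get(tweet_id)


class CachedTweetReader:
    """Cache-aside reads: check the cache first, fall back to the store."""

    def __init__(self, store: TweetStore) -> None:
        self._store = store
        self._cache: Dict[int, str] = {}

    def get(self, tweet_id: int) -> Optional[str]:
        if tweet_id in self._cache:
            return self._cache[tweet_id]
        text = self._store.get(tweet_id)
        if text is not None:
            self._cache[tweet_id] = text  # populate the cache on a miss
        return text


store = TweetStore()
store.put(1, "hello world")
reader = CachedTweetReader(store)
first = reader.get(1)   # miss: goes to the store
second = reader.get(1)  # hit: served from the cache
print(first, second, store.reads)  # hello world hello world 1
```

A production cache would also need eviction and invalidation on writes, which this sketch leaves out.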

4. The Recommendation Engine (Twitter 2.0 / X)

In 2026, the "secret sauce" is the open-source recommendation algorithm.

  • Machine Learning: Python is the dominant language here, using PyTorch and TensorFlow.

  • Algorithm Pipeline:

    1. Candidate Sourcing: Uses SimClusters (community-based clustering) to find relevant tweets.

    2. Ranking: A heavy-duty Neural Network scores ~1,500 candidates in milliseconds.

    3. Filtering: Applies "Heuristics" (e.g., filtering out blocked users or "not interested" content).

  • AI Integration: Grok AI (powered by xAI) is integrated into the backend to summarize trends and enhance search intent.
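The three pipeline stages above can be sketched as plain functions. This is an illustration under simplifying assumptions, not the published algorithm: topic matching stands in for SimClusters, a lookup table of made-up scores stands in for the neural network, and every name here is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Set


@dataclass(frozen=True)
class Tweet:
    tweet_id: int
    author: str
    topic: str


def source_candidates(pool: Iterable[Tweet], interests: Set[str]) -> List[Tweet]:
    """Stage 1: keep tweets whose topic matches the user's interests."""
    return [t for t in pool if t.topic in interests]


def rank(candidates: List[Tweet], score: Callable[[Tweet], float]) -> List[Tweet]:
    """Stage 2: order candidates by model score, highest first."""
    return sorted(candidates, key=score, reverse=True)


def apply_filters(ranked: List[Tweet], blocked: Set[str]) -> List[Tweet]:
    """Stage 3: drop tweets from blocked authors."""
    return [t for t in ranked if t.author not in blocked]


pool = [
    Tweet(1, "alice", "python"),
    Tweet(2, "bob", "cooking"),
    Tweet(3, "carol", "python"),
    Tweet(4, "dave", "python"),
]
# Hypothetical scores standing in for the ranking model's output.
toy_scores = {1: 0.2, 3: 0.9, 4: 0.5}

candidates = source_candidates(pool, interests={"python"})
ranked = rank(candidates, score=lambda t: toy_scores[t.tweet_id])
timeline = apply_filters(ranked, blocked={"dave"})
print([t.tweet_id for t in timeline])  # [3, 1]
```

Keeping the stages separate means the expensive scoring step only runs on the shortlist that candidate sourcing produces.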

5. Third-Party Utilities & Infrastructure

  • Observability: Grafana, Prometheus, and Zipkin for distributed tracing.

  • Cloud Providers: A hybrid-cloud approach using AWS and Google Cloud (GCP) alongside private data centers.

  • CI/CD: Bazel (monorepo build tool) and GitHub Actions.

  • Resilience & Security: Circuit breakers for fault tolerance (the pattern popularized by Netflix's Hystrix library) and custom rate-limiting services to prevent scraping/DDoS.
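One common way to implement rate limiting is a token bucket. The sketch below is a generic illustration, not X's actual rate limiter; the class name and parameters are assumptions, and time is passed in explicitly so the behavior is easy to follow.

```python
class TokenBucket:
    """Allows bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, capacity: float, rate: float, now: float = 0.0) -> None:
        self.capacity = capacity
        self.rate = rate
        self.tokens = capacity  # start with a full bucket
        self.last = now

    def allow(self, now: float) -> bool:
        # Refill according to elapsed time, capped at the bucket capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0  # each request spends one token
            return True
        return False


bucket = TokenBucket(capacity=2, rate=1.0)
# Three requests at t=0 (burst of 2 allowed), then one more at t=1.
decisions = [bucket.allow(now=t) for t in (0.0, 0.0, 0.0, 1.0)]
print(decisions)  # [True, True, False, True]
```

A service would typically keep one bucket per client or API key and pass `time.monotonic()` as `now`.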


- Frontend: React, JavaScript, and HTML5
- Backend: Scala, Java, and Python
- Database: MySQL, Memcached, and Cassandra
- Package Manager: npm (Node.js)
- Frameworks: React, GraphQL, and Thrift
- Tools: Jenkins, Git, and Docker

To wrap up our deep dive, let's look at a Real-Time Direct Messaging (DM) system, using Instagram's design as the example. This is a massive engineering feat because it must feel instantaneous while being battery-efficient on mobile.

💬 The DM Architecture: MQTT vs. WebSockets

Most web apps use WebSockets for real-time chat. However, Instagram and Facebook Messenger use MQTT (Message Queuing Telemetry Transport).

Why MQTT?

  • Battery Efficiency: WebSockets require frequent "keep-alive" pings to stay open, which can drain a phone's battery by keeping the radio active. MQTT is much "quieter" and can stay connected with far fewer pings.

  • Low Bandwidth: MQTT headers are tiny (the fixed header can be as small as 2 bytes), which suits users on unstable 3G or 4G networks.

  • The "Iris" Sync System: Instagram uses a system called Iris. Instead of sending a full message object every time, Iris treats the chat history as a sequence of deltas. Your phone simply tells the server, "I have up to sequence #1005," and the server sends only the new changes (sequence #1006 and #1007).
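The sequence-based sync idea described for Iris can be sketched as follows. This is an assumption-laden illustration of delta sync in general, not Instagram's implementation; the `Thread` class and its methods are hypothetical.

```python
from typing import List, Tuple


class Thread:
    """Server-side chat thread stored as an append-only sequence of deltas."""

    def __init__(self) -> None:
        self._deltas: List[Tuple[int, str]] = []
        self._next_seq = 1

    def append(self, change: str) -> int:
        """Record a change and return the sequence number assigned to it."""
        seq = self._next_seq
        self._deltas.append((seq, change))
        self._next_seq += 1
        return seq

    def since(self, last_seen_seq: int) -> List[Tuple[int, str]]:
        """Return only the deltas the client has not seen yet."""
        return [(seq, change) for seq, change in self._deltas if seq > last_seen_seq]


thread = Thread()
for text in ["hi", "how are you?", "seen", "fine, thanks"]:
    thread.append(text)

client_last_seq = 2  # the client says: "I have up to sequence #2"
missing = thread.since(client_last_seq)
print(missing)  # [(3, 'seen'), (4, 'fine, thanks')]
```

The client only has to remember one integer per thread, and a reconnect after a dropped connection costs no more than the changes it actually missed.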

🏗️ The Direct Message Flow

When you send a "Hi" to a friend:

  1. Publish: Your app publishes a message to a Topic (e.g., ig/messages/user_B_id).

  2. The Broker: An MQTT Broker (likely a highly customized version of Mosquitto or a proprietary Meta service) receives the message.

  3. Persistence: The message is simultaneously written to a high-speed Key-Value Store (like Cassandra or RocksDB) so it can be retrieved if the recipient is offline.

  4. Push:

    • If User B is Online: The Broker pushes the message directly to their active MQTT connection.

    • If User B is Offline: The system triggers the Notification Service (using Firebase Cloud Messaging for Android or APNs for iOS) to wake up their phone.
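The flow above can be simulated with a small in-memory broker. This is a sketch only: `ToyBroker` is hypothetical, it does not speak MQTT, and it persists before delivering (rather than in parallel) to keep the example simple.

```python
from collections import defaultdict
from typing import Dict, List


class ToyBroker:
    """In-memory stand-in for the broker, message store, and push service."""

    def __init__(self) -> None:
        self.online: Dict[str, List[str]] = {}                 # live connections and what they received
        self.store: Dict[str, List[str]] = defaultdict(list)   # durable per-recipient log
        self.push_notifications: List[str] = []                # users woken via FCM/APNs (simulated)

    def connect(self, user: str) -> None:
        self.online[user] = []

    def disconnect(self, user: str) -> None:
        self.online.pop(user, None)

    def publish(self, recipient: str, message: str) -> None:
        # Persistence: store the message so an offline recipient can fetch it later.
        self.store[recipient].append(message)
        if recipient in self.online:
            # Push (online): deliver over the active connection.
            self.online[recipient].append(message)
        else:
            # Push (offline): hand off to the notification service.
            self.push_notifications.append(recipient)


broker = ToyBroker()
broker.connect("user_b")
broker.publish("user_b", "Hi")      # user_b is online
broker.publish("user_c", "Hello")   # user_c is offline
print(broker.online["user_b"], broker.push_notifications)  # ['Hi'] ['user_c']
```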

🛡️ Real-Time Presence & "Seen" Receipts

Managing the "Online" green dot for billions of users is a "write-heavy" nightmare.

  • Presence Service: This is often written in Go or C++ for raw speed. It keeps the "last active" timestamp in a massive Redis cluster.

  • Fan-out: When you go online, Instagram doesn't tell everyone. It only "fans out" your status to your closest friends and people you are actively chatting with to save resources.
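A minimal sketch of the presence idea, assuming a dictionary in place of the Redis cluster and a fixed 60-second activity window. `PresenceService`, `set_audience`, and the window length are all hypothetical choices for illustration.

```python
from typing import Dict, List, Optional, Set


class PresenceService:
    """Tracks last-active timestamps and fans status out to a small audience."""

    def __init__(self, active_window_s: float = 60.0) -> None:
        self.active_window_s = active_window_s
        self.last_active: Dict[str, float] = {}   # stand-in for the Redis cluster
        self.audience: Dict[str, Set[str]] = {}   # who should hear about each user

    def set_audience(self, user: str, audience: Set[str]) -> None:
        """Limit fan-out to close friends and active chat partners."""
        self.audience[user] = set(audience)

    def heartbeat(self, user: str, now: float) -> List[str]:
        """Record activity and return the users who should be notified."""
        self.last_active[user] = now
        return sorted(self.audience.get(user, set()))

    def is_online(self, user: str, now: float) -> bool:
        seen: Optional[float] = self.last_active.get(user)
        return seen is not None and now - seen <= self.active_window_s


presence = PresenceService(active_window_s=60.0)
presence.set_audience("user_a", {"close_friend", "chat_partner"})
notified = presence.heartbeat("user_a", now=100.0)
print(notified)                                  # ['chat_partner', 'close_friend']
print(presence.is_online("user_a", now=130.0))   # True
print(presence.is_online("user_a", now=200.0))   # False
```

Writes stay cheap because each heartbeat updates one key, and the fan-out cost is bounded by the size of the audience rather than the follower count.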

🎨 Final Summary of the "Instagram Stack"

If you were to build a "Mini-Insta" today, this is the stack you'd mirror:

| Layer | Technology |
|---|---|
| Language | Python (Django) & Go |
| Real-time | MQTT (for DMs & notifications) |
| Primary DB | PostgreSQL (sharded by User ID) |
| Feed/Cache | Redis & Memcached |
| Storage | AWS S3 / Google Cloud Storage |
| Search | Elasticsearch or Vector DBs (for AI recs) |

This architecture is what allows Instagram to remain stable even when a celebrity like Cristiano Ronaldo posts to 600+ million followers simultaneously.
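For the "sharded by User ID" entry in the stack, one simple approach is to hash the user ID and take it modulo the shard count. This is a generic sketch, not Instagram's actual scheme; `NUM_SHARDS` and `shard_for` are hypothetical.

```python
import hashlib

NUM_SHARDS = 8  # hypothetical shard count, chosen only for illustration


def shard_for(user_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Map a user ID to a shard with a stable hash, so a user's rows stay together."""
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# The same user always routes to the same shard.
print(shard_for(42) == shard_for(42))  # True
```

Plain modulo routing makes adding shards expensive because most keys move; schemes such as consistent hashing or a lookup table of logical shards reduce that cost.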
