In this blog, we will study the factors that are affected by any system design decision, and we'll understand them while designing a blogging platform. We will deep dive into caching issues at scale and how to solve them, async processing, delegation, Kafka essentials, and different communication paradigms.
Foundational topics in System Design
Database
Caching
Scaling
Delegation
Concurrency
Communication
Any and every design decision will affect one or more of the above factors.
We'll design a multi-user blogging system (eg: medium.com)
multiple users
each user can have multiple blogs
Let’s take a look at the key design decisions.
Database
We'll have two tables: users and blogs. Here's the schema for the tables:
users table: id | name | bio
blogs table: id | title | body | author_id | published_at | is_deleted
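To make this concrete, here's a minimal sketch of the schema in SQLite (types and sizes are illustrative assumptions; a production MySQL/Postgres setup would tune them):

```python
import sqlite3

conn = sqlite3.connect("blog.db")
conn.executescript("""
CREATE TABLE IF NOT EXISTS users (
    id   INTEGER PRIMARY KEY,
    name VARCHAR(100),
    bio  VARCHAR(512)
);
CREATE TABLE IF NOT EXISTS blogs (
    id           INTEGER PRIMARY KEY,
    title        VARCHAR(200),
    body         TEXT,     -- long text, see the column-type discussion below
    author_id    INTEGER REFERENCES users(id),
    published_at INTEGER,  -- epoch seconds, see the datetime discussion below
    is_deleted   INTEGER DEFAULT 0
);
""")
conn.commit()
```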
Importance of is_deleted: soft delete.
When a user invokes delete blog, instead of a DELETE we run an UPDATE on is_deleted.
Key reasons: Recoverability, archival, audit.
It's also easy on the database engine (no tree re-balancing on delete).
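Here's a hedged sketch of what the delete path looks like (function names are illustrative; it reuses the SQLite connection from the schema sketch above):

```python
def delete_blog(conn, blog_id):
    # Soft delete: flip the flag instead of removing the row,
    # keeping the blog recoverable and auditable.
    conn.execute("UPDATE blogs SET is_deleted = 1 WHERE id = ?", (blog_id,))
    conn.commit()

def get_blog(conn, blog_id):
    # Every read path must now filter out soft-deleted rows.
    return conn.execute(
        "SELECT * FROM blogs WHERE id = ? AND is_deleted = 0", (blog_id,)
    ).fetchone()
```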
Column Type: body v/s bio
body → long text: stored as a reference (LONGTEXT)
The database has to make two disk calls: one to fetch the row, and another to follow the reference and fetch the long text body.
bio → short text: stored along with other columns (VARCHAR)
The database does this internally if we provide the correct data type for the columns.
Storing datetime in DB: published_at
Datetime as datetime
Serialized in some format: 2024-11-30T09:01:36Z
Convenient, sub-optimal, heavy on size and index.
Datetime as epoch integer
Seconds since 1st January 1970 (UTC): 1725623471
Efficient, optimal, light weight.
Note
: Always capture epoch time in UTC for all regions, so that you can render it in each region's local time zone.
Datetime as custom format
YYYYMMDD: 20241130
Real use case: Redbus shifted from using datetime to int because they only needed dates, reducing the space consumed per date to just 4 bytes!
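Here's a quick Python sketch of both approaches (the values in the comments are illustrative):

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Epoch integer: capture the current moment as seconds since 1970, UTC.
published_at = int(datetime.now(timezone.utc).timestamp())  # e.g. 1725623471

# Render it later in the reader's local time zone.
ist = datetime.fromtimestamp(published_at, tz=ZoneInfo("Asia/Kolkata"))
print(ist.isoformat())  # e.g. 2024-09-06T17:21:11+05:30

# Redbus-style custom format: just the date, packed into a 4-byte int.
yyyymmdd = int(datetime.now(timezone.utc).strftime("%Y%m%d"))  # e.g. 20240906
```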
Caching
Caching is anything that reduces response times by saving the result of a heavy computation.
Note
: Cache is not only RAM based.
Typical use: reduce disk I/O or network I/O or compute.
Caches are just glorified hash tables (with some advanced data structures).
Most common use case: an application-level cache in RAM, saving repeated DB computations.
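As a sketch, this is the classic cache-aside pattern at the application level, with an in-process dict standing in for Redis/Memcached (the TTL and key format are illustrative assumptions):

```python
import time

cache = {}  # stand-in for Redis/Memcached: key -> (value, expires_at)
TTL_SECONDS = 60

def get_user(conn, user_id):
    key = f"user:{user_id}"
    hit = cache.get(key)
    if hit and hit[1] > time.time():
        return hit[0]  # cache hit: no DB round trip
    row = conn.execute(
        "SELECT id, name, bio FROM users WHERE id = ?", (user_id,)
    ).fetchone()  # cache miss: go to the database
    cache[key] = (row, time.time() + TTL_SECONDS)
    return row
```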
Caching at different levels
Database views (materialised)
Centralised remote cache
A cache stored on a dedicated server that's built on top of a key/value NoSQL store like Redis or Memcached.
Disk of API Server
Local disk I/O is still faster than a network call to the centralised remote cache, and the disk has more storage capacity than the API server's main memory (RAM). Use it when the DB changes infrequently and you're comfortable with data inconsistencies, i.e. it's okay to sometimes serve stale data.
Main memory (RAM) of API Server
limited storage
API server may crash → you also lose the cached data
inconsistency
Load balancer (API Gateway)
CDN (cache response)
Browser (local storage)
Use case: personalised recommendations
Rankings are updated once a day and a batch of 50 recommendations is stored in the browser cache. On each page load we pick 10 random recommendations out of the 50 and render them. The UX is different on every visit, but we run the ranking computation only once per day.
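In the browser this would be JavaScript against localStorage; here's the same logic sketched in Python, with storage as a stand-in for the browser's key/value store and fetch_rankings as a hypothetical call to the ranking service:

```python
import json
import random
import time

ONE_DAY = 86400

def recommendations_for_page(storage, user_id, fetch_rankings):
    key = f"recs:{user_id}"
    raw = storage.get(key)
    entry = json.loads(raw) if raw else None
    if entry is None or entry["cached_at"] + ONE_DAY < time.time():
        # Run the heavy ranking computation once a day, cache 50 results.
        entry = {"cached_at": time.time(), "recs": fetch_rankings(user_id)[:50]}
        storage[key] = json.dumps(entry)
    # Each page load renders a different random 10 out of the cached 50.
    return random.sample(entry["recs"], k=min(10, len(entry["recs"])))
```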
Scaling
Ability to handle a large number of concurrent requests. There are two scaling strategies:
Vertical Scaling
Hulk: Make infra bulky, add more CPU, RAM, Disk.
Easy to manage
Risk of downtime
Vertical scaling is limited by the capabilities of the physical hardware on which the system is running.
Horizontal Scaling
Minions: Add more machines
Linear amplification
Fault tolerance
Complex architecture
Network partitioning
Good scaling plan: First scale vertically up to a point and then scale horizontally.
Note
: No premature optimisations.
For our medium.com, we scale:
vertically first
then horizontally
★ Horizontal scaling ≈ ∞ scaling, but there is a catch!!
Cascading failures become the root cause of most major outages. To prevent them, ensure your stateful components (DB, cache, and other dependent services) can handle that many concurrent requests. Hence, whenever you scale, always do it bottom-up!
For our medium.com, we scale DB first and then the API server.
Scaling the database
Vertical scaling
Just go to the cloud console and update configurations.
Read Replicas
For our use case, there will always be more people reading blogs on the platform than publishing them → more read requests than writes. So if we can scale reads, we'll have effectively scaled the database. To do that, we use read replicas.
All write requests go to the master and are then asynchronously replicated to the replicas. For asynchronous replication, always remember: the replicas pull the changes from the master.
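A minimal sketch of routing at the application level (the connection objects are stand-ins; in practice a proxy or the database driver often handles this):

```python
import random

class RoutingConnection:
    """Send writes to the master, spread reads across replicas."""

    def __init__(self, master, replicas):
        self.master = master
        self.replicas = replicas

    def execute_write(self, sql, params=()):
        return self.master.execute(sql, params)

    def execute_read(self, sql, params=()):
        # Replication is asynchronous, so a replica may briefly lag the master.
        return random.choice(self.replicas).execute(sql, params)
```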
Sharding
Let's say there is so much write traffic that a single master can't handle it alone. In that case we create multiple master nodes, partitioned on some key (e.g. an id), each storing a mutually exclusive subset of the data. Each master node can have its own replicas to handle reads. You also need a routing layer between the API server and the database that routes each write request to the correct master node.
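Here's a sketch of that routing layer, assuming we shard blogs by author_id. Hash-mod routing is the simplest choice; real systems often prefer consistent hashing so that adding a shard doesn't reshuffle every key:

```python
import hashlib

class ShardRouter:
    def __init__(self, shards):
        self.shards = shards  # one master connection per shard

    def shard_for(self, author_id):
        # Hash the shard key, then map it onto one of the masters.
        digest = hashlib.md5(str(author_id).encode()).hexdigest()
        return self.shards[int(digest, 16) % len(self.shards)]

    def insert_blog(self, author_id, title, body):
        conn = self.shard_for(author_id)  # all of an author's blogs co-locate
        conn.execute(
            "INSERT INTO blogs (title, body, author_id) VALUES (?, ?, ?)",
            (title, body, author_id),
        )
```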
Delegation
Let’s add basic analytics to our blog!
What does not need to be done in real time, should not be done in real time.
In the context of a request, do what is essential and delegate the rest.
Core idea
: Delegate & respond.
In the profile page, to show the total number of blogs published by a user, we'll add a new column total_blogs to the users table. We'll store it pre-computed to avoid joins and runtime computation.
users table: id | name | bio | total_blogs
We delegate this update through a broker such as SQS. A broker is a buffer that holds tasks and messages.
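Here's the delegate-and-respond idea sketched with an in-process queue.Queue standing in for SQS (names are illustrative; in production the worker would be a separate process consuming from the broker):

```python
import queue

broker = queue.Queue()  # in-process stand-in for a broker like SQS

def publish_blog(conn, author_id, title, body):
    conn.execute(
        "INSERT INTO blogs (title, body, author_id) VALUES (?, ?, ?)",
        (title, body, author_id),
    )
    broker.put({"event": "ON_PUBLISH", "author_id": author_id})
    return "published"  # respond right away; the counter update is delegated

def counter_worker(conn):
    while True:
        msg = broker.get()  # a background worker consumes the delegated task
        conn.execute(
            "UPDATE users SET total_blogs = total_blogs + 1 WHERE id = ?",
            (msg["author_id"],),
        )
        conn.commit()
```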
Common implementations of a broker
Message queue
Example: SQS, RabbitMQ.
Message stream
Example: Kafka, Kinesis.
Kafka essentials
Kafka is a message stream that holds messages (almost forever, subject to the deletion policy). Internally, Kafka has topics (example: ON_PUBLISH). Every topic has 'n' partitions. A message is sent to a topic and, depending on the configured hash key, is placed into one of its partitions.
Within a partition, messages are ordered. There is no ordering guarantee across partitions.
Limitation of Kafka: consumers per group ≤ number of partitions, i.e. we can have 'n' types of consumers (search, analytics, backend…), but for each type (consumer group), we can only have as many active consumers as there are partitions.
Consumers can issue a commit to Kafka after they have read a particular number of messages. If a consumer crashes before committing, then when it resumes, it starts processing messages from the previous commit. So Kafka guarantees at-least-once delivery semantics.
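A hedged sketch with the kafka-python client (the servers, group name, and handler are illustrative assumptions; ON_PUBLISH is the topic from above):

```python
import json
from kafka import KafkaConsumer, KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode(),
)
# Keying by author pins all of that author's events to one partition,
# preserving their relative order.
producer.send("ON_PUBLISH", key=b"author-42", value={"blog_id": 7})
producer.flush()

consumer = KafkaConsumer(
    "ON_PUBLISH",
    bootstrap_servers="localhost:9092",
    group_id="analytics",          # each group gets its own copy of the stream
    enable_auto_commit=False,      # commit manually, after processing
    value_deserializer=json.loads,
)
for msg in consumer:
    handle(msg.value)              # hypothetical handler
    consumer.commit()              # crash before this => message is redelivered
```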
Concurrency
Concurrency → to get faster execution → threads & multiprocessing.
Concurrency is the ability of a system to execute multiple tasks through simultaneous execution or time-sharing (context switching), sharing resources and managing interactions.
Issues with concurrency:
communication between threads
concurrent use of shared resources
→ database
→ in-memory variables
Handling concurrency:
Locks (optimistic & pessimistic)
Mutexes and Semaphores
Go lock-free (e.g. CRDTs)
We'll cover locking in depth in the upcoming blogs.
Concurrency in our blogging platform: if two users clap the same blog at the same time, the clap count should go up by exactly 2. We protect our data through transactions & atomic instructions.
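A sketch of the lost update and the atomic fix, assuming a hypothetical claps column on the blogs table:

```python
def clap(conn, blog_id):
    # Racy read-modify-write: two concurrent clappers can both read 10
    # claps and both write back 11, losing one clap:
    #   claps = conn.execute("SELECT claps FROM blogs WHERE id = ?",
    #                        (blog_id,)).fetchone()[0]
    #   conn.execute("UPDATE blogs SET claps = ? WHERE id = ?",
    #                (claps + 1, blog_id))

    # Safe: one atomic UPDATE, the database applies the increment itself.
    conn.execute("UPDATE blogs SET claps = claps + 1 WHERE id = ?", (blog_id,))
    conn.commit()
```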
Communication
The usual communication
Short Polling
eg: continuously refreshing cricket score, continuously checking if server is ready
Disadvantages:
HTTP overhead on every poll
repeated request/response cycles
Long Polling
Let's say the client hits the server to create some instance; the server sends back the response only after the instance is created, i.e. only when data is available.
eg: response only when the ball is bowled.
The connection is re-established after a timeout, and the request is retried.
Short Polling v/s Long Polling:
short polling: server responds right away, ready or not.
long polling: server responds only when done; the connection is kept open for the entire duration.
eg: EC2 provisioning
short polling: gets status every few seconds
long polling: gets response when server is ready
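Here's a sketch of both client loops using requests (the URL and response shape are illustrative assumptions):

```python
import time
import requests

def short_poll(url):
    # Ask every few seconds; each loop is a full HTTP request/response.
    while True:
        if requests.get(url, timeout=5).json()["status"] == "ready":
            return
        time.sleep(3)

def long_poll(url):
    # The server holds the request open and replies only when data is ready.
    while True:
        try:
            resp = requests.get(url, timeout=60)
            if resp.json()["status"] == "ready":
                return
        except requests.exceptions.Timeout:
            continue  # connection re-established after timeout, request retried
```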
WebSockets (WS)
★ Server can proactively send data to the client.
Advantages:
real time data transfer
low communication overhead
Applications:
real time communication
stock market ticker
live experiences
multi-player games
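A minimal client sketch with the websockets library (the URL is an illustrative assumption). Note there's no polling loop: messages simply arrive whenever the server pushes them:

```python
import asyncio
import websockets

async def listen():
    # One long-lived connection; the server pushes ticks proactively.
    async with websockets.connect("wss://example.com/ticker") as ws:
        async for message in ws:
            print(message)

asyncio.run(listen())
```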
Server-sent events (SSE)
The server pushes a one-way stream of events to the client over a single long-lived HTTP connection.
Applications:
stock market ticker
deployment logs streaming
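And a sketch of consuming an SSE stream over plain HTTP streaming (the endpoint is an illustrative assumption):

```python
import requests

with requests.get("https://example.com/deploy/logs", stream=True) as resp:
    for line in resp.iter_lines():
        if line.startswith(b"data:"):  # SSE frames arrive as "data: <payload>" lines
            print(line[len(b"data:"):].strip().decode())
```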
Communication in our blog
Real time interactions:
twitter: like count updates without refresh.
medium: when one reader claps an article, other readers should see the count update in real time.
instagram: live interaction.
That's all for now folks. See you in the next blog!