Scalability


This note is based on my understanding of system-design-primer and www.lecloud.net.

Clones

The first golden rule for scalability: every server contains exactly the same codebase and does not store any user-related data.

Once that holds, you can create an image file from one of these servers and use it as a “super-clone” that all your new instances are based upon. Whenever you start a new instance/clone, just do an initial deployment of your latest code and you are ready!

Database

SQL Path

- Do master-slave replication (read from slaves, write to master).
- Upgrade your master server by adding RAM, RAM and more RAM.
- As load keeps growing, move on to “sharding”, “denormalization” and “SQL tuning”.
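
The read/write split in the first step can be sketched in a few lines. This is a minimal illustration, not a real driver: `ReplicatedRouter`, `connection_for` and the server names are hypothetical, and the "is it a SELECT" check is a simplifying assumption.

```python
import itertools


class ReplicatedRouter:
    """Send writes to the master and spread reads across the slaves."""

    def __init__(self, master, slaves):
        self.master = master
        # Round-robin over the slaves; fall back to the master if there are none.
        self._readers = itertools.cycle(slaves or [master])

    def connection_for(self, sql):
        # Simplifying assumption: anything that is not a SELECT is a write.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._readers) if is_read else self.master


router = ReplicatedRouter("master-db", ["slave-1", "slave-2"])
targets = [
    router.connection_for(query)
    for query in (
        "SELECT * FROM users",
        "UPDATE users SET name = 'a' WHERE id = 1",
        "SELECT * FROM posts",
    )
]
```

Keep in mind that slaves can lag behind the master, so a read issued right after a write may return stale data.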

NoSQL Path

- Denormalize right from the beginning and use no more joins in any database query.
- Either stay with MySQL and use it like a NoSQL database, or switch to a NoSQL database that is easier to scale, such as MongoDB or CouchDB.
- Joins now have to be done in your application code.
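
To make "joins in application code" concrete, here is a small sketch. The `users` and `posts` records and the function name are made up for illustration; in practice they would come from two separate lookups against the data store.

```python
# Hypothetical records, as two separate lookups might return them.
users = [
    {"id": 1, "name": "alice"},
    {"id": 2, "name": "bob"},
]
posts = [
    {"id": 10, "user_id": 1, "title": "Scaling 101"},
    {"id": 11, "user_id": 2, "title": "Caching"},
    {"id": 12, "user_id": 1, "title": "Queues"},
]


def join_posts_with_authors(posts, users):
    """Join in application code: index one side, then look up the other."""
    users_by_id = {user["id"]: user for user in users}
    return [
        {"title": post["title"], "author": users_by_id[post["user_id"]]["name"]}
        for post in posts
        if post["user_id"] in users_by_id  # skip posts whose author is missing
    ]


joined = join_posts_with_authors(posts, users)
```

Building the dictionary first keeps the join linear in the number of records instead of scanning all users for every post.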

Cache

“Cache” here always means an in-memory cache such as Memcached or Redis. Please never do file-based caching.

A cache is a simple key-value store and it should reside as a buffering layer between your application and your data storage.

Cached database queries: whenever you query your database, store the result dataset in the cache, using a hashed version of the query as the cache key.
- The main issue is expiration: it is hard to delete a cached result when you cache a complex query.
- When one piece of data changes (for example, a table cell), you need to delete every cached query that may include that table cell.
[Recommended] Let your class assemble a dataset from your database, then store the complete instance of the class or the assembled dataset in the cache. Some ideas of objects to cache:
- user sessions (never use the database!)
- fully rendered blog articles
- activity streams
- user<->friend relationships
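
The object-caching pattern might look like the sketch below. `UserProfile`, `load_profile_from_db` and the `user_profile:<id>` key format are assumptions made for illustration, and a dictionary again stands in for Memcached or Redis.

```python
from dataclasses import dataclass


@dataclass
class UserProfile:
    user_id: int
    name: str
    friend_ids: list


object_cache = {}  # stand-in for an in-memory cache such as Memcached or Redis
loads = 0          # counts how often the object is assembled from the database


def load_profile_from_db(user_id):
    """Hypothetical loader; a real one would run several queries."""
    global loads
    loads += 1
    return UserProfile(user_id, f"user-{user_id}", [user_id + 1, user_id + 2])


def get_profile(user_id):
    key = f"user_profile:{user_id}"
    profile = object_cache.get(key)
    if profile is None:
        profile = load_profile_from_db(user_id)
        object_cache[key] = profile
    return profile


def invalidate_profile(user_id):
    # When the user changes, exactly one well-known key has to be dropped.
    object_cache.pop(f"user_profile:{user_id}", None)


a = get_profile(7)
b = get_profile(7)     # cache hit, no database access
invalidate_profile(7)
c = get_profile(7)     # assembled again after invalidation
```

Because the key is derived from the object's identity rather than from query text, invalidation is a single delete.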

Asynchronism

If you do something time-consuming, always try to do it asynchronously. There are two ways:

Do the time-consuming work in advance (e.g., training/updating a model) and serve the finished work with a low request time.
For time-consuming work that depends on the user’s input, the frontend of your website sends a job onto a job queue and immediately signals back to the user: your job is in work, please continue to browse the page.
- The job queue is constantly checked by a bunch of workers for new jobs.
- The frontend, which constantly checks for new “job is done” signals, sees that the job is done and informs the user about it.
- RabbitMQ is one of many systems which help to implement async processing.
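
The queue-and-workers flow above can be sketched in a single process with Python's standard library. This is only an illustration: `queue.Queue` stands in for a broker such as RabbitMQ, threads stand in for worker machines, and the names `submit`, `is_done` and the upper-casing "work" are hypothetical.

```python
import queue
import threading

jobs = queue.Queue()  # stand-in for a real message broker
results = {}          # finished jobs, keyed by job id
results_lock = threading.Lock()


def worker():
    while True:
        job = jobs.get()
        if job is None:  # sentinel: shut this worker down
            jobs.task_done()
            break
        job_id, payload = job
        outcome = payload.upper()  # placeholder for the time-consuming work
        with results_lock:
            results[job_id] = outcome
        jobs.task_done()


def submit(job_id, payload):
    """Frontend side: enqueue the job and answer the user immediately."""
    jobs.put((job_id, payload))
    return "your job is in work"


def is_done(job_id):
    """Frontend side: the 'job is done' check."""
    with results_lock:
        return job_id in results


workers = [threading.Thread(target=worker) for _ in range(2)]
for thread in workers:
    thread.start()

ack = submit("job-1", "resize image")
jobs.join()  # the demo waits here; a real frontend would poll is_done() instead

for _ in workers:
    jobs.put(None)
for thread in workers:
    thread.join()
```

With a real broker the frontend and the workers would run as separate processes, but the hand-off and the status check follow the same shape.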

Two Types of Scale

- Horizontal scale: scaling out using more commodity machines, which is more cost-efficient and results in higher availability
- Vertical scale: scaling up a single server on more expensive hardware

Disadvantages of horizontal scale:

Scaling horizontally introduces complexity and involves cloning servers
Downstream servers such as caches and databases need to handle more simultaneous connections as upstream servers scale out
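
A rough back-of-the-envelope calculation shows how the connection count grows. The numbers are made up, and the sketch assumes every application server keeps a fixed-size connection pool open to the database.

```python
def downstream_connections(app_servers, pool_size_per_server):
    """Connections the database must hold if every server keeps a full pool."""
    return app_servers * pool_size_per_server


before = downstream_connections(app_servers=4, pool_size_per_server=20)
after = downstream_connections(app_servers=16, pool_size_per_server=20)
```

Under these assumptions, going from 4 to 16 application servers raises the database's connection count from 80 to 320, even though no single server changed.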

Written on March 30, 2019