Cache

[ ]

There are many caches some of which you may not expect:

Client caching, e.g., browser often has its own cache
CDN
Web server caching, e.g., Reverse proxies and caches such as Varnish can serve static and dynamic content directly.
Database caching
Application caching. In-memory caches such as Memcached and Redis are key-value stores between your application and your data storage.

How to cache

Caching at the database query level

Whenever you query the database, hash the query as a key and store the result to the cache. This approach suffers from expiration issues:

Hard to delete a cached result with complex queries
If one piece of data changes such as a table cell, you need to delete all cached queries that might include the changed cell

Caching at the object level

See your data as an object, similar to what you do with your application code. Have your application assemble the dataset from the database into a class instance or a data structure(s):

Remove the object from cache if its underlying data has changed
Allows for asynchronous processing: workers assemble objects by consuming the latest cached object

Suggestions of what to cache:

User sessions
Fully rendered web pages
Activity streams
User graph data

It can be done in client side or application server.

When to update the cache

Cache-aside

Cache-aside is also referred to as lazy loading. Only requested data is cached. The application does the following:

Look for entry in cache, resulting in a cache miss
Load entry from the database
Add entry to cache
Return entry

Disadvantage:

Each cache miss results in three trips, which can cause a noticeable delay.
Data can become stale if it is updated in the database. This issue is mitigated by setting a time-to-live (TTL) which forces an update of the cache entry, or by using write-through.

Write-through

The application uses the cache as the main data store, reading and writing data to it, while the cache is responsible for reading and writing to the database:

Application adds/updates entry in cache
Cache synchronously writes entry to data store Return

Disadvantage

When a new node is created due to failure or scaling, the new node will not cache entries until the entry is updated in the database. Cache-aside in conjunction with write through can mitigate this issue.
Most data written might never be read, which can be minimized with a TTL.

Write-behind (write-back)

Source: Scalability, availability, stability, patterns

In write-behind, the application does the following:

Add/update entry in cache
Asynchronously write entry to the data store, improving write performance

Disadvantage:

There could be data loss if the cache goes down prior to its contents hitting the data store.
It is more complex to implement write-behind than it is to implement cache-aside or write-through.

Refresh-ahead

You can configure the cache to automatically refresh any recently accessed cache entry prior to its expiration.

Refresh-ahead can result in reduced latency vs read-through if the cache can accurately predict which items are likely to be needed in the future.

Written on March 30, 2019