Cache the web

Posted on February 25, 2022
The cache is storage where web pages keep resources such as stylesheets, script files, and other kinds of data so that later page loads are faster. The quicker response makes the site more efficient and convenient for users.

Default Cache

By default, the browser caches files that it thinks are unlikely to change, based on the Last-Modified header. A file is kept, counted from the request time, for 10% of the time that has passed since it was last modified. This means that a file last modified on the server a month ago (30 days) will be cached on the client for 3 days (10% of 30).

But this behavior can cause a mismatch between the files you have: yesterday's release could be served with the styles from last month. If you are a developer you probably don't notice the problem, because you likely have the cache disabled in your dev tools and get a fresh version every time you load the page.

A general user doesn't do that, so they run into the problem of a “stale cache”. This problem can be fixed by sending Cache-Control headers from your server, telling the browser which files to cache and when to look for new ones.

Cache Control

The Cache-Control HTTP header holds instructions, for both requests and responses, that control caching in the browser and in shared caches.

A Cache-Control header holds directives separated by commas. Depending on the strategy you want to use, there are several options for giving users the most recent and optimized experience.
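To make the comma-separated format concrete, here is a minimal sketch of splitting a header value into its directives. The function name parseCacheControl is made up for this post, and the parser is deliberately simplified: it assumes no directive value contains a comma inside quotes.

```typescript
// Parse a Cache-Control header value into an object of directive -> value.
// Directives without a value (e.g. "no-store") map to `true`.
// Simplification: quoted values that contain commas are not supported.
function parseCacheControl(header: string): Record<string, string | true> {
  const directives: Record<string, string | true> = {};
  for (const part of header.split(",")) {
    const trimmed = part.trim();
    if (trimmed === "") continue;
    const eq = trimmed.indexOf("=");
    if (eq === -1) {
      // Directive names are case-insensitive, so normalize to lower case.
      directives[trimmed.toLowerCase()] = true;
    } else {
      const key = trimmed.slice(0, eq).trim().toLowerCase();
      // Strip surrounding double quotes from the value, if present.
      directives[key] = trimmed.slice(eq + 1).trim().replace(/^"|"$/g, "");
    }
  }
  return directives;
}

const exampleDirectives = parseCacheControl("public, max-age=3600, must-revalidate");
console.log(exampleDirectives["max-age"]); // "3600"
```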

A response you get back from a server has a freshness lifetime that decides when the resource goes stale. As mentioned previously, the browser can derive this freshness lifetime from the Last-Modified header. But the freshness lifetime can also be set with the Cache-Control: max-age directive or the Expires header. If neither of these is set on the response, the browser calculates the freshness from the Date header and the Last-Modified header.

Freshness Lifetime = (Date - Last-Modified) / 10

It then calculates the expiration time, a date in the future at which the cached response becomes stale.

Expiration Time = Response time + Freshness Lifetime - Current age
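As a worked example of the two formulas, the sketch below plugs in the 30-day case from earlier. The function names and the dates are chosen only for illustration, and the current age is assumed to be zero (the response came straight from the server).

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Heuristic freshness lifetime: 10% of the time between Last-Modified and Date.
function heuristicFreshnessLifetime(date: Date, lastModified: Date): number {
  return Math.max(0, (date.getTime() - lastModified.getTime()) / 10);
}

// Expiration time = response time + freshness lifetime - current age (in ms).
function expirationTime(responseTime: Date, freshnessLifetimeMs: number, currentAgeMs: number): Date {
  return new Date(responseTime.getTime() + freshnessLifetimeMs - currentAgeMs);
}

const fileLastModified = new Date("2022-01-26T00:00:00Z");
const responseDate = new Date("2022-02-25T00:00:00Z"); // 30 days later

const lifetimeMs = heuristicFreshnessLifetime(responseDate, fileLastModified);
console.log(lifetimeMs / MS_PER_DAY); // 3

const staleAt = expirationTime(responseDate, lifetimeMs, 0);
console.log(staleAt.toISOString()); // 2022-02-28T00:00:00.000Z
```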

Using the Cache-Control header you can tell the browser how long to cache the content and what to do when the cache goes stale.

  • The max-age directive tells the browser that the response becomes “old” after x seconds. Once that time has passed, the browser asks the server for a fresh version on the next request. The time is counted from when the response was generated, so time the response has already spent in a proxy cache (reported in the Age header) counts toward it.
  • The no-store directive tells the browser not to cache the content, so the user fetches a new and fresh response every time.
  • The no-cache directive allows the content to be cached, but the browser must revalidate it with the server before every reuse.
  • The must-revalidate directive lets the browser reuse the response as long as it is still fresh, and requires it to validate with the server again once the response goes stale.
  • The immutable directive tells the browser that “this” response is not going to change, so it doesn't need to ask for validation before the response goes stale.
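One way to put these directives to work is to pick a header value per kind of resource. The sketch below is only one possible policy, not a recommendation for every site: the function name cacheControlFor, the path patterns (hashed file names, an /account/ prefix), and the chosen max-age values are all assumptions made for the example.

```typescript
// Hypothetical policy: map a request path to a Cache-Control header value.
function cacheControlFor(path: string): string {
  // Personal pages should not be stored at all.
  if (path.indexOf("/account/") === 0) {
    return "no-store";
  }
  // Hashed assets such as "app.3f9a1c2b.js" never change under the same URL,
  // so they can be cached for a year (31536000 seconds) without revalidation.
  if (/\.[0-9a-f]{8,}\.(js|css|woff2|png|svg)$/.test(path)) {
    return "public, max-age=31536000, immutable";
  }
  // HTML documents: keep a copy, but revalidate it before every reuse.
  if (path === "/" || /\.html$/.test(path)) {
    return "no-cache";
  }
  // Everything else: fresh for an hour, then it must be revalidated.
  return "public, max-age=3600, must-revalidate";
}

console.log(cacheControlFor("/assets/app.3f9a1c2b.js")); // public, max-age=31536000, immutable
console.log(cacheControlFor("/index.html")); // no-cache
```

The order of the checks matters here: the /account/ rule comes first so that an HTML page under that prefix is never stored.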

A modern practice is to combine immutable with the cache-busting pattern, where you add a version number or hash to the file name. A changed file then gets a new URL, so the browser fetches the update instead of reusing the old copy. An ETag is a related method: the server tags a resource with a value such as a hash, and the browser sends that value back to check whether its cached copy is still valid.
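Both ideas can be sketched in a few lines. Everything named here is hypothetical: fnv1a is a small non-cryptographic hash used only to keep the example free of dependencies (a real build tool would typically use a stronger content hash), and fingerprint and statusForConditionalRequest are illustrative helpers, not part of any library.

```typescript
// FNV-1a over the string's UTF-16 code units, returned as 8 hex characters.
// Chosen only because it is short; it is not a cryptographic hash.
function fnv1a(content: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < content.length; i++) {
    hash ^= content.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return ("0000000" + hash.toString(16)).slice(-8);
}

// Cache busting: put the content hash before the extension,
// e.g. "style.css" becomes "style.<hash>.css". New content gives a new URL.
function fingerprint(fileName: string, content: string): string {
  const dot = fileName.lastIndexOf(".");
  const hash = fnv1a(content);
  return dot === -1
    ? fileName + "." + hash
    : fileName.slice(0, dot) + "." + hash + fileName.slice(dot);
}

// ETag validation: answer 304 (Not Modified) when the value the browser sent
// in If-None-Match equals the current tag, otherwise 200 with a full response.
function statusForConditionalRequest(ifNoneMatch: string | undefined, content: string): 200 | 304 {
  const etag = '"' + fnv1a(content) + '"';
  return ifNoneMatch === etag ? 304 : 200;
}

console.log(fingerprint("style.css", "body { margin: 0 }"));
```

The difference between the two shows up in the network traffic: with a fingerprinted URL the browser never has to ask, while with an ETag it still makes a request but can skip downloading the body.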

Space for your cache

Browsers have different implementations and allowances of how much you can store on a user's device. But the amount is typically based on the available space on the device.

  • Chrome and Chromium allow the browser to use up to 80% of total disk space, and a single origin to use up to 60% of total disk space.
  • Firefox allows the browser to use up to 50% of free disk space and use up to 2GB per origin.
  • Safari allows for 1GB, but will prompt the users for more storage and increase the limit in 200MB increments.

Storage on the web is split into two categories, “Best Effort” and “Persistent”. Best effort means that the browser can clear the storage when needed, without interrupting the user, which makes it unreliable for long-term data. Persistent storage is only cleared manually by users in the browser settings.

Cache and site data fall into the category of best effort, meaning that over time you may lose the cached files because space is needed on the device, unless you have requested persistent storage.
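Browsers expose this through the StorageManager API on navigator.storage, with estimate(), persisted(), and persist(). Here is a small sketch of checking the available space and requesting persistent storage. The StorageLike interface and the ensureRoom function are names invented for this example; the interface just describes the part of navigator.storage that is used, so the function can also be tried outside a browser.

```typescript
// The subset of the browser's StorageManager (navigator.storage) used below.
interface StorageLike {
  estimate(): Promise<{ usage?: number; quota?: number }>;
  persisted(): Promise<boolean>;
  persist(): Promise<boolean>;
}

async function ensureRoom(
  storage: StorageLike,
  bytesNeeded: number
): Promise<{ hasRoom: boolean; persistent: boolean }> {
  // usage and quota are estimates in bytes and may be missing.
  const { usage = 0, quota = 0 } = await storage.estimate();
  const hasRoom = quota - usage >= bytesNeeded;
  // Only ask for persistent storage if it has not been granted already.
  // The browser is free to refuse, in which case this stays false.
  const persistent = (await storage.persisted()) || (await storage.persist());
  return { hasRoom, persistent };
}

// In a browser:
// ensureRoom(navigator.storage, 50 * 1024 * 1024).then(console.log);
```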

  • Chromium-based and Firefox browsers will begin to evict data when the browser runs out of space, cleaning all site data from least recently used origins until the browser is below the limit again.
  • Safari has implemented a seven-day cap on all writable storage, after which the data can be removed if the user has not interacted with the site.

What files to cache

When figuring out what files to cache it's a good idea to consider what files are important for the user experience. You don't have an infinite amount of storage, so it is only wise to pick the most important parts. It is not relevant to cache files that the user never sees again. Instead, you should cache files that create the core experience.

You might also consider things like website traffic, server response time, and request rate. These metrics are often considered with server-side caching.

Making the core experience work online is one thing, but making it work offline is another. Offline you don't have access to the internet, so you can't fetch new data to update the DOM asynchronously. Caching your app makes it possible to access it offline, since you already have the files locally, and it also makes for a more seamless experience for users with a bad connection.

Accessing files offline is an enhancement of the online experience and makes your application more reliable. It progressively enhances the capabilities of your application, and progressive enhancement is the core feature of a PWA. With caching and a service worker (currently only a requirement for Chrome and Android) you can increase the capabilities of your application and make it installable on desktop and mobile.
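A common way for a service worker to serve files offline is a cache-first strategy: look in the cache, and only go to the network when nothing is stored. The sketch below shows just that decision. CacheLike and cacheFirst are names made up for this example; the interface mirrors the match() and put() methods of the browser's Cache API so the logic can be read and tried on its own.

```typescript
// Minimal shape of a cache, modeled on match() and put() from the Cache API.
interface CacheLike<Req, Res> {
  match(request: Req): Promise<Res | undefined>;
  put(request: Req, response: Res): Promise<void>;
}

// Cache-first: answer from the cache when possible, otherwise fetch from the
// network and store the result for next time (including offline visits).
async function cacheFirst<Req, Res>(
  request: Req,
  cache: CacheLike<Req, Res>,
  fetcher: (request: Req) => Promise<Res>
): Promise<Res> {
  const cached = await cache.match(request);
  if (cached !== undefined) return cached;
  const response = await fetcher(request);
  await cache.put(request, response);
  return response;
}

// In a real service worker the cache would come from caches.open(...), the
// fetcher would be fetch, and you would store response.clone() because a
// Response body can only be read once.
```

Cache-first fits files that rarely change, such as the hashed assets of your core experience. For content that must be up to date, a network-first variant of the same function may be a better fit.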

Conclusion

The cache can be bothersome if you don’t control it. So you should think about what is important for the user's core experience and set headers that fit the resources. Controlling the cache means that you have more control over what your users see, since a stale cache can be a disappointing experience.

Making your app accessible offline can be a benefit to users who have a bad internet connection. It makes the app more reliable, and the app can be enhanced to be installable as a PWA.

Get in touch

Lasse_aakjaer@hotmail.com