Notes on Caching content
Date: 2024-12-17
Category: Misc
In a previous post I explained how to add caching to Django (and in a future post I'll explain how to add it to a webserver like Nginx or Apache). I experimented with caching, did some reading on the matter, and learned a few things.
There are only two hard things in Computer Science: cache invalidation and naming things.
-- Phil Karlton
Layers where a web request can be cached
- The browser: We can tell the browser how long to cache content with Cache-Control headers, but we have no way to invalidate content once the browser has stored it.
- CDNs: We have limited control (Cache-Control headers) and might be able to purge specific or all content when we want to.
- Webserver: Similar to CDNs, controlled by Cache-Control headers, and we are able to purge content if and when we want to.
- Application: We have full control: we can invalidate things and cache only specific parts.
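To make the header side of this concrete, here is a minimal sketch of building a Cache-Control value in plain Python. The helper name `cache_control` and its parameters are my own for illustration and are not part of Django or any webserver. `max-age` applies to every cache, while `s-maxage` overrides it for shared caches such as a CDN or a caching webserver.

```python
def cache_control(max_age, shared_max_age=None, public=True):
    """Build a Cache-Control header value (hypothetical helper).

    max_age: seconds any cache, including the browser, may reuse the response.
    shared_max_age: optional seconds for shared caches (CDN, proxy), sent as s-maxage.
    public: False marks the response as private, i.e. for browser caches only.
    """
    directives = ["public" if public else "private", f"max-age={int(max_age)}"]
    if shared_max_age is not None:
        directives.append(f"s-maxage={int(shared_max_age)}")
    return ", ".join(directives)


# Example: browsers keep the page for 10 minutes, shared caches for an hour.
headers = {"Cache-Control": cache_control(600, shared_max_age=3600)}
```

Whatever value is sent, the browser copy cannot be recalled afterwards, which is why the browser `max-age` is the number to be most careful with.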
So, how long should our TTL be?
This all depends on your use case. I advise keeping it short. It's better to re-render the page every 10 minutes than to have trouble invalidating caches somewhere you can't control, leaving users with badly out-of-date content.
For this blog, I decided to set a TTL of 10 minutes for pages/posts on client caches / downstream caches. I do cache pages and posts for an hour on the server, but I added some code that invalidates the entries whenever I add a new post or page. Even if for some reason this invalidation fails, an hour is not that long for a blog.
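The server-side half of that setup (an hour-long TTL plus invalidation on publish) can be sketched roughly like this. It is not the code this blog runs: `TTLCache`, `render_page` and `on_post_published` are hypothetical names, and the cache is a simple in-memory dictionary rather than Django's cache framework.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            # Expired: drop the entry and report a miss.
            del self._entries[key]
            return None
        return value

    def set(self, key, value):
        self._entries[key] = (self.clock() + self.ttl, value)

    def invalidate(self, key=None):
        """Remove one entry, or everything when no key is given."""
        if key is None:
            self._entries.clear()
        else:
            self._entries.pop(key, None)


page_cache = TTLCache(ttl_seconds=60 * 60)  # one hour on the server


def render_page(slug, render):
    """Return the cached HTML for slug, rendering it only on a miss."""
    html = page_cache.get(slug)
    if html is None:
        html = render(slug)
        page_cache.set(slug, html)
    return html


def on_post_published():
    """Call this after saving a new post or page."""
    page_cache.invalidate()
```

If `on_post_published` is never called for some reason, the TTL still bounds how stale a page can get, which is the safety net described above.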
For the rewrite of Skyz I'm still deciding, but I will probably do something along the lines of this:
- The homepage/forecast pages will be cached for 30 minutes on the server and 15 minutes on the client. The forecasts don't update that often but I do want users to see pretty recent data. The cache is mostly important on the homepage since that page is opened by everyone that visits the site.
- For PWS pages, I cache both the entire page and some of the more expensive queries. The pages themselves will be cached for 5 minutes (most stations push data every 5 minutes).
- Queries for the PWS charts (with the data from the last 24 hours) will be cached for 10 to 15 minutes.
- The historical PWS data for a day more than a week old won't be cached: it isn't accessed frequently enough to justify the memory it would use in the cache. It is acceptable to wait a bit longer for such a page to render rather than take up space that pages benefiting from caching could use.
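Written down as data, the plan above could look something like the sketch below. The key names, the `historical_ttl` helper and the choice to give recent historical days the same TTL as a PWS page are all assumptions of mine for illustration; only the durations and the one-week cutoff come from the list.

```python
from datetime import date, timedelta

MINUTE = 60

# Server-side TTLs in seconds (key names are hypothetical).
SERVER_TTL = {
    "forecast": 30 * MINUTE,
    "pws_page": 5 * MINUTE,
    "pws_chart_24h": 10 * MINUTE,  # lower end of the 10 to 15 minute range
}

# Client-side TTL for the homepage/forecast pages.
CLIENT_TTL = {"forecast": 15 * MINUTE}

HISTORY_CUTOFF = timedelta(days=7)


def historical_ttl(day, today):
    """TTL in seconds for a historical PWS day page, or None to skip caching.

    Days more than a week old are rendered on every request. Reusing the
    PWS page TTL for more recent days is an assumption, not a decided value.
    """
    if today - day > HISTORY_CUTOFF:
        return None
    return SERVER_TTL["pws_page"]
```

Returning `None` for old days keeps the "don't cache this" decision in one place, so the view code only has to check whether it got a TTL back.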
CSS/JS assets can be cached for a longer period, for example a week or even a month, but you probably want to use versioned URLs (this is called cache busting) so that a changed file gets a new URL and is fetched again.
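One common way to version asset URLs is to put a hash of the file's content in its name, so the URL changes exactly when the content does. A minimal sketch, assuming the hypothetical helper name `versioned_name` and an 8-character SHA-256 prefix (both arbitrary choices for this example):

```python
import hashlib
from pathlib import PurePosixPath


def versioned_name(filename, content):
    """Insert a short content hash before the file extension.

    filename: asset path using forward slashes, e.g. "css/site.css".
    content: the file's bytes; any change produces a different name.
    """
    digest = hashlib.sha256(content).hexdigest()[:8]
    path = PurePosixPath(filename)
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))


# The templates would then link to this name instead of "css/site.css".
css_url = versioned_name("css/site.css", b"body { margin: 0; }")
```

Because an old URL is never reused for new content, the long TTL is safe: browsers that still hold the old file simply never ask for it again.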