Vincent's Weblog

Scaling Skyz for 2K+ users on a budget

I maintain a weather platform called Skyz as a hobby, which means funds are pretty limited. It has grown quite a bit over the almost six years I've been working on it. I never expected it to grow as much as it did, but I'm happy it did: it's a fun project to work on and I've learned a lot from it. In this post I'll explain how I deal with scaling it on a budget.

The platform

Skyz is mostly a weather dashboard that lets users view forecasts for their location of choice in a clear, customizable way. It also has a few other features, such as a weather map and a weather station map. I try to source as much data as close to the source as possible, since that's part of the fun for me: I fetch forecast data directly from various weather models, weather station data comes from the users directly, lightning data comes from a network of about 30 lightning detectors housed at friends' houses, and so on.

These features mean I can't just throw it on a shared hosting plan and call it a day. I not only need more power than limited shared hosting plans offer, I also need more control over the server: I process a lot of data for faster, more efficient retrieval, and I need to be able to install custom software to fetch the data I need.

Architecture

The platform is split into two parts: webservers and compute servers. The webservers are responsible for serving the website and API, while the compute servers are responsible for fetching and processing the data.

The webservers are pretty lightweight. On a calm day one webserver can handle the load, but when there is an active storm, or a storm in the forecast, the load can spike and I spin up a second or even third webserver. These are hosted at Hetzner, and when I run multiple instances I use a load balancer to distribute the traffic.
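To give an idea of what distributing traffic over the extra webservers looks like, here is a minimal nginx-style upstream config. This is purely an illustration: I'm not saying Skyz uses nginx specifically, and the hostnames are placeholders.

```nginx
# Hypothetical load-balancer config; server names are made up.
upstream skyz_web {
    least_conn;               # send each request to the least busy webserver
    server web1.example.com;
    server web2.example.com;  # brought online when a storm spikes traffic
}

server {
    listen 80;
    location / {
        proxy_pass http://skyz_web;
    }
}
```

The same idea works with any reverse proxy or a managed load balancer: the extra webservers are added to the pool when traffic spikes and removed again afterwards.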

Each webserver has 8 cores and 16 GB of RAM.

The compute servers are more complex: they are responsible for fetching and processing the data. I currently have two such systems:

  • An Intel NUC with 4c/8t, 32 GB of RAM and 512 GB + 1 TB of storage running at home, as a sort of backup/staging area. This server processes a more limited set of data, so that if the primary compute server fails, this one keeps the platform running, albeit with slightly less accurate data.

  • A large VPS sponsored by Pixelhosting, with 8 vCPUs, 32 GB of RAM and 200 GB of storage. This server processes the bulk of the forecast data. After the data is processed, some statistical analysis combines model data and observations to produce a more accurate forecast (picking the most accurate model for the current conditions, for example).
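The model-picking step can be sketched roughly like this. To be clear, this is my illustration of the general technique, not Skyz's actual code: the model names, data shapes, and the choice of RMSE as the error metric are all assumptions.

```python
# Hypothetical sketch: pick the forecast model with the lowest recent error
# against station observations. Names and numbers are placeholders.

def rmse(predictions, observations):
    """Root-mean-square error between forecast values and observed values."""
    n = len(predictions)
    return (sum((p - o) ** 2 for p, o in zip(predictions, observations)) / n) ** 0.5

def pick_best_model(recent_forecasts, observations):
    """Return the name of the model whose recent forecasts best matched reality."""
    scores = {
        model: rmse(preds, observations)
        for model, preds in recent_forecasts.items()
    }
    return min(scores, key=scores.get)

# Example: two models' recent temperature forecasts vs. what stations measured.
forecasts = {
    "GFS":  [12.1, 13.0, 14.2],
    "ICON": [11.0, 12.2, 15.8],
}
observed = [12.3, 13.1, 14.0]
print(pick_best_model(forecasts, observed))  # -> GFS
```

In practice you'd score each model per region and per weather regime, but the core idea is the same: keep a rolling error score per model and prefer whichever has tracked the observations best lately.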

Costs

Thanks to the sponsorship, my costs are pretty low. The NUC is my old desktop system, and unless there is a spike in visitors, a single 8-core/16 GB webserver is enough for most days. I could probably get by with a smaller webserver if I spent some time optimizing the website code (improving caching, for example), but I'm happy with the current setup.
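The caching I have in mind is simple: forecasts only change when a model run finishes, so repeated requests for the same location don't need to hit the database every time. A minimal TTL cache sketch (function and variable names are my own illustration, not Skyz's code):

```python
# Hypothetical sketch: a tiny TTL cache in front of forecast lookups.
import time

class TTLCache:
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.store = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self.store.get(key)
        if entry and entry[0] > time.monotonic():
            return entry[1]
        return None  # missing or expired

    def set(self, key, value):
        self.store[key] = (time.monotonic() + self.ttl, value)

cache = TTLCache(ttl_seconds=300)  # forecasts change slowly; 5 minutes is fine
calls = {"db": 0}

def fetch_forecast_from_db(location):
    calls["db"] += 1  # counter so the example can show the cache working
    return {"location": location, "temp_c": 14.2}  # stand-in for a real query

def get_forecast(location):
    cached = cache.get(location)
    if cached is not None:
        return cached
    forecast = fetch_forecast_from_db(location)
    cache.set(location, forecast)
    return forecast
```

With something like this in front of the hot endpoints, two requests for the same location within the TTL cost only one database query, which is exactly the kind of optimization that could let a smaller webserver cope.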

All in all I pay about 20 euros a month to keep this project running, which is pretty good for a project that serves over 2,000 users a day.

I have a Ko-fi page in case people want to help support the project, but it's listed at the bottom of the website and I don't actively promote it, since I don't like to beg for money.

Conclusion

I'm happy with the current setup: it's a fun project to work on, and I'm glad I can keep it running on a budget. I'm always looking for ways to improve the platform, so if you have any suggestions, feel free to reach out.