September 2, 2020
Logging and monitoring misadventures.
Buttondown has been slow and flaky recently. I feel pretty confident that this is due to increasing volume: the ‘odd’ HN hug has become a daily occurrence, and the load coming from Mailgun/SES, while spiky, scales linearly with traffic. This is manifesting in painful ways: spates of timeouts and dropped connections.
Previously, this has been a problem I’ve scaled my way out of: Buttondown sits on a half-dozen Heroku boxes, and I bump that up to a dozen until things stop complaining. That steady state has shifted, though: instead of going from 6 to 12, I’m finding myself going from 12 to 24, and even then there’s a lot of pain.
This is not uncommon, and there’s a playbook for this: