July 25, 2022
A brief love letter to Logtail
I'm keeping this somewhat brief because I want to expand my thoughts into a longer and more cogent blog post, but I finally got myself into a place where I feel good about Buttondown's logging. This was a combination of a few separate things:
- I finished onboarding to structlog, which makes every logline structured (JSON in production, but plain console output locally, since JSON is very annoying to parse visually)
- I found a log sink that handles my use cases — Logtail, which lets me write SQL against my logs.
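To make the "structured logline" idea concrete, here's a minimal stdlib-only sketch of what a JSON-rendered logline looks like. The `render_json` helper and its field names are hypothetical stand-ins; the post itself uses structlog, which does this (and much more) via its processor pipeline.

```python
import json
import logging
import sys

def render_json(event: str, **fields) -> str:
    """Render one structured logline as JSON, roughly what a
    JSON renderer does in production: one event name plus
    arbitrary key-value context."""
    return json.dumps({"event": event, **fields}, default=str)

logging.basicConfig(stream=sys.stdout, format="%(message)s", level=logging.INFO)
log = logging.getLogger("buttondown")
log.info(render_json("job.finished", seconds=1.25, queue="default"))
```

Because every line is a self-describing JSON object, a downstream sink can index the keys and let you query them, which is exactly what makes the SQL-over-logs workflow below possible.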
I've put off caring about logs for a while, since there aren't that many times when I've thought to myself "boy, I wish I had better logs." Most of Buttondown's shenanigans that require deep introspection (missing subscribers, odd rendering quirks, and so on) can be handled at the database level.
Here's one that couldn't be, though: Buttondown's Redis woes. I complained about this a few weeks ago: Buttondown spends a lot (well, $200/month) on a hosted Redis instance that occasionally slams into its storage limit during extreme, short-lived spikes. I had two goals:
- Downsize the Redis instance, since the vast majority of the time it is dramatically underutilized.
- Diagnose the spooky spikes.
This is now trivial, by logging every time I perform a job in Redis:
logger.info(
    "job.finished",
    seconds=(end - start),
    method=self.func_name,
    data_size=len(self._data),
    queue=self.origin,
)
And writing an SQL query that aggregates these jobs:
SELECT
    message.method AS method,
    SUM(message.data_size) AS total_data
FROM $table
GROUP BY method
ORDER BY total_data DESC
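Logtail runs that query against the ingested loglines; you can approximate the same aggregation locally with sqlite3. The table shape and job names here are hypothetical stand-ins, assuming the structured loglines have already been parsed into rows:

```python
import sqlite3

# In-memory table of parsed "job.finished" loglines (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (method TEXT, data_size INTEGER)")
conn.executemany(
    "INSERT INTO logs VALUES (?, ?)",
    [("send_email", 120), ("send_email", 80), ("sync_subscriber", 50_000)],
)

# Same aggregation as the Logtail query: total payload size per job.
rows = conn.execute(
    """
    SELECT method, SUM(data_size) AS total_data
    FROM logs
    GROUP BY method
    ORDER BY total_data DESC
    """
).fetchall()
print(rows)  # the heaviest job floats to the top
```

The `ORDER BY total_data DESC` is the useful part: the worst offender is the first row you see.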
Just like that, I get a view of what's clogging up the works: I've got three asynchronous jobs whose arguments aren't svelte UUIDs but fully hydrated Django models, which are extremely expensive to serialize into Redis. I reworked them; the spikes are gone, and I'm saving $100/month on Redis. Yay!
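The rework boils down to shrinking job payloads: enqueue an ID, not the object. A hedged sketch of the size difference, where `Subscriber` is a hypothetical stand-in for a hydrated Django model and pickle stands in for whatever serializer the queue actually uses:

```python
import pickle
import uuid
from dataclasses import dataclass, field

# Hypothetical stand-in for a hydrated Django model instance.
@dataclass
class Subscriber:
    id: str
    email: str
    metadata: dict = field(default_factory=dict)

sub = Subscriber(
    id=str(uuid.uuid4()),
    email="reader@example.com",
    metadata={"notes": "x" * 4096},  # hydrated models drag their data along
)

hydrated = pickle.dumps(sub)   # what the old jobs pushed into Redis
slim = pickle.dumps(sub.id)    # the reworked payload: just the UUID

print(f"hydrated: {len(hydrated)} bytes, slim: {len(slim)} bytes")
```

The job then re-fetches the model by ID when it runs, trading one cheap database read for kilobytes of Redis storage per enqueue.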
This is not, I cannot emphasize enough, incredibly novel stuff. But there's a surprising dearth of "here's how to do end-to-end logging for Django" content, to the extent that I am probably going to take it upon myself to write a pretty extensive post.
Speaking of technical content marketing
While I'm pretty happy with the increased frequency of my feature blog posts as of late, my technical blog posts have not been faring so well. I spent something like five hours on migrating Buttondown to mypy, and that essay has around eight hundred lifetime views, per Fathom. This certainly could be worse, but I haven't yet worked out the napkin math that says spending a full day on a technical blog post is worth prioritizing over, say, spending an hour on a quick feature page. (This is the part where my fiancé, a marketer, says "no, the whole point is that you do both.")
That being said, I'm not doing myself any favors with how I publicize these posts — which is to say, not at all, besides some casual tweets. I'm going to set myself a bit of a formal goal to drive pageviews here, to remind myself that the point is to actually drive traffic to the blog, not just to say "hey, I wrote something this month."