April 20, 2020
I spent this weekend refactoring how Buttondown handles archival imports!
First, what are archival imports? If you bring your newsletter over from Mailchimp or Tinyletter (or, now, Substack, but we’ll get to that later), I have some janky logic to import your archives. This was, prior to this weekend, done in the span of a single HTTP request, which was, uh, non-ideal: if you had a large (or even a not-so-large) archive, your import would probably time out and/or silently fail, leading to a customer email, leading to me importing your archives manually, which was pretty easy but definitely frictional.
A good rule of thumb I learned a number of years ago is to keep the number of “things” in an HTTP request as low as possible: “things” are database calls, sure, and third-party requests, and file reads, and pretty much anything else. HTTP requests should be granular, instant, and fail very rarely; all the important stuff associated with them should happen after the fact. My archival imports broke this rule with wild abandon.
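To make that concrete, here’s a caricature of the old shape (not Buttondown’s actual code; `fetch_tinyletter_archive`, `parse_entries`, and `Email` are hypothetical stand-ins): every “thing” happens inside the request/response cycle.

```python
from django.http import JsonResponse

def import_tinyletter_archives(request):
    # Hypothetical sketch of the old, request-bound shape; the helper
    # names and model are made up for illustration.
    archive = fetch_tinyletter_archive(request.POST["username"])  # third-party request
    for entry in parse_entries(archive):                          # file read + parse
        Email.objects.create(**entry)                             # one DB write per email
    # With a big enough archive, the request times out (or silently
    # fails) long before this line.
    return JsonResponse({"status": "done"})
```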
Then, on an unrelated note, I finally decided to get my Substack archive importer production-ready. It existed as a script I could run whenever I wanted, but hadn’t been exposed to end users yet — largely because of:
- laziness;
- lack of demand.
The latter has been invalidated over the past few weeks (another good rule of thumb: if you have to run the same script more than five times in the span of a week, consider exposing it to self-service), so it was time to work on the former!
To dispel the former, then, I decided to roll it into an overall refactor of how I was handling imports. Instead of having separate endpoints like `POST /api/import-tinyletter-archives` and `POST /api/import-substack-archives`, I now have an omnibus endpoint, `POST /api/import-archives`, that takes a username and a source. All this endpoint does is create a dummy `ArchiveImport` object to let the system know there’s an import ready to be kicked off: the kicking off and the actual pulling in of the archives happen asynchronously, and I get generic logic for things like success/failure notifications and secondary ID normalization and all of the things I have to deal with regardless of what source I’m pulling the archives from.
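Here’s a minimal sketch of that shape, assuming Django and Celery (the `IMPORTERS` registry, the model field names, and the `notify_of_*` helpers are illustrative, not Buttondown’s actual code):

```python
from celery import shared_task
from django.http import JsonResponse
from django.views.decorators.http import require_POST

from .models import ArchiveImport

SOURCES = ("mailchimp", "tinyletter", "substack")

@require_POST
def import_archives(request):
    # The endpoint does as little as possible: validate, create the
    # placeholder object, enqueue the real work, and return immediately.
    username = request.POST.get("username")
    source = request.POST.get("source")
    if not username or source not in SOURCES:
        return JsonResponse({"error": "invalid username or source"}, status=400)
    archive_import = ArchiveImport.objects.create(username=username, source=source)
    run_archive_import.delay(archive_import.id)
    return JsonResponse({"id": archive_import.id}, status=202)

@shared_task
def run_archive_import(archive_import_id):
    # The heavy lifting happens out-of-band. IMPORTERS is a hypothetical
    # mapping from source name to a per-source importer function, e.g.
    # {"substack": import_substack_archives, ...}.
    archive_import = ArchiveImport.objects.get(id=archive_import_id)
    try:
        IMPORTERS[archive_import.source](archive_import)
        notify_of_success(archive_import)  # generic, source-agnostic notifications
    except Exception:
        notify_of_failure(archive_import)
        raise
```

The nice part is that everything after the `ArchiveImport` row is created is source-agnostic, which is exactly where the generic notification and normalization logic gets to live.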
Let me be explicit: this is the obvious architecture, and the one that has been built thousands of times across thousands of software applications. There is nothing novel or interesting about this approach; it is the sensible thing to do that anyone with a modicum of development expertise would do.
But a lot of Buttondown wasn’t developed as an outcome of sensible thinking and deliberation, unfortunately. A lot of detritus is scattered across my sitemap, and it feels good to revisit those corners and turn them into something slightly more respectable.
This is work that takes the place of building marketing pages or working on internationalization or fixing validation bugs, which is a bummer; but it is work that should be done, even if the rewards only come ready to harvest eighteen months from now. There’s nothing wrong with skipping the right abstraction early on if it meant (and it did) that I could make the experience good (if not great) really quickly; but there always comes a time when I need to rebuild things the right way, and it feels good to do so.