How to Overengineer Your Blog

This is going to be a technical look at how I built this site and blog. Effectively, it is the byproduct of random evenings spent exploring ideas related to what I’m tackling at work. It just so happens that my work revolves around static-heavy site generation and content management. Isn’t my own blog a nice sandbox for exploring approaches to that problem?

The starting problem was this: at work, there are a ton of repos containing Markdown files that we stitch together to build a docs site. These md files may contain user guides for the products in that repo, or together comprise an API reference, or some crazy combination of reference, explanation, guide, all combined into a single doc.

Over time, the docs site required more and more patchwork to keep using the existing tooling while still rendering the docs in sensible, friendly ways. We were using MkDocs, controlled by an internal solution written in Erlang that listened for changes to upstream sources and rebuilt the site accordingly.

Another problem I saw in my org was that certain non-technical product owners wanted to contribute useful documentation to docs.2600hz but had difficulty adopting a Git workflow. They were stashing documentation in Confluence and then relying on engineers to move that content into an appropriate Git repo and keep it updated.

Around this time I had checked out a couple of YouTube videos on Astro, which had piqued my interest on several levels. At first, I was interested in making a personal blog with it, but once I understood a bit more of the feature set, I thought “OMG, maybe this is the next-gen solution I’m looking for.”

Some of the initial points that I found interesting when compared to the already-popular Next.js were:

Built-in content collection functionality
Frontmatter that is easy to integrate and use programmatically to organize and render content
A toolchain with included packages like remark and rehype, which make any additional content transformation at build time easy

So, with the idea that I wanted to expand the upstream content to include not only 2600Hz Git repos, but possibly some other to-be-determined content source, and then put them into an Astro project to test content rendering, I had a basic idea going.

Just a little bit prior to this, I’d checked out Notion as a way to store notes in nice organized hierarchies. And then I started using the editor, and wow, cool, these components that you can drag and drop and reconfigure turned out to be really convenient to write in. And basic tabular data, too? Suddenly I wanted to start using Notion to write more stuff.

I could write blogs in a Notion database, and then export said database as a series of Markdown files, where I’m free to use them as an Astro content collection. Hey, wait, if this scheme can scale, maybe people at work can use Notion to rapidly develop documentation for work in the same way!

So, welcome to that daydream come to life. The basic architecture of my blog is as described above: the Astro project starts content-less, and a pre-build step launches a script that fetches upstream content from a configured Notion database. As it fetches each row, it constructs a Markdown file whose frontmatter is all of the row’s available tabular metadata (plus any additional à la minute metadata) and whose body is the Markdown content found in the row’s child page. At the end of this process, it saves all imported docs to disk under the project’s special src/content directory, which Astro uses to construct content collections.
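To make the "row becomes a Markdown file" step concrete, here is a minimal sketch of how one imported row could be serialized. It is not the actual importer code: the `ImportedRow` shape, the function names, and the `blog` collection name are all assumptions for illustration, and writing the result to disk is left out.

```typescript
// Hypothetical shape of one imported database row (field names are assumptions).
type FrontmatterValue = string | number | boolean | string[];

interface ImportedRow {
  slug: string; // used as the file name
  metadata: Record<string, FrontmatterValue>; // tabular columns from the database
  markdown: string; // body converted from the row's child page
}

// Serialize a single value for YAML frontmatter. JSON string and array
// literals are valid YAML, so JSON.stringify handles quoting and escaping.
function toYamlValue(value: FrontmatterValue): string {
  return typeof value === "string" || Array.isArray(value)
    ? JSON.stringify(value)
    : String(value);
}

// Build the full file contents: frontmatter fence, metadata, body.
// Assumes metadata keys are plain identifiers that need no quoting.
function renderMarkdownFile(row: ImportedRow): string {
  const lines = Object.entries(row.metadata).map(
    ([key, value]) => `${key}: ${toYamlValue(value)}`
  );
  return ["---", ...lines, "---", "", row.markdown.trim(), ""].join("\n");
}

// Where the file would land so Astro picks it up as part of a collection.
function targetPath(row: ImportedRow, collection: string = "blog"): string {
  return `src/content/${collection}/${row.slug}.md`;
}
```

Quoting every string through `JSON.stringify` is a deliberately blunt choice: titles containing colons or quotes stay valid frontmatter without pulling in a YAML library.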

Really the only major innovation over the most basic of Astro use cases is the implementation of the Notion importer, which leans heavily on the open source notion-to-md project to import blocks as Markdown strings. The package also comes with a really nice way to transform specific Notion blocks, for example giving me the ability to store and cache images as they are parsed.
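As an illustration of the image-caching idea, here is a small sketch of the path-mapping half of such a transform. The block shape is a trimmed-down subset of a Notion image block, and the function names and the `/images/notion` directory are assumptions rather than what this site actually uses. Downloading the file and wiring the function into notion-to-md (which documents a `setCustomTransformer` hook for per-block overrides) are left out.

```typescript
// Minimal subset of a Notion image block; only the fields used here.
interface ImageBlock {
  id: string;
  type: "image";
  image:
    | { type: "file"; file: { url: string } }
    | { type: "external"; external: { url: string } };
}

// Map an image block to a stable local path (hypothetical directory layout).
// Notion-hosted file URLs are signed and expire, so the cache key is the
// block id rather than the URL itself.
function localImagePath(
  block: ImageBlock,
  publicDir: string = "/images/notion"
): string {
  const url =
    block.image.type === "file"
      ? block.image.file.url
      : block.image.external.url;
  // Drop the query string and fragment before looking for an extension.
  const base = url.split(/[?#]/)[0] ?? url;
  const fileName = base.split("/").pop() ?? "";
  const match = fileName.match(/\.[a-z0-9]+$/i);
  const ext = match ? match[0].toLowerCase() : "";
  return `${publicDir}/${block.id}${ext}`;
}

// Emit Markdown that points at the cached copy instead of the remote URL.
function imageToMarkdown(block: ImageBlock, alt: string = ""): string {
  return `![${alt}](${localImagePath(block)})`;
}
```

Keying the cache on the block id means an image that has already been downloaded can be skipped on later builds, even though its remote URL will have changed.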

There’s one more piece that came later, which is that the importer stores the fetched metadata and content in a Strapi instance. I added this piece to learn more about Strapi, but also to stop a naive fetch of every Notion page on every build.

So really, the fetch script that builds the Astro content collection first goes to Strapi, checks for existing blog metadata and content, and compares that metadata against Notion to determine whether any blogs were added, deleted, or edited. Then, it updates Strapi with the new data. This is still a very rough and naive script, but it does the job at a small scale and allows me to work completely in Notion while letting me customize the content rendering any way I see fit.
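The added/deleted/edited check can be sketched as a plain diff between two metadata lists. This is an assumption-laden illustration, not the real script: `PostMeta`, `SyncPlan`, and `planSync` are hypothetical names, and it assumes each cached entry stores the Notion page id along with the page's last-edited timestamp as an ISO 8601 string.

```typescript
// Hypothetical metadata kept per post, both in the cache and from upstream.
interface PostMeta {
  id: string; // Notion page id
  lastEditedTime: string; // ISO 8601 timestamp
}

interface SyncPlan {
  added: string[]; // upstream only: fetch and create in the cache
  edited: string[]; // in both, but newer upstream: re-fetch content
  deleted: string[]; // cache only: remove from the cache
}

// Compare cached metadata with fresh upstream metadata and decide
// which pages actually need work on this build.
function planSync(cached: PostMeta[], upstream: PostMeta[]): SyncPlan {
  const cachedById = new Map<string, PostMeta>();
  for (const post of cached) cachedById.set(post.id, post);
  const upstreamIds = new Set<string>(upstream.map((post) => post.id));

  const plan: SyncPlan = { added: [], edited: [], deleted: [] };
  for (const post of upstream) {
    const known = cachedById.get(post.id);
    if (!known) {
      plan.added.push(post.id);
    } else if (
      Date.parse(post.lastEditedTime) > Date.parse(known.lastEditedTime)
    ) {
      plan.edited.push(post.id);
    }
  }
  for (const post of cached) {
    if (!upstreamIds.has(post.id)) plan.deleted.push(post.id);
  }
  return plan;
}
```

Only the ids in `added` and `edited` need a full content fetch, which is what avoids re-importing every page on every build.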

I really did learn a lot by building this system, and I really like Astro’s ease of use. The actual code I had to write to power this site is pretty minimal (dependencies notwithstanding…), and the main advantage of using Notion is that I can write blogs from my phone, too. The main disadvantage, of course, is that I could run into content ownership issues in the future.