Engineering 6 min read

Wrangling a blog with Astro Content Collections

How I set up a typed, validated blog pipeline in Astro — indexing, tags, categories, and the kind of schema errors that actually help instead of yell.

I wanted a blog on this site, but I didn’t want a CMS, a database, or a second service to pay for. I wanted to write in Markdown, ship it with the rest of the site, and have the build yell at me if I forgot a field. Astro’s Content Collections turned out to be exactly that — a typed, validated, zero-runtime way to treat a folder of .mdx files like a real data source.

01. What Content Collections actually are

A collection is a set of content files with a schema attached to it; in my setup that’s a folder inside src/content/. You describe the shape of your frontmatter with Zod, and Astro gives you back a fully typed API for querying those files at build time. No fetch calls, no client-side JavaScript to load the content, no runtime parsing: on a static build, everything gets resolved when the site builds.

The mental model is simple: the folder is your “table,” each .md or .mdx file is a “row,” and the schema is your migration. If a post is missing a field or has a typo in a category name, the build fails with a readable error instead of shipping a broken page. That alone sold me on it.

Why not just glob the folder?

You can. I did, on a previous project. But once you’ve got more than five or six posts, you end up wanting: sorted indexes, category filters, tag pages, a type-safe way to link between posts, and frontmatter that doesn’t silently drift. Rolling that by hand is fine until the first time you ship a post with catagory: "notes" and wonder why the filter broke.

02. Getting started

If you’re on a fresh Astro project, Content Collections are built in, so plain Markdown needs no extra package. (If you want MDX like I’m using here, install @astrojs/mdx and add it to the integrations array in your Astro config; running npx astro add mdx does both.) From there, you just need a config file at the right path.

# Create the collection directory and the config file
mkdir -p src/content/blog
touch src/content.config.ts
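
For the MDX case mentioned above, the integration is registered in the Astro config file, which is separate from the content config just created. A minimal sketch, assuming the config lives at astro.config.ts (astro.config.mjs works the same way) and that @astrojs/mdx is already installed:

```typescript
// astro.config.ts (assumed filename; a sketch, not this site's actual config)
import { defineConfig } from "astro/config";
import mdx from "@astrojs/mdx";

export default defineConfig({
  // Registers MDX so .mdx files in the collection can be rendered.
  integrations: [mdx()],
});
```

The collection schema still goes in src/content.config.ts, not in this file.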

Then define the schema. Mine looks roughly like this:

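import { defineCollection, z } from "astro:content";
import { glob } from "astro/loaders";

const blog = defineCollection({
  // Load every .md and .mdx file under src/content/blog, including subfolders.
  loader: glob({ pattern: "**/*.{md,mdx}", base: "./src/content/blog" }),
  // Frontmatter shape; the build fails if a post doesn't match it.
  schema: z.object({
    title: z.string(),
    excerpt: z.string(),
    // Only these exact strings are accepted.
    category: z.enum(["design", "process", "systems", "engineering", "notes"]),
    // Coerces the frontmatter value (for example a date string) into a Date.
    date: z.coerce.date(),
    // Read time: a positive integer.
    read: z.number().int().positive(),
    // Optional in frontmatter; these fall back to [] and false.
    tags: z.array(z.string()).default([]),
    featured: z.boolean().default(false),
  }),
});

// Astro picks up collections from this named export.
export const collections = { blog };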

The z.enum on category is the part I care about most. It means “notes” is valid, “note” is not, and I find out at build time instead of at 2 a.m. when I notice the filter chip is empty.
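
To make that concrete, here is a dependency-free TypeScript sketch of the kind of check the enum performs. It illustrates the idea and is not Zod’s implementation; isCategory and parseCategory are hypothetical names, and the error text is made up for the example.

```typescript
// Illustration only: a hand-rolled stand-in for what z.enum enforces.
const CATEGORIES = ["design", "process", "systems", "engineering", "notes"] as const;

// Union type derived from the list: "design" | "process" | ... | "notes".
type Category = (typeof CATEGORIES)[number];

// Type guard: narrows an unknown frontmatter value to the Category union.
function isCategory(value: unknown): value is Category {
  return (
    typeof value === "string" &&
    (CATEGORIES as readonly string[]).indexOf(value) !== -1
  );
}

// Hypothetical helper: returns the category or throws a readable error.
function parseCategory(value: unknown): Category {
  if (!isCategory(value)) {
    throw new Error(
      `Invalid category ${JSON.stringify(value)}. Expected one of: ${CATEGORIES.join(", ")}`
    );
  }
  return value;
}
```

Calling parseCategory("notes") returns the value unchanged, while parseCategory("note") throws. With the real schema, Zod does this for every field and Astro reports the failure during the build.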

03. The index page

Once the schema is in place, building the index is almost boring — which is the compliment I mean it to be.

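import { getCollection } from "astro:content";

// Load every entry in the "blog" collection, newest first.
const posts = (await getCollection("blog"))
  .sort((a, b) => b.data.date.valueOf() - a.data.date.valueOf());

// Unique categories that at least one post actually uses.
const categories = [...new Set(posts.map((p) => p.data.category))];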

That’s the whole backbone. posts is fully typed — hovering over p.data.category in VS Code shows me the exact union of allowed values. From there it’s just rendering: a hero, category filter chips, a grid of cards, and an archive grouped by year. No API, no loading states, no hydration. It’s static HTML with a sprinkle of client-side filter logic for the chips.
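
The archive grouped by year is a small transformation over that sorted list. A minimal sketch of one way to do it, assuming a pared-down PostSummary shape in place of the real collection entry type (PostSummary, YearGroup, and groupByYear are hypothetical names, not part of Astro):

```typescript
// Hypothetical minimal shape; real entries from getCollection carry more fields.
interface PostSummary {
  id: string;
  data: { title: string; date: Date; category: string };
}

interface YearGroup {
  year: number;
  posts: PostSummary[];
}

// Group posts by calendar year: newest year first, newest post first within a year.
function groupByYear(posts: PostSummary[]): YearGroup[] {
  // Copy before sorting so the caller's array is left untouched.
  const sorted = posts
    .slice()
    .sort((a, b) => b.data.date.valueOf() - a.data.date.valueOf());

  const groups: YearGroup[] = [];
  for (const post of sorted) {
    // UTC year: date-only strings such as "2024-01-01" parse as UTC midnight.
    const year = post.data.date.getUTCFullYear();
    const last = groups[groups.length - 1];
    if (last && last.year === year) {
      // Same year as the previous post, so extend the current group.
      last.posts.push(post);
    } else {
      groups.push({ year, posts: [post] });
    }
  }
  return groups;
}
```

In the page frontmatter this would be called as groupByYear(posts), with each group rendered under its own year heading.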

“Astro’s content layer turned my blog from a folder I was scared to touch into a data source I actually enjoy querying.” — Me, after the third rebuild worked first try

04. The post page itself

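Individual posts live at src/pages/blog/[slug].astro and use a dynamic route. In that file’s frontmatter, the exported getStaticPaths function walks the collection, returns one path per entry, and passes each entry to the page as a prop, where it is rendered into a component for the layout. That works as long as posts sit directly in src/content/blog; a post in a subfolder gets an id containing a slash, which needs a rest route like [...slug].astro instead.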

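import { getCollection, render } from "astro:content";

// Runs at build time: each returned object becomes one generated page.
export async function getStaticPaths() {
  const posts = await getCollection("blog");
  return posts.map((post) => ({
    params: { slug: post.id }, // fills the [slug] segment of the URL
    props: { post },           // available on the page as Astro.props
  }));
}

const { post } = Astro.props;
// render() turns the entry body into a component.
const { Content } = await render(post);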

Then <Content /> renders the MDX body through whatever prose styles the layout applies. Because the body is MDX, I can drop components into a post when I need to — a custom callout, a figure with a caption, an interactive demo — without leaving the content file.

05. What this unlocked

The honest win is that writing feels frictionless again. I open a new .mdx file, fill in the frontmatter (with autocomplete, thanks to the generated types), write the post, and commit. The index, the category pages, the archive, the featured flag — all of it updates from that one file.

The less obvious win is confidence. I know the shape of every post on the site. I know the build will fail before a broken one ships. And when I eventually want to add something — reading time estimation, related posts, an RSS feed — I already have a clean, typed dataset to build on top of.
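
As one example of building on that dataset, related posts could be derived from the tags field already in the schema. A rough sketch under stated assumptions: TaggedPost is a hypothetical cut-down entry shape, relatedPosts is a made-up helper, and “related” is taken to mean “shares the most tags,” which is only one possible definition.

```typescript
// Hypothetical minimal shape standing in for a blog collection entry.
interface TaggedPost {
  id: string;
  data: { title: string; tags: string[] };
}

// Rank other posts by how many tags they share with `current`.
// Posts with no shared tags are dropped.
function relatedPosts(
  current: TaggedPost,
  all: TaggedPost[],
  limit = 3
): TaggedPost[] {
  const scored: { post: TaggedPost; score: number }[] = [];
  for (const post of all) {
    if (post.id === current.id) continue; // never recommend the post itself
    const score = post.data.tags.filter(
      (tag) => current.data.tags.indexOf(tag) !== -1
    ).length;
    if (score > 0) scored.push({ post, score });
  }
  // Highest overlap first, then keep only the top `limit` posts.
  scored.sort((a, b) => b.score - a.score);
  return scored.slice(0, limit).map((entry) => entry.post);
}
```

Because tags defaults to an empty array in the schema, a post without tags simply gets no related posts rather than causing an error.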


06. Closing

If you’re building anything content-heavy in Astro and you’ve been putting off structuring it, stop putting it off. Content Collections are the rare “correct default” that also happens to be pleasant to use. Start with a loose schema, tighten it as you learn what your content actually looks like, and let the type system carry the boring parts.

Next post I’ll probably get into how the prose styling and Shiki themes are wired up, because that was its own small rabbit hole.