Up next on markdown Z

A walkthrough of parsing markdown with a Zig library

June 11, 2026

Parsing markdown is tricky. It is mostly about recognizing line-based block containers first, then interpreting character-level inline syntax second. This makes markdown parsers less like typical programming-language parsers and more like layered document parsers.

In contrast, a programming language parser is about turning a flat token stream into a nested syntax tree according to a grammar.

The CommonMark spec explicitly describes a two-phase parsing model:¹

1. first determine the document’s block structure, then
2. parse inline text inside paragraphs, headings, and other blocks.

Link reference definitions are collected during the block phase and used later during inline parsing.

MDZ

As a library, markdown z parses to an AST first, then exposes that same tree through an event iterator or the HTML renderer, depending on your use case.

markdownz

A commonmark compliant markdown parser and event api

https://codeberg.org/desertthunder/markdownz

Parsing Pipeline

At a glance, markup goes through this pipeline:

from source bytes...

1. block parser splits source into Line records
2. block parser builds block nodes with raw text children
3. inline parser replaces raw text with inline nodes
4. callers use the AST, event iterator, or HTML renderer
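
Step 1 of the pipeline can be sketched as follows. This is a minimal illustration, not mdz's actual code: `SimpleLine` and `nextLine` are hypothetical names, and the field meanings in the comments are assumptions made for this example.

```zig
const std = @import("std");

// Hypothetical, simplified line record holding byte offsets into the source.
const SimpleLine = struct {
    start: usize, // first byte of the line
    content_end: usize, // end of the text, excluding the line ending
    end: usize, // end of the line, including the line ending
};

// Returns the line starting at `pos`, or null at the end of the input.
// Treats "\n", "\r\n", and a lone "\r" as line endings.
fn nextLine(source: []const u8, pos: usize) ?SimpleLine {
    if (pos >= source.len) return null;
    var i = pos;
    while (i < source.len and source[i] != '\n' and source[i] != '\r') : (i += 1) {}
    const content_end = i;
    if (i < source.len) {
        if (source[i] == '\r' and i + 1 < source.len and source[i + 1] == '\n') {
            i += 2; // "\r\n" counts as one line ending
        } else {
            i += 1;
        }
    }
    return SimpleLine{ .start = pos, .content_end = content_end, .end = i };
}
```

A caller would loop with `while (nextLine(src, pos)) |line| : (pos = line.end)` and hand each record to the block parser.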

Blocks

Blocks (headings, lists, block quotes, etc.) are line-oriented. We scan the document line by line and decide what each line begins, continues, or closes. The complexity in block parsing comes from precedence rules, which require us to handle:

nested lists
lazy continuation lines
blockquotes inside lists
distinguishing indented code from list continuation
when a paragraph should be interrupted by another block

In mdz, block parsing produces Line structs, with a LineCursor tracking the byte index and visual column:

const Line = struct {
    start: usize,
    content_end: usize,
    end: usize,
    source_start: usize,
    source_content_end: usize,
    text: []const u8,
};

const LineCursor = struct {
    line_index: usize,
    byte_index: usize = 0,
    column: usize = 0,
};
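
To show why the cursor tracks a visual column separately from the byte index, here is a small sketch of skipping indentation. `skipIndent` is a hypothetical helper, not part of mdz; it assumes the CommonMark convention that a tab advances to the next multiple of 4 columns.

```zig
const std = @import("std");

const LineCursor = struct {
    line_index: usize,
    byte_index: usize = 0,
    column: usize = 0,
};

// Hypothetical helper: consume leading spaces and tabs on a line.
// A space is one byte and one column; a tab is one byte but moves the
// column to the next tab stop (multiples of 4).
fn skipIndent(cursor: *LineCursor, text: []const u8) void {
    while (cursor.byte_index < text.len) : (cursor.byte_index += 1) {
        switch (text[cursor.byte_index]) {
            ' ' => cursor.column += 1,
            '\t' => cursor.column += 4 - (cursor.column % 4),
            else => break, // first non-indent byte: stop without consuming it
        }
    }
}
```

For the input "  \tfoo" the cursor ends at byte index 3 but column 4, which is the number that matters when deciding between indented code and a list continuation.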

References

Markdown supports reference-style links. During block parsing, the parser collects link reference definitions (a label paired with a destination) so that a reference link such as [Label][id] resolves to whatever URL or path [id] is linked to.
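
Labels are matched loosely, so [ID] and [id] should find the same definition. A rough sketch of label normalization is below. `normalizeLabel` is a hypothetical helper, and it is ASCII-only, whereas the CommonMark spec asks for Unicode case folding.

```zig
const std = @import("std");

// Hypothetical helper: write a normalized copy of `label` into `buf` and
// return the used slice. Lowercases ASCII letters, trims both ends, and
// collapses runs of whitespace to a single space.
// `buf` must be at least `label.len` bytes long.
fn normalizeLabel(label: []const u8, buf: []u8) []const u8 {
    var n: usize = 0;
    var pending_space = false;
    for (label) |c| {
        if (std.ascii.isWhitespace(c)) {
            // Only remember a space once something has been written,
            // which drops leading whitespace.
            pending_space = n > 0;
            continue;
        }
        if (pending_space) {
            buf[n] = ' ';
            n += 1;
            pending_space = false;
        }
        buf[n] = std.ascii.toLower(c);
        n += 1;
    }
    // A trailing pending space is never flushed, which trims the end.
    return buf[0..n];
}
```

Two labels then refer to the same definition when their normalized forms are equal under `std.mem.eql`.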

Inlines

Once blocks are known, we parse inline content inside headings, paragraphs, table cells, etc. Emphasis parsing is the trickiest part here because we have to track delimiters. Markdown emphasis depends on whether * or _ can open, close, or both, based on surrounding characters. For this, we use a container stack.
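
The open/close decision can be sketched from the CommonMark flanking rules. This is an ASCII-only illustration with hypothetical names (`Flank`, `isPunct`, `classify`), not mdz's implementation; the spec defines whitespace and punctuation in Unicode terms.

```zig
const std = @import("std");

const Flank = struct { can_open: bool, can_close: bool };

// ASCII punctuation: printable, not a space, not alphanumeric.
fn isPunct(c: u8) bool {
    return c > ' ' and c < 0x7f and !std.ascii.isAlphanumeric(c);
}

// `before` and `after` are the bytes on either side of a delimiter run.
// null means start or end of line, which counts as whitespace.
fn classify(delim: u8, before: ?u8, after: ?u8) Flank {
    const b = before orelse ' ';
    const a = after orelse ' ';
    const b_ws = std.ascii.isWhitespace(b);
    const a_ws = std.ascii.isWhitespace(a);
    const b_p = isPunct(b);
    const a_p = isPunct(a);
    // Left-flanking: not followed by whitespace, and not followed by
    // punctuation unless preceded by whitespace or punctuation.
    const left = !a_ws and (!a_p or b_ws or b_p);
    // Right-flanking is the mirror image.
    const right = !b_ws and (!b_p or a_ws or a_p);
    if (delim == '_') {
        // Underscores are stricter so that snake_case is left alone.
        return .{
            .can_open = left and (!right or b_p),
            .can_close = right and (!left or a_p),
        };
    }
    return .{ .can_open = left, .can_close = right };
}
```

Under these rules the * in a*b can both open and close, while the _ in a_b can do neither.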

Container Stack

The container stack parser tracks open block structures: document, block quote, list, item, paragraph, code, and HTML block. On each line the parser first calls matchOpenContainers and tries to continue the existing stack before starting anything new. This is why block quote and list code feels inverted at first. The parser is not only asking “what starts on this line?” It first asks “which of the containers from the previous line are still open?”²

When a line no longer matches an open container, the parser shrinks the stack. New block recognizers then run under the deepest matched parent.
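
A stripped-down sketch of that first question is below. It handles only the document and block quotes, ignores lazy continuation lines, and uses hypothetical names (`Kind`, `Match`, `matchOpen`); it is a stand-in for the idea, not mdz's matchOpenContainers.

```zig
const std = @import("std");

const Kind = enum { document, block_quote };

const Match = struct {
    matched: usize, // how many open containers this line continues
    offset: usize, // byte offset where the remaining content starts
};

// Walk the open containers from the outside in and stop at the first
// one the line does not continue.
fn matchOpen(stack: []const Kind, line: []const u8) Match {
    var offset: usize = 0;
    var matched: usize = 0;
    for (stack) |kind| {
        switch (kind) {
            .document => {}, // the document always continues
            .block_quote => {
                var i = offset;
                // Up to three spaces of indentation before the marker.
                while (i < line.len and i - offset < 3 and line[i] == ' ') : (i += 1) {}
                if (i >= line.len or line[i] != '>') break;
                i += 1;
                // One optional space after '>' belongs to the marker.
                if (i < line.len and line[i] == ' ') i += 1;
                offset = i;
            },
        }
        matched += 1;
    }
    return .{ .matched = matched, .offset = offset };
}
```

With a stack of document, block quote, block quote, the line "> text" matches two containers. The caller would shrink the stack to that depth and run block recognizers on the rest of the line from the returned offset.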

Output

A parser commonly produces either an AST or an event stream.

An AST is good for transformations, linting, editor tooling, and custom renderers. It is especially powerful when you need to understand, inspect, modify, or relate parts of a document, because every node is addressable.³

Event streams are good for fast rendering because they avoid building a whole tree before output starts.
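
The two outputs are closely related: a depth-first walk over a tree yields the same enter/exit sequence an event stream would carry. The sketch below uses hypothetical `Node` and `Event` types and writes into a caller-provided buffer. It is not mdz's event iterator API.

```zig
const std = @import("std");

// Hypothetical minimal tree node: a tag name plus child nodes.
const Node = struct {
    tag: []const u8,
    children: []const Node = &.{},
};

// Hypothetical event: entering or leaving a node with the given tag.
const Event = union(enum) {
    enter: []const u8,
    exit: []const u8,
};

// Depth-first walk that writes events into `out`, starting at index
// `start`, and returns the index after the last event written.
// `out` must have room for two events per node.
fn emit(node: Node, out: []Event, start: usize) usize {
    var n = start;
    out[n] = .{ .enter = node.tag };
    n += 1;
    for (node.children) |child| {
        n = emit(child, out, n);
    }
    out[n] = .{ .exit = node.tag };
    n += 1;
    return n;
}
```

A document containing one paragraph with one text node produces six events: enter document, enter paragraph, enter text, and then the three matching exits in reverse order.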

¹ In fact, it's a whole appendix.
https://spec.commonmark.org/0.31.2/#appendix-a-parsing-strategy

² This article outlines why nested containers are so challenging to work with.
https://jackdewinter.github.io/2022/01/24/markdown-linter-why-nested-containers-are-so-difficult/

³ This is the whole conceit behind the extendable remark ecosystem.
https://github.com/remarkjs/remark

compilers & resumania

oh hey, it's owais

I'm an open source developer regularly posting updates about what I'm working on and learning about. https://github.com/desertthunder