Spaces:
Sleeping
Sleeping
# Typst Compiler Architecture | |
Wondering how to contribute or just curious how Typst works? This document | |
covers the general structure and architecture of Typst's compiler, so you get an | |
understanding of what's where and how everything fits together. | |
## Directories | |
Let's start with a broad overview of the directories in this repository: | |
- `crates/typst`: The main compiler crate which defines the complete language | |
and library. | |
- `crates/typst-cli`: Typst's command line interface. This is a relatively small | |
layer on top of the compiler and the exporters. | |
- `crates/typst-eval`: The interpreter for the Typst language. | |
- `crates/typst-ide`: Exposes IDE functionality. | |
- `crates/typst-kit`: Contains various default implementation of | |
functionality used in `typst-cli`. | |
- `crates/typst-layout`: Typst's layout engine. | |
- `crates/typst-library`: Typst's standard library. | |
- `crates/typst-macros`: Procedural macros for the compiler. | |
- `crates/typst-pdf`: The PDF exporter. | |
- `crates/typst-realize`: Typst's realization subsystem. | |
- `crates/typst-render`: A renderer for Typst frames. | |
- `crates/typst-svg`: The SVG exporter. | |
- `crates/typst-syntax`: Home to the parser and syntax tree definition. | |
- `crates/typst-timing`: Performance timing for Typst. | |
- `crates/typst-utils`: Utilities for Typst. | |
- `docs`: Generates the content of the official | |
[documentation][docs] from markdown files and the inline | |
Rust documentation. Only generates the content and structure, not the concrete | |
HTML (that part is currently closed source). | |
- `tests`: Integration tests for Typst compilation. | |
- `tools`: Tooling for development. | |
## Compilation | |
The source-to-PDF compilation process of a Typst file proceeds in four phases. | |
1. **Parsing:** Turns a source string into a syntax tree. | |
2. **Evaluation:** Turns a syntax tree and its dependencies into content. | |
3. **Layout:** Layouts content into frames. | |
4. **Export:** Turns frames into an output format like PDF or a raster graphic. | |
The Typst compiler is _incremental:_ Recompiling a document that was compiled | |
previously is much faster than compiling from scratch. Most of the hard work is | |
done by [`comemo`], an incremental compilation framework we have written for | |
Typst. However, the compiler is still carefully written with incrementality in | |
mind. Below we discuss the four phases and how incrementality affects each of | |
them. | |
## Parsing | |
The syntax tree and parser are located in `crates/typst-syntax`. Parsing is | |
a pure function `&str -> SyntaxNode` without any further dependencies. The | |
result is a concrete syntax tree reflecting the whole file structure, including | |
whitespace and comments. Parsing cannot fail. If there are syntactic errors, the | |
returned syntax tree contains error nodes instead. It's important that the | |
parser deals well with broken code because it is also used for syntax | |
highlighting and IDE functionality. | |
**Typedness:** | |
The syntax tree is untyped, any node can have any `SyntaxKind`. This makes it | |
very easy to (a) attach spans to each node (see below), (b) traverse the tree | |
when doing highlighting or IDE analyses (no extra complications like a visitor | |
pattern). The `typst::syntax::ast` module provides a typed API on top of | |
the raw tree. This API resembles a more classical AST and is used by the | |
interpreter. | |
**Spans:** | |
After parsing, the syntax tree is numbered with _span numbers._ These numbers | |
are unique identifiers for syntax nodes that are used to trace back errors in | |
later compilation phases to a piece of syntax. The span numbers are ordered so | |
that the node corresponding to a number can be found quickly. | |
**Incremental:** | |
Typst has an incremental parser that can reparse a segment of markup or a | |
code/content block. After incremental parsing, span numbers are reassigned | |
locally. This way, span numbers further away from an edit stay mostly stable. | |
This is important because they are used pervasively throughout the compiler, | |
also as input to memoized functions. The less they change, the better for | |
incremental compilation. | |
## Evaluation | |
The evaluation phase lives in `crates/typst/src/eval`. It takes a parsed | |
`Source` file and evaluates it to a `Module`. A module consists of the `Content` | |
that was written in it and a `Scope` with the bindings that were defined within | |
it. | |
A source file may depend on other files (imported sources, images, data files), | |
which need to be resolved. Since Typst is deployed in different environments | |
(CLI, web app, etc.) these system dependencies are resolved through a general | |
interface called a `World`. Apart from files, the world also provides | |
configuration and fonts. | |
**Interpreter:** | |
Typst implements a tree-walking interpreter. To evaluate a piece of source, you | |
first create a `Vm` with a scope stack. Then, the AST is recursively evaluated | |
through trait impls of the form `fn eval(&self, vm: &mut Vm) -> Result<Value>`. | |
An interesting detail is how closures are dealt with: When the interpreter sees | |
a closure / function definition, it walks the body of the closure and finds all | |
accesses to variables that aren't defined within the closure. It then clones the | |
values of all these variables (it _captures_ them) and stores them alongside the | |
closure's syntactical definition in a closure value. When the closure is called, | |
a fresh `Vm` is created and its scope stack is initialized with the captured | |
variables. | |
**Incremental:** | |
In this phase, incremental compilation happens at the granularity of the module | |
and the closure. Typst memoizes the result of evaluating a source file across | |
compilations. Furthermore, it memoizes the result of calling a closure with a | |
certain set of parameters. This is possible because Typst ensures that all | |
functions are pure. The result of a closure call can be recycled if the closure | |
has the same syntax and captures, even if the closure values stems from a | |
different module evaluation (i.e. if a module is reevaluated, previous calls to | |
closures defined in the module can still be reused). | |
## Layout | |
The layout phase takes `Content` and produces one `Frame` per page for it. To | |
layout `Content`, we first have to _realize_ it by applying all relevant show | |
rules to the content. Since show rules may be defined as Typst closures, | |
realization can trigger closure evaluation, which in turn produces content that | |
is recursively realized. Realization is a shallow process: While collecting list | |
items into a list that we want to layout, we don't realize the content within | |
the list items just yet. This only happens lazily once the list items are | |
layouted. | |
When we a have realized the content into a layoutable element, we can then | |
layout it into _regions,_ which describe the space into which the content shall | |
be layouted. Within these, an element is free to layout itself as it sees fit, | |
returning one `Frame` per region it wants to occupy. | |
**Introspection:** | |
How content layouts (and realizes) may depend on how _it itself_ is layouted | |
(e.g., through page numbers in the table of contents, counters, state, etc.). | |
Typst resolves these inherently cyclical dependencies through the _introspection | |
loop:_ The layout phase runs in a loop until the results stabilize. Most | |
introspections stabilize after one or two iterations. However, some may never | |
stabilize, so we give up after five attempts. | |
**Incremental:** | |
Layout caching happens at the granularity of the element. This is important | |
because overall layout is the most expensive compilation phase, so we want to | |
reuse as much as possible. | |
## Export | |
Exporters live in separate crates. They turn layouted frames into an output file | |
format. | |
- The PDF exporter takes layouted frames and turns them into a PDF file. | |
- The SVG exporter takes a frame and turns it into an SVG. | |
- The built-in renderer takes a frame and turns it into a pixel buffer. | |
- HTML export does not exist yet, but will in the future. However, this requires | |
some complex compiler work because the export will start with `Content` | |
instead of `Frames` (layout is the browser's job). | |
## IDE | |
The `crates/typst-ide` crate implements IDE functionality for Typst. It | |
builds heavily on the other modules (most importantly, `syntax` and `eval`). | |
**Syntactic:** | |
Basic IDE functionality is based on a file's syntax. However, the standard | |
syntax node is a bit too limited for writing IDE tooling. It doesn't provide | |
access to its parents or neighbours. This is fine for an evaluation-like | |
recursive traversal, but impractical for IDE use cases. For this reason, there | |
is an additional abstraction on top of a syntax node called a `LinkedNode`, | |
which is used pervasively across the `ide` module. | |
**Semantic:** | |
More advanced functionality like autocompletion requires semantic analysis of | |
the source. To gain semantic information for things like hover tooltips, we | |
directly use other parts of the compiler. For instance, to find out the type of | |
a variable, we evaluate and realize the full document equipped with a `Tracer` | |
that emits the variable's value whenever it is visited. From the set of | |
resulting values, we can then compute the set of types a value takes on. Thanks | |
to incremental compilation, we can recycle large parts of the compilation that | |
we had to do anyway to typeset the document. | |
**Incremental:** | |
Syntactic IDE stuff is relatively cheap for now, so there are no special | |
incrementality concerns. Semantic analysis with a tracer is relatively | |
expensive. However, large parts of a traced analysis compilation can reuse | |
memoized results from a previous normal compilation. Only the module evaluation | |
of the active file and layout code that somewhere within evaluates source code | |
in the active file needs to re-run. This is all handled automatically by | |
`comemo` because the tracer is wrapped in a `comemo::TrackedMut` container. | |
## Tests | |
Typst has an extensive suite of integration tests. These tests cover parsing, | |
evaluation, realization, layout, and rendering. PDF output is sadly untested so | |
far, but most bugs are in earlier phases of the compiler. For more details about | |
testing, see the [tests directory](/tests) and its README. | |
[docs]: https://typst.app/docs/ | |
[`comemo`]: https://github.com/typst/comemo/ | |