Your First Build System
How can something as simple as cc main.c && ./a.out be a build system? It actually implements all of the basic features of a build system! It defines:
- the inputs: main.c,
- the tasks to perform: cc, a.out,
- and the relationship between the tasks: &&.
Over the course of this guide we'll slowly expand and revise this definition, but this is a fine place to start.
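As a rough sketch, the three parts can be spelled out one at a time in a shell script. The function names compile, run, and build are hypothetical labels introduced here for illustration; the commands inside them are the ones from the one-liner.

```shell
# The input: the file our tasks operate on.
input="main.c"

# The tasks: compile the source, then run the result.
# (compile, run, and build are illustrative names, not part of any tool.)
compile() { cc "$input"; }   # cc writes a.out by default on success
run()     { ./a.out; }

# The relationship: run happens after compile, and only if compile succeeded.
build() { compile && run; }
```

Calling build in a directory that contains main.c should behave like the original one-liner; nothing runs until build is called.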
The Inputs
In any build system we must tell our tasks what to operate on. In this example that is the source code, main.c. Despite this example's apparent simplicity, defining the inputs is actually the most difficult part of telling a build system what you want it to do.
We software developers write software to take input—from users, from systems—and process that input into new information that benefits us. A build system is no different. This gets hard precisely because we generally get better outputs with more (and better) inputs.
The Tasks
In our simple build system example we need to compile the code, and we need to run it to inspect the output. Many computers ship with a C compiler, cc, making it possible for us to simply call it.
And, since a.out is a file generated by cc, we know it will exist by the time we invoke it. How do we know? Because of the task relationships.
The Relationships
There are actually two pieces of information encoded in &&. The first is order, and the second is an execution condition. In a terminal interface && executes the left-hand side (cc main.c) and, if it succeeds, executes the right-hand side (./a.out):
- cc main.c is run.
- The terminal checks to see if the command succeeded.
  - Success! It runs a.out.
  - Failure! Execution stops and the terminal does not run a.out.
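Both branches can be observed without a compiler by substituting stand-in commands. In this sketch, fake_compile_ok, fake_compile_fail, and run_program are hypothetical functions that play the roles of cc main.c and ./a.out; only the && behavior is real.

```shell
# Stand-ins so the sketch runs anywhere (hypothetical names).
fake_compile_ok()   { return 0; }             # a compile that succeeds
fake_compile_fail() { return 1; }             # a compile that fails
run_program()       { echo "program ran"; }   # plays the role of ./a.out

# Success branch: the right-hand side runs.
out_success=$(fake_compile_ok && run_program)

# Failure branch: the right-hand side is skipped and the failing status is kept.
status_failure=0
out_failure=$(fake_compile_fail && run_program) || status_failure=$?

echo "success branch printed: '$out_success'"
echo "failure branch printed: '$out_failure' (exit status $status_failure)"
```

The failure branch prints nothing from run_program, which is the behavior we rely on when a real compile fails.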
Ordering and conditional execution of tasks are extremely important for a build system. By inspecting where we ended up, we can figure out exactly how we got there. Our simple build system guarantees two facts at the end of execution:
- If cc compilation fails, it will not execute a.out.
- If a.out is executed, it will always run the most recent version of the code.
Because we can make guarantees about the state of our simple build system, we can extend the same kind of guarantees all the way to the most complex software in the world.
Visualizing the Build
To make it a little bit more clear, we can also think about our simple build system as a flow chart.
This particular type of flow chart, where every line connecting things has a direction and nothing points in a circle, has a special name: a Directed Acyclic Graph, colloquially referred to as a "DAG". Its useful property is that we can look at any node in the graph and identify what has already happened and what could happen next.
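One way to see the ordering a DAG encodes is to hand its edges to tsort, a standard Unix utility that performs a topological sort. This sketch assumes the graph has three nodes (main.c, cc, and a.out) connected in that order; each input line is an edge read as "left must come before right".

```shell
# Edges of the assumed graph: main.c -> cc -> a.out
edges='main.c cc
cc a.out'

# tsort prints the nodes in an order that respects every edge.
order=$(printf '%s\n' "$edges" | tsort)
echo "$order"
```

Because this graph is a single chain, there is only one valid order: main.c, then cc, then a.out. A graph with independent branches would allow several valid orders.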
A DAG is defined by the thing that makes it unique: it doesn't contain cycles. This is in contrast with a "standard" Directed Graph, which may contain cycles. You can see how we can make no guarantees about what has already happened in the example of a dishwasher: to identify which thing happened last, you must manually inspect the state of the dishes inside.
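The tsort utility, a standard Unix topological sort, gives a quick way to see what goes wrong with a cycle. The dishwasher states below (dirty, washing, clean) are hypothetical, chosen only to form a loop. tsort cannot find a consistent order for them and complains on standard error; the exact wording varies between implementations, so this sketch only checks that something was reported.

```shell
# A hypothetical cycle: dirty -> washing -> clean -> dirty
cycle='dirty washing
washing clean
clean dirty'

# Capture only what tsort writes to standard error; discard the order it attempts.
complaint=$(printf '%s\n' "$cycle" | tsort 2>&1 >/dev/null) || true

if [ -n "$complaint" ]; then
  echo "tsort found no valid order: the graph has a cycle"
fi
```

With a cycle there is no node we can point to as "first", which is why the graph alone cannot tell us what happened last.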
Our simple build system's DAG is a good mental model for how a build system works, but industrial-grade build systems can do a lot more. They do that primarily by becoming even more granular.