# Fifty shades of formalism

A weird idea somehow managed to sneak into my brain, that most "things" in compsci are much simpler and more straightforward than most people believe. In this article, I list some examples that pop off my head right now.

# What is a "thing"?

I can't say for sure, either. In this article, I focus on the mathematical model, and the semantics of the computation. To clear possible ambiguities, I have to use the sequent calculus. (I don't have a good command of it, so I refrain from talking too much about it.)

I will talk about

- Computation;
- Distributed systems;
- Compiler.

A central concept in compsci is "relation". To build a relation between two sets, we can turn to the sequents. The sequents are a precise way to define computation. The sequent calculus is a much, much shorter and conciser way to write a program. But you are on your own to prove your "program" does anything meaningful. In this article, such argument is omitted. I hereby ask the reader to check the details, and tell me if you find anything wrong!

# Computation

The reader is assumed to have basic familiarity with common computation models.

Mathematical functions are pure and total, and the easiest way to formalize "computation" is by reducing a function computation into combination of smaller "obviously mechanical" steps. Lambda calculus, though computationally universal, is regarded less ideal because of its beta reduction, as beta's computation is not obviously mechanical.

## Recursive functions

The problem of computation is therefore divided into three parts:

- Representation of the input
- Combination of sub-computations
- Representation of the output

The base case of the recursion is the smallest computations. This view leads us to the "recursive functions" model, pioneered by Kurt Gödel, if we are satisfied with the input and the output represented by natural numbers.

Recursive functions are purely mathematical, so there's no need to rewrite them as sequents.

## State machines

The Turing machine looks more mechanical. It's formalism is well-known, so I don't repeat it here, but the automaton theory (and compiler theory?) revolutionized our view about computation, and even extended "computation" beyond Turing machines, so let's take a look.

A function $f$ over the domain $D$ can be viewed as a possibly infinite set $\lbrace (x, f(x)) | x \in D \rbrace$. (To save words, I will see $f$ as a set from now on.) To compute $f$, we just pick the right pair out of the set. It's not immediately obvious how we do that, but we can proceed by lifting the restriction of the elements' shape. Consider a more general set $S = \lbrace a_1, a_2, \dots \rbrace$, we say some abstract machine *decides* $S$ if it can always correctly answer this question: does the input belong to the set $S$? This problem is easier, as long as we have a systematic method to enumerate the elements of $S$.

Did you notice the recursion? The latter problem can be rephrased exactly as the first problem. Nevertheless, we forget the first problem for a while.

How do we enumerate $S$? At first, there must be an initial element to begin with, and there must be a method to generate a new element from what we already know. Let's formalize this intuition.

- State, (a.k.a. "what we already know"), denoted by $s$
- Initial state, (a.k.a. "the initial element"), denoted by $i$
- Set of states, denoted by $\Sigma$
- Set of initial states, denoted by $\mathcal{I}$

And there must be some way to describe the state transitions. We denote "$s_1$ can safely transition to $s_2$" by $s_1 \rightarrow s_2$. Sometimes, the arrow can carry more information, like the type of transition or conditions. The most important thing now is defining what the state is. The arrow actually defines a relation between states! That's where the sequent calculus comes into play.

We can write a stack-based computer easily.

- The state is $(i, s)$, where $i$ is the program counter, and $s$ is the stack (represented by a list).
- The special state $e$ denoting the machine has some problems.
- The special state $h$ denoting the machine has halted successfully.
- The initial state $i_0$ is
- Instruction function $f : \mathbb{N} \rightarrow \mathbb{I}$, where $\mathbb{I}$ is the set of instructions.

$$ \frac {f(i)=\text{LOAD}[x] \quad x \in \mathbb{N}} {(i, s) \rightarrow (i+1, x:s)} \text{Load} $$

$$ \frac {f(i)=\text{ADD} \quad s=a:b:s'} {(i, s) \rightarrow (i+1, (a+b):s')} \text{Add} $$

$$ \frac {f(i)=\text{ADD} \quad s=a:\emptyset \lor s=\emptyset} {(i, s) \rightarrow (i+1, e)} \text{Add-Err} $$

$$ \frac {f(i)=\text{HALT}} {(i, s) \rightarrow (i, h)} \text{Halt} $$

Therefore, the $\rightarrow$ relation defines precisely how the computer performs computations.

Back to the original problem, now that we know how to enumerate sets, can we enumerate $f$? Not that easy. Suppose we have a machine that computes $f$, one thing that is easy is to check if every step is valid. This inspires us to extend $f$ from $\lbrace (x, f(x)) \rbrace$ to $\lbrace \text{COMPUTATION-OF}_f(x) \rbrace$. $\text{COMPUTATION-OF}_f(x)$ is a list of states ending with a special state (e.g. HALT), and it's easy to check if every transition is valid. This is, again, a relation, defined by the machine that computes $f$. The art is therefore transformed into designing a (state) machine where every step is easily checked. In this regard, there is nothing that can beat the Turing machine.

# Distributed systems

Once things get distributed, you know they're going bad! Why? Every machine has its own view of the system, and it's not obvious at all how to make the machines coordinate. But a distributed system is simple, albeit theoretically.

- Some machines, or processors, $p_i \in P$.
- Some events that we care, $e_i \in E$.
- Ideal wall clock, $t \in T$. (Don't care how it's represented, only require it to be monotonically increasing.)
- Event $e_i \in E$. Optionally tagged with the initiator and/or the executor.
- Local history of $p_i$, denoted by $h(p_i, t)$, is a list of events that are observed by $p_i$.
- Local state, $s_i \in E$.

Most of these are simply replication of human's intuition. And then we can define state transitions, driven by events. This is no different from computation.

# Compiler

The compiler is truly a beast. Lexer, parser, analysis, optimization, architecture... There are a ton of engineering opportunities. How do you verify a compiler is correct? CompCert proved its compiler satisfies the following theorem.

Semantic preservation theoremFor all source programs S and compiler-generated code C, if the compiler, applied to the source S, produces the code C, without reporting a compile-time error, then the observable behavior of C improves on one of the allowed observable behaviors of S.

The basic idea is you define what is an observable side-effect (state-changing operation), which typically means writes to the memory or I/O, and then proves the transformed code has the same behavior as the original source code. Again, very similar to the methods above, define a relation between them.

# Conclusion

Therefore, the universal way of modeling compsci concepts is to use state machines: define what is the state, and define their transitions. This method is flexible, too. You can decide how precise your state machine is. Most of the confusion comes from an ambiguous definition of the state or the transition. However, the state machine model + sequent calculus combo is powerful in explaining things, but not so useful as a reasoning tool. (Unlike mathematical equations, after a problem is modeled with an equation, you have access to tons of tools to solve it.) That's why people still need to prove theorems about them.

Not until here did a paper by Leslie Lamport occur to me. Lamport wrote about state machines in 2008 and despised the compsci community's obsession with languages. I read this paper a long time ago, but never resonated with it at that time (now I do). As a PL enthusiast, I'm not sure what he was talking about: is he critical of the syntax obsession? But there's no obsession with the syntax! The study about PL semantics has been based on state machines for a very long time, operational semantics in particular. I don't mean you shouldn't read it, you should! He explains things much better than I do, and I lost interest in polishing this article after I recalled that paper...

Loading Disqus...(Alternatively, drop me an E-mail to comment on this post.)