quebex

A software analysis framework built around the QBE intermediate language

git clone https://git.8pit.net/quebex.git

  1<!--
  2SPDX-FileCopyrightText: 2025 Sören Tempel <soeren+git@soeren-tempel.net>
  3
  4SPDX-License-Identifier: GPL-3.0-only
  5-->
  6
  7## README
  8
  9A work-in-progress software analysis framework built around the [QBE] intermediate language.
 10
 11### Motivation
 12
 13Existing analysis frameworks are predominantly built around [LLVM].
 14Unfortunately, LLVM is a fast-moving target with constant changes and updates to its intermediate representation.
 15Therefore, tooling built on LLVM often requires dated LLVM versions (e.g., [KLEE] currently [recommends LLVM 13][KLEE LLVM] released in 2022).
 16Obtaining these LLVM versions can be cumbersome and often hinders employment of these tools.
 17To overcome these issues, maintainers of analysis tooling need to constantly invest time to catch up with LLVM releases instead of focusing on improving their analysis framework.
 18
 19In order to reduce the maintenance burden, this project attempts to investigates the utilization of another intermediate language for software analysis: [QBE].
 20QBE is a much [smaller-scale project][QBE vs LLVM] than LLVM and thereby offers a higher degree of stability.
 21Further, QBE is simpler than LLVM (e.g., providing fewer operations) and thereby eases the implementation of analysis techniques.
 22Nonetheless, there exist compiler frontends that can emit a representation in the QBE intermediate representation (which can then be analyzed using quebex!).
 23For example, [SCC], [cproc], or the [Hare compiler][Hare].
 24
 25### Status
 26
 27Proof of concept, not much to see here yet.
 28Come back later!
 29
 30### Installation
 31
 32The framework consists of several components.
 33After cloning the repository, individual components can be installed using `cabal install`.
 34However, presently specific GHC versions are required; therefore, installation using [Guix] is recommended.
 35For example, in order to install the `quebex-cli` component using Guix:
 36
 37```
 38$ guix time-machine -C .guix/channels.scm -- install -L .guix/modules/ quebex-cli
 39```
 40
 41Afterwards, if Guix is configured correctly, `quebex-cli` should be available in your `$PATH`.
 42The following section demonstrates usage of `quebex-cli`.
 43
 44### Demonstration
 45
 46This framework is primarily *intended to be used as a library*, allowing the implementation of both static and dynamic analysis techniques based on QBE.
 47Presently, it focuses on dynamic analysis, and sufficient documentation of the library interface is lacking.
 48Nonetheless, it is already capable of executing QBE representations of medium-complexity C code (e.g., as emitted by [cproc]).
 49To experiment with the current capabilities, a command-line tool called `quebex-cli` is available that provides a frontend to quebex's [symbolic execution] library.
 50Symbolic execution is a dynamic software analysis technique that explores reachable program paths based on a symbolic input variable.
 51For example, consider the following C program:
 52
 53```C
 54#include <stdio.h>
 55#include <stddef.h>
 56
 57extern void quebex_symbolic_array(void *ptr, size_t nelem, size_t elsiz, const char *name);
 58
 59int main(void) {
 60	puts("<path>");
 61
 62	int a;
 63	quebex_symbolic_array(&a, 1, sizeof(a), "a");
 64	if (a == 42) {
 65		puts("you found the answer");
 66	} else {
 67		puts("not the answer");
 68	}
 69
 70	puts("</path>");
 71	return 0;
 72}
 73```
 74
 75This program can be compiled using [cproc] as follows:
 76
 77```
 78$ cproc -emit-qbe example.c
 79```
 80
 81The resulting QBE representation (`example.qbe`) can be symbolically executed using quebex:
 82
 83```
 84$ quebex-cli example.qbe
 85```
 86
 87This will yield the following output:
 88
 89```
 90<path>
 91not the answer
 92</path>
 93<path>
 94you found the answer
 95</path>
 96
 97Amount of paths: 2
 98```
 99
100This tells us that quebex found two paths through our program based on the symbolic variable `a`.
101In the future, it will be possible to obtain test inputs for each path in a standardized format using `quebex-cli`, which can then be used to automatically [generate high-coverage tests][KLEE OSDI].
102However, for now the focus is on improving the library, not the command-line interface.
103
104### Design Goals
105
106This project is intentionally written in a simple subset of the [Haskell] programming language.
107It should be usable by anyone with a basic Haskell background (e.g., as obtained by reading [Learn You a Haskell for Great Good!][learnyouahaskell]).
108Further, the project should require minimal long-term maintenance and should also support older GHC versions.
109Therefore, it uses the [GHC2021] language standard and avoids usage of additional language extensions.
110Further, whenever possible, dependencies on external libraries that are [not bundled by GHC][GHC libraries] must be avoided.
111
112### Development
113
114Code should be formatted using [ormolu][ormolu github].
115Git hooks performing several sanity checks, including ensuring the proper code formatting, are available.
116These hooks can be enabled using:
117
118	$ git config --local core.hooksPath .githooks
119
120Further, a [Guix] environment for development purposes can be obtained using:
121
122	$ guix time-machine -C .guix/channels.scm -- shell -L .guix/modules/ -m .guix/manifest.scm
123
124### License
125
126This project uses the [REUSE Specification] to indicated used software license.
127
128[QBE]: https://c9x.me/compile/
129[QBE vs LLVM]: https://c9x.me/compile/doc/llvm.html
130[LLVM]: https://llvm.org/
131[KLEE]: https://klee-se.org
132[KLEE LLVM]: https://klee-se.org/build/build-llvm13/
133[KLEE OSDI]: https://www.usenix.org/legacy/events/osdi08/tech/full_papers/cadar/cadar.pdf
134[SCC]: https://www.simple-cc.org/
135[cproc]: https://sr.ht/~mcf/cproc/
136[Hare]: https://harelang.org/
137[Haskell]: https://haskell.org/
138[GHC]: https://www.haskell.org/ghc/
139[GHC2021]: https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/control.html#extension-GHC2021
140[GHC libraries]: https://ghc.gitlab.haskell.org/ghc/doc/libraries/index.html
141[learnyouahaskell]: https://learnyouahaskell.github.io/chapters.html
142[libriscv]: https://github.com/agra-uni-bremen/libriscv
143[ormolu github]: https://github.com/tweag/ormolu
144[REUSE Specification]: https://reuse.software/spec-3.3/
145[Guix]: https://guix.gnu.org
146[symbolic execution]: https://en.wikipedia.org/wiki/Symbolic_execution