Introduction

What is Relux?

Relux is an integration test framework for interactive shell programs. It takes a different approach from most end-to-end test frameworks: instead of writing imperative scripts that call APIs and poll for state changes, you describe the expected conversation between you and the system under test — what to send and what to expect back.

This style of testing is called Expect-style testing, named after the original Expect tool from 1990. The core idea is simple:

  1. Send input to a running program
  2. Expect (match) specific output
  3. React to what you matched — send more input, capture values, branch

Relux is inspired by hawk/lux (LUcid eXpect scripting) — an Erlang-based framework that showed Expect-style testing could be principled and composable, not just fragile scripts. Relux builds on that foundation with a block-structured DSL, an effect system for declarative dependency management, and a single standalone Rust binary with no runtime dependencies.

Who this tutorial is for

Anyone who has found themselves testing a real-world system by hand is a potential Relux user.

This tutorial assumes you are comfortable with:

  • Shell basics — what a shell is, how commands produce output, the read-eval-print loop
  • Regular expressions — character classes, quantifiers, anchors, capture groups
  • General testing concepts — what a test is, pass/fail, setup/teardown

No prior experience with Expect, lux, or Relux is assumed. The tutorial introduces every concept from scratch, one article at a time.

When to use Relux

Relux is designed for integration testing of systems with real dependencies. In most cases, you have a single system under test — a CLI, a service, a REPL — and its dependencies: databases, queue servers, other backend services. Instead of mocking those dependencies away, Relux lets you start them for real, so errors in both the system under test and its dependencies are not hidden behind mocks.

Most end-to-end test frameworks approach this with imperative code: start a process, sleep or poll until it’s ready, make HTTP calls, parse responses, tear down in a finally block. The test logic gets buried under orchestration boilerplate — process management, retry loops, health check polling, cleanup handlers.

Relux takes a declarative approach. You describe the startup order and dependencies between services as effects, and Relux resolves them into a dependency graph, starts them in the right order, and tears them down in reverse — even when things fail.

Relux is not, however, a general-purpose test framework. It does not replace unit test frameworks. It complements them by covering the system-level integration layer — the part where real processes start, interact, and produce observable output. It can also serve as an API testing tool (sending requests and matching responses through shell commands), though that is not its primary purpose.

Relux is not a screen scraper or a GUI testing tool either. It works at the level of text I/O through a PTY — it has no knowledge of screen layout, windows, or graphical elements.

The mental model

Imagine you are developing and testing a service. You open a few terminal windows: in one you start the database, in another you launch a message queue, in a third you run your service. Then you open yet another terminal and fire off a few requests — HTTP, gRPC, or just raw packets. After each request you glance at the logs: any errors? You check whether your service called its dependency correctly. You switch to the dependency terminals to make sure they didn’t error out either.

Relux does exactly what you do — but automated. Instead of doing it by hand, you write it down once, and Relux handles all the heavy lifting: starting processes, waiting for readiness, switching between shells, checking output. You are left with the thing that matters: the actual testing.

A Relux test reads like a transcript of that interaction:

test "echo and match" {
    shell s {
        > echo hello-relux
        <= hello-relux

        > echo "value=42"
        <= value=42
    }
}

The structure mirrors a conversation with a shell: send a command (>), match the response (<=), repeat. Every match operation has a timeout — if the expected output does not appear in time, the test fails. This is how Relux detects hangs, unexpected prompts, and wrong output.

Don’t worry about the syntax details yet — the following articles will introduce every element step by step.

The DSL at a glance

The example above uses just two operators (> and <=), but the Relux DSL has more to offer. Here is a glimpse of what you will learn in the following articles:

  • Regex matching (<?) — match output with regular expressions and capture groups
  • Variables (let, ${var}) — capture and reuse values
  • Timeouts (~5s, <~2s?) — control how long to wait for output
  • Fail patterns (!?, !=) — continuous monitoring for errors
  • Functions (fn) — extract reusable test logic
  • Effects (effect, start, expect, expose) — shared setup/teardown infrastructure
  • Multiple shells — test client/server interactions, concurrent processes
  • Modules and imports — organize tests across files
  • Condition markers ([skip], [run if ...]) — conditional test execution

Each article in this series introduces one concept, building on everything before it.


Next: Installation — get Relux built and ready to use

Installation

Relux is distributed as a single binary with no runtime dependencies.

Prerequisites

cargo install and building from source require a working Rust toolchain. The recommended way to install it is through rustup:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

Relux uses Rust edition 2024, which requires Rust 1.85 or later. If you already have Rust installed, make sure you are on a recent stable version:

rustup update stable

Install from crates.io

The simplest way to install Relux is via Cargo:

cargo install relux

This downloads, compiles, and installs the relux binary into ~/.cargo/bin/.
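If the relux command is not found afterwards, ~/.cargo/bin is probably missing from your PATH. One common fix, assuming a POSIX-style shell profile, is:

```shell
# make binaries installed by Cargo visible to the shell
export PATH="$HOME/.cargo/bin:$PATH"
```

Add the line to your shell profile (~/.profile, ~/.bashrc, or similar) to make it permanent.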

Pre-built binaries

Pre-built binaries for Linux (x86_64) and macOS (aarch64) are available on the Releases page. No Rust toolchain required.

Building from source

If you prefer to build from source:

git clone https://github.com/shizzard/relux.git
cd relux
cargo build --release

The binary will be at target/release/relux. You can copy it somewhere on your PATH, or run it directly from the build directory.

If you have the just command runner installed, you can also build with:

just release

Verifying the installation

Run relux without arguments to confirm the binary works:

relux

You should see the help output listing the available subcommands.

Shell completions

Relux supports tab completions for bash, zsh, and fish. To install them:

relux completions --install

This autodetects your shell and writes the completion script to the standard location. For zsh, you need to specify a path:

relux completions --shell zsh --install --path ~/.zsh/completions

See The CLI for details.


Next: Getting Started — scaffold a project, write and run your first test

Getting Started

A working example

Let’s go from an empty directory to a passing test: create a new project and add an integration test suite to it.

mkdir my-project && cd my-project
relux new

Now scaffold a test:

relux new --test hello

And run it:

relux run

You should see something like this:

test result: ok. 1 passed; 0 failed; finished in 132.7ms

That’s all it takes: three commands and you have a working test suite. Let’s unpack what happened.

Scaffolding a project

Unlike tools such as cargo new or npm init that create a new project directory, relux new works in the current directory. This is a deliberate choice — Relux is designed to add integration tests to an existing project, not to create a project from scratch.

Running relux new creates exactly two things at the project root:

my-project/
├── Relux.toml
└── relux/
    ├── .gitignore
    ├── tests/
    └── lib/
  • Relux.toml — the project manifest. Configures the shell, prompt, and timeouts.
  • relux/ — everything Relux-related lives here: tests, library modules, and test output.
    • tests/ — your test files go here (.relux files).
    • lib/ — shared modules: reusable functions and effects. Don’t pay too much attention to it yet; we’ll get to it later.
    • .gitignore — ignores out/, the directory where Relux writes test run artifacts.

Two entities at the root — Relux.toml and relux/.

Relux.toml

The generated Relux.toml looks like this:

# name = "my-test-suite"

# [shell]
# command = "/bin/sh"
# prompt = "relux> "

# [timeout]
# match = "5s"
# test = "5m"
# suite = "30m"

Everything is commented out. The values shown are the defaults — Relux uses them when a field is not explicitly set. Let’s walk through each section.

name — an optional label for the test suite. If omitted, Relux uses the project directory name. In the example above, that would be ‘my-project’.

[shell] — configures the shell process that Relux spawns for each shell block:

  • command — the shell binary to run. Defaults to /bin/sh.
  • prompt — the prompt string Relux configures in each spawned shell. Defaults to relux> .

[timeout] — controls how long Relux waits before declaring failure:

  • match — the default timeout for each match operation. If the expected output does not appear within this duration, the test fails. Defaults to 5s.
  • test — optional maximum duration for a single test. No limit by default.
  • suite — optional maximum duration for the entire test run. No limit by default.

All timeout values use humantime format: 100ms, 5s, 1m30s, 30m, etc.

For now, you can leave everything at the defaults, or change the values and see what happens.
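As an illustration, here is a sketch of the [timeout] section uncommented and adjusted — the values are purely illustrative, not recommendations:

```toml
[timeout]
match = "10s"   # per-match timeout: fail if expected output takes longer
test = "2m"     # cap on any single test
suite = "15m"   # cap on the entire run
```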

Scaffolding test modules

You already saw relux new --test hello in the opening example. This command creates a test file at relux/tests/hello.relux with a starter test you can run immediately.

The path you provide maps directly to the filesystem under relux/tests/. You can use subdirectories to organize your tests:

relux new --test auth/login

This creates relux/tests/auth/login.relux (and the auth/ directory if it doesn’t exist).

Path rules:

  • Must be snake_case — lowercase letters, digits, and underscores only.
  • Each segment must start with a letter or underscore.
  • The .relux extension is added automatically — you don’t need to include it.

There is also relux new --effect, which scaffolds a module in relux/lib/ instead. Effects are shared test infrastructure; we’ll cover them in detail later.

Writing a test

Let’s look at what relux new --test hello generated:

test hello {
    shell myshell {
        > echo hello-relux
        <= hello-relux
    }
}

This test does three things:

  1. test hello — declares a test with a descriptive name.
  2. shell myshell { ... } — opens a shell block named myshell. Relux spawns a new /bin/sh process for this shell.
  3. Inside the shell block:
    • > echo hello-relux — sends the command echo hello-relux to the shell, followed by a newline (just like pressing Enter).
    • <= hello-relux — matches the output literally. Relux waits (up to the match timeout) for the string hello-relux to appear in the shell’s output. If it appears, the match succeeds and the test continues. If it doesn’t appear before the timeout, the test fails.

That’s the fundamental interaction loop: send a command, match the expected output.

Shells

A test can use more than one shell. Let’s add a second one:

test hello {
    shell myshell {
        > echo hello-relux
        <= hello-relux
    }

    shell anothershell {
        > echo hello-user
        <= hello-user
    }
}

This is like opening two terminal windows side by side. Relux opens the myshell window first, sends a command, and checks the output. Then it opens a second window called anothershell and does the same there. Each shell is an independent process — its own environment, its own working directory, its own output.

Switching between shells

Now consider a more realistic pattern: you want to start something in one shell and verify its effect in another. To do that, you interleave shell blocks:

test hello {
    shell myshell {
        > echo hello-relux
    }

    shell anothershell {
        > echo hello-user
    }

    shell myshell {
        <= hello-relux
    }

    shell anothershell {
        <= hello-user
    }
}

Here, both shells appear twice. The first time, Relux opens a new “terminal window” and sends the command. The second time, Relux switches back to the same “terminal window” — the process is still running, the output is still there — and matches the result.

This pattern — send in one shell, do something in another, come back to check — is the foundation of multiprocess integration testing. You’ll use it whenever you test interactions between a client and a server, a producer and a consumer, or any two processes that need to coordinate.

Running tests

You have two commands for working with tests:

relux check validates your test files without executing them. It runs the lexer, parser, and resolver — catching syntax errors, unresolved names, and invalid imports — but never spawns a shell. This is fast and useful as a quick sanity check, especially before committing.
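As a quick gate before committing, the validation-only pass is simply:

```shell
relux check
```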

relux run actually executes the tests:

relux run
test result: ok. 1 passed; 0 failed; finished in 12.5ms

You can also run a specific test file:

relux run -f relux/tests/hello.relux

Or a directory of tests:

relux run -f relux/tests/auth/

Best practices

Keep /bin/sh as the default shell

You might be tempted to configure your favorite shell — zsh, fish, bash — as the default in Relux.toml. After all, you use it every day and know its features well.

Resist the temptation. A custom shell means every developer on the team and every CI machine needs that shell installed and configured. /bin/sh is available everywhere, and the operations you need in integration tests — running commands, checking output, setting environment variables — work the same across POSIX shells. The interactive niceties of fancier shells (tab completion, syntax highlighting, advanced globbing) don’t matter when Relux is driving the terminal.
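As a small illustration, the bread-and-butter operations behave identically across POSIX shells — this snippet runs the same under /bin/sh, bash, or zsh:

```shell
# set and use an environment variable
GREETING=hello
echo "$GREETING-relux"

# inspect the exit status of the previous command
true
echo "exit=$?"
```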

Only switch away from /bin/sh if your system under test genuinely requires a specific shell to function.

Leave timeouts at their defaults

The default match timeout of 5 seconds is generous for most commands. You might think “I’ll set timeout to 500 ms to speed up failure detection”. Don’t — not yet.

Timeout tuning is one of those things that should be driven by actual pain, not preemptive optimization. Tight timeouts cause flaky tests on slower machines or under CI load. The defaults are deliberately conservative. When you encounter a specific situation where the default is genuinely wrong — a command that reliably takes 30 seconds — that’s the time to tune. Relux provides fine-grained timeout control at the operator, shell, and test level, which you’ll learn about in later articles.

The shell prompt must be static

The prompt configured in Relux.toml (default: relux> ) must be a fixed, unchanging string. Do not include dynamic elements like timestamps, git branch names, hostnames, or user-specific paths.

Why this matters will become clear in later articles, but the short version is: Relux uses the prompt as a reliable marker in the shell output stream. A prompt that changes between commands — or between machines — makes that marker unpredictable, which leads to flaky or outright broken tests. The default relux> is a good choice: short, distinctive, and the same everywhere.
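To make the contrast concrete, here is a hypothetical dynamic prompt next to the static default in Relux.toml — the dynamic one is exactly what to avoid:

```toml
[shell]
# Bad: varies between commands, machines, and users
# prompt = "user@ci-runner-03 14:02 (main) > "

# Good: a fixed string, identical everywhere (the default)
prompt = "relux> "
```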

Try it yourself

Open relux/tests/hello.relux and experiment:

  1. Change the echo command to print something different. Update the match to expect the new output. Run the test — does it pass?
  2. Change only the match string so it no longer matches the output. Run the test and observe the failure — what does Relux tell you?
  3. Try matching a substring. For example, if you send echo hello-relux, try matching just hello. Does that work with <=?
  4. Add a second > and <= pair below the first. Send a different command (ping something?) and match its output.

The goal is to get comfortable with the edit-run-observe loop. Every test you’ll write in the rest of this series is built from this same foundation: send, match, repeat.


Next: Send, Match, and Logs — a deeper look at the fundamental operators and how to debug failures

Send, Match, and Logs

When you need exact control over what gets sent

The send operator (>) appends a newline to everything you send — just like pressing Enter. Most of the time that is what you want: send a command, let the shell execute it. But sometimes you need to send text without that trailing newline. Maybe you are building up a command from parts, or feeding input to an interactive prompt that does not expect a newline.

Raw send (=>) sends text to the shell exactly as written — no newline appended, nothing added. The shell receives the bytes and waits for more. You can chain multiple raw sends to assemble a command piece by piece:

test "multiple raw sends" {
    shell s {
        => echo one
        => -two
        > -three
        <= one-two-three
    }
}

Three separate operations build a single command:

  1. => echo one sends echo one (no newline — the shell is still waiting)
  2. => -two sends -two (still no newline)
  3. > -three sends -three followed by a newline

The shell now has echo one-two-three\n in its input buffer. It executes the command, and the literal match <= one-two-three picks up the output.

This example comes from tests/relux/tests/operators/send.relux in the Relux source tree (adapted from regex match to literal match for this article).

Reading test output

Let’s run the example and look at what Relux produces beyond the pass/fail result. Create a file relux/tests/raw_send.relux with the example from this article, then run it:

$ relux run -f relux/tests/raw_send.relux
running 1 tests
test raw_send.relux/multiple-raw-sends: |.... ok (5.8 ms)

test result: ok. 1 passed; 0 failed; finished in 5.8 ms

The line starting with test shows the test name, a progress string (|....), the result, and the duration. The progress string is a compact visual trace of what happened during execution:

  • | — a shell was opened
  • . — a send or successful match operation

So |.... means: open shell, then four operations (our two =>s, >, and <=).

The output directory

Every test run writes detailed logs to relux/out/. After running the test above, the directory looks like this (RnwTRJ4AMY is the run ID; yours will be different):

relux/out/
├── latest -> run-2026-03-11-14-04-08-RnwTRJ4AMY
└── run-2026-03-11-14-04-08-RnwTRJ4AMY/
    ├── index.html
    ├── run_summary.toml
    └── logs/
        └── relux/tests/raw_send/
            └── multiple-raw-sends/
                ├── event.html
                ├── s.html
                ├── s.stdin.log
                ├── s.stdin.raw
                ├── s.stdout.log
                └── s.stdout.raw

Each run gets its own directory, named with a timestamp and a random ID. The latest symlink always points to the most recent run — so relux/out/latest/index.html is always the quickest way to the results.

Open relux/out/latest/index.html in a browser. The index page shows a summary table with one row per test: the test name, its result (pass/fail/skip), the duration, and the progress string. For a single passing test this is underwhelming, but when you have dozens of tests and one fails, the index is where you start — scan the results, click the failing test to jump to its event log.

Each test gets an event.html file that records every operation in a timeline: sends, matches, timeouts, shell switches. Each row shows a timestamp (relative to test start), the shell name, the event type, and the event data. Try clicking a timestamp: it takes you to the shell-specific event log, which shows only the events for that shell. Clicking a timestamp in the shell log takes you back to the test log at that exact moment. This is very useful when you want to inspect what happened around a particular event in a specific shell.

For our passing test, the event log has four rows: the two raw sends (echo one and -two), the send of -three (with newline), and the successful match of one-two-three. Since the test only spawns one shell, the shell event log contains almost the same rows.

Alongside the event log, each shell produces four log files:

  • s.stdin.log — every command sent to the shell, with timestamps
  • s.stdout.log — everything the shell printed back, with timestamps
  • s.stdin.raw / s.stdout.raw — the same data but as raw bytes, without timestamps

The .log files are the ones you’ll read most often. Here is what s.stdout.log looks like for our test:

[+0.003s] export PS1='relux> ' PS2='' PROMPT_COMMAND=''
[+0.008s] relux> echo one-two-three
[+0.009s] one-two-three
[+0.009s] relux>

The first line is Relux configuring the shell prompt. Then the shell echoes the command, prints the output, and shows the prompt again. The timestamps let you see exactly when each piece of output arrived.

Error logs

Let’s go back to the simplest possible test — the one we started with:

test "echo and match" {
    shell s {
        > echo hello-relux
        <= hello-relux
    }
}

This sends echo hello-relux and matches the output hello-relux. Now break the test by duplicating the match. Since the first match consumes the “hello-relux” string, we expect the second one to time out:

test "echo and match" {
    shell s {
        > echo hello-relux
        <= hello-relux
        <= hello-relux
    }
}

Run it:

$ relux run -f relux/tests/hello.relux
running 1 tests
test hello.relux/echo-and-match: |... ok (9.7 ms)

test result: ok. 1 passed; 0 failed; finished in 9.7 ms

Wait, what? That is definitely a bug — this should not have worked. Where did the second hello-relux come from? Let’s read the event log for this test (relux/out/latest/logs/relux/tests/hello/echo-and-match/event.html) and look at the two match rows.

The first match did not hit the output line hello-relux — it hit the echoed command echo hello-relux, which contains the substring hello-relux. The second match then found the actual output.

The shell echoes every command you send before printing its result. That echo is part of the output buffer, and <= matches anywhere in it. We have been matching our own commands this whole time!

This is not a bug — it is how PTY shells work. But it changes how you think about matching, and it is exactly what the next article is about.


Next: The Output Buffer — understand the buffer and cursor model that makes matching predictable

The Output Buffer

A working example

The previous article ended with a discovery: the shell echoes every command you send, and <= matches anywhere in the buffer — including the echo. Here is a test that demonstrates how to account for that, consuming every piece of output a command produces:

test "full consumption" {
    shell s {
        > echo hello
        <= echo hello
        <= hello
        <= relux>
    }
}

Four operations, each doing exactly one thing:

  1. > echo hello — send the command.
  2. <= echo hello — match the echoed command.
  3. <= hello — match the actual output.
  4. <= relux> — match the prompt that appears after the command finishes.

After this sequence, the buffer is empty — every byte of the previous command’s lifecycle has been consumed. The shell is ready for the next command, and the buffer is in a known state.

This is the pattern you will use more than any other in Relux: send, match the result, match the prompt. It leaves the buffer clean before the next command.

The output buffer

When Relux spawns a shell, it starts collecting everything the shell prints into a byte buffer — the output buffer. Every character goes in: the prompt setup command, its output, the first prompt, every command echo, every line of output, every subsequent prompt.

When a match operation succeeds, everything up to and including the matched text is consumed — removed from the buffer. Future match operations only see what remains. The matched text is gone; it will never be found again.

The position where the next match starts searching is called the cursor. When a shell is first created, Relux configures the prompt and matches it internally, consuming the setup output. Your test begins with a clean buffer — the cursor sits right at the start, ready for the first command’s echo. Each successful match advances the cursor past the matched text. Everything before the cursor has been consumed and is invisible to future matches.

The cursor in action

Let’s trace the buffer and cursor through the working example. The buffer content stays the same in these diagrams — only the cursor moves. After > echo hello, the buffer contains (simplified for clarity):

echo hello<newline>hello<newline>relux>
^cursor

<= echo hello scans forward from the cursor and matches the echoed command. The cursor advances past the match:

echo hello<newline>hello<newline>relux>
          ^cursor

<= hello scans forward and finds the actual output. The cursor advances:

echo hello<newline>hello<newline>relux>
                        ^cursor

<= relux> scans forward and finds the prompt. The cursor advances past it:

echo hello<newline>hello<newline>relux>
                                      ^cursor

The buffer is fully consumed. The next command’s echo will be the first thing the cursor sees.

What the cursor skips

When the match operator scans forward from the cursor, it does not care what sits between the cursor and the matching text. There may be prompts, blank lines, ANSI escape sequences, output from other commands — the scan skips over all of it, looking only for the pattern.

This means <= is a substring search, not an exact match. It finds the first occurrence of the pattern anywhere in the remaining buffer. If the pattern is short or generic, it might match something you did not intend — a prompt fragment, a piece of the echoed command, leftover output from a previous step. The cursor then lands in an unexpected place, and everything after it is wrong.

The prompt as your anchor

This is why the Getting Started article insisted on a static shell prompt. The prompt string relux> appears in the buffer after every command finishes. It is a reliable, predictable boundary marker.

When you match the prompt after matching a command’s output, you are telling Relux: “I am done with this command. Consume everything up to and including the prompt so the next operation starts at a clean boundary.”

Without that prompt match, the cursor sits somewhere after hello in the output — but before the prompt. The next match would have to scan past the prompt to find anything, and if the pattern happens to match part of the prompt itself, or the stale output, you are in trouble.

The output buffer is the single most important concept of the Relux DSL and runtime. When writing tests, you must always keep track of where the cursor is. Matching the prompt after each command is the simplest way to stay in control.

Buffer reset

Sometimes you genuinely do not care about the output — a command prints a long startup banner, or verbose logging that is irrelevant to the test. For these cases, Relux provides a buffer reset: a <= operator with no pattern. It consumes everything currently in the buffer. It is the equivalent of saying “I don’t care what happened, skip to now.”
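As a sketch, here a deliberately chatty command stands in for a verbose startup banner; the bare <= discards whatever output has arrived so far:

```
test "skip verbose output" {
    shell s {
        > ls /usr/bin
        <=

        > echo done
        <= done
    }
}
```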

We will see in the best practices section below why this operator should be used with caution.

Best practices

Always match the prompt

If you are done examining a command’s output, match the prompt. Every time. This is the single most effective habit for avoiding flaky tests.

Without a prompt match, the cursor floats somewhere in the middle of the output. When the buffer contains output from a previous command that was not fully consumed, any pattern that appears in that leftover output will match there first. The cursor advances to an unexpected position, and subsequent matches silently verify stale data.

Matching the prompt anchors the cursor at a command boundary. It is cheap, it is predictable, and it eliminates an entire class of timing-related failures.

In the next article we will introduce match_prompt() — a built-in function that does exactly this in a single call, so you do not have to type <= relux> every time or hard-code the prompt string in your tests.

Check the exit code

A command can produce the expected output and still fail. Or it can fail silently, producing no output at all, while the match picks up something else entirely. Checking the exit code after a command catches these problems early:

test "verify success" {
    shell s {
        > mkdir -p /tmp/test-dir
        <= relux>
        > echo ==$?==
        <= ==0==
        <= relux>
    }
}

The echo ==$?== / <= ==0== pair is a cheap way to verify the previous command succeeded. The == delimiters are distinctive enough to avoid accidental substring matches. Without this check, a failing mkdir would go unnoticed — the test would continue with a missing directory and fail later with a confusing, unrelated error.
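The $? here is plain POSIX shell, not Relux syntax — you can reproduce the exchange in any terminal:

```shell
mkdir -p /tmp/test-dir   # succeeds, setting $? to 0
echo ==$?==              # prints ==0==
```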

Buffer reset does not respect causality

The buffer reset operator (<= with no pattern) consumes everything currently in the buffer. That “currently” depends on timing — how much output the shell has printed by the instant Relux executes the reset.

If output is still arriving — a command is running, a log line is being flushed — the cursor might land in the middle of a line, or before a line that is about to appear. This creates a race condition: the test might pass on your machine and fail in CI, or pass nine times and fail on the tenth.

In almost every case, there is a better anchor than a buffer reset. Match the prompt. Match a specific log line. Match any known text that marks the boundary you actually care about. These anchors are causal — they mean “this specific thing happened” — rather than temporal — “this is where the buffer happened to be at this moment.”

Only use buffer reset when you are certain all relevant output has already arrived and there is no meaningful boundary to match against.

Try it yourself

Here is a simple test — two echo commands, each followed by a match:

test "double echo" {
    shell s {
        > echo hello
        <= hello

        > echo hello
        <= hello
    }
}

Run this test in your head. For each of the four operations, trace the buffer contents and the cursor position. How many commands would actually be executed in the shell? Write down your predictions.

Then run the test with relux run and open the event log at relux/out/latest/index.html. Compare the event log against your predictions.

Now think about what the right way to write this test would be.


Next: Built-in Functions — meet match_prompt() and the full toolkit of built-in helpers

Built-in Functions

Functions

If you have used any programming language before, functions in Relux will feel familiar. A function is a named operation that you call by writing its name followed by parentheses. Some functions take arguments — values you pass inside the parentheses, separated by commas. Some take no arguments at all.

match_prompt()
match_exit_code(0)

The first line calls match_prompt with no arguments. The second calls match_exit_code with one argument: 0.

Relux ships with a set of built-in functions (BIFs) — functions provided by the runtime that you can use in any test without importing or declaring anything. This article covers the ones you need most often. The remaining built-in functions — string operations, random generation, default values, and system utilities — will be introduced in later articles alongside the language features they complement.

You can also define your own functions, which a later article will cover. For now, all the functions you will see are built-in.

Arity

In Relux, a function is identified by its name and its number of arguments. The number of arguments a function accepts is called its arity.

This means two functions can share the same name as long as they take different numbers of arguments. Relux treats them as separate functions. You will see this with match_not_ok shortly: match_not_ok() (arity 0) and match_not_ok(exit_code) (arity 1) are two distinct functions that do related but different things.

When the article refers to a specific function, it uses the notation name/arity — for example, match_not_ok/0 and match_not_ok/1. This is just a convention for documentation; you do not write it this way in your tests.

The match functions

The previous article established that matching the prompt after each command is the most important habit for writing reliable tests. It also showed a manual way to check the exit code. The match functions automate both of these patterns.

There are five match functions. We will build them up from the simplest to the most convenient, showing what each one does in terms of the operators you already know.

match_prompt() matches the shell prompt — the string configured in Relux.toml (default: relux> ). It is equivalent to:

<= relux>

That is all it does: a literal match for the prompt string. The advantage over writing <= relux> by hand is that match_prompt() always uses the prompt from your project configuration. If you change the prompt in Relux.toml, every match_prompt() call picks up the new value automatically. No find-and-replace across your test files.

Here is the test from the previous article, rewritten with match_prompt():

test "full consumption" {
    shell s {
        > echo hello
        <= echo hello
        <= hello
        match_prompt()
    }
}

The behavior is identical to matching <= relux> — the cursor advances past the prompt, leaving the buffer clean for the next command.

match_exit_code(code) verifies the exit code of the most recently executed command. It is equivalent to:

> echo ::$?::
<= ::0::
<= relux>

(Where 0 is whatever value you passed as the argument.)

It sends echo ::$?:: to the shell — $? is the POSIX variable that holds the exit code of the last command. The :: delimiters are there to prevent accidental substring matches. Then it matches the expected code and the prompt.

Notice that match_exit_code does not match the prompt before sending. It assumes the buffer has already been consumed up to the prompt — either by a previous match_prompt() call or by a manual <= relux>. If you call match_exit_code with unconsumed output still in the buffer, the cursor will scan past all of it to find ::code::. The function will succeed, but you will have skipped over output without examining it — the same problem as a buffer reset.

A typical usage pattern:

test "match_exit_code with zero" {
    shell s {
        > true
        match_exit_code(0)
    }
}

test "match_exit_code with 127 for missing command" {
    shell s {
        > relux_nonexistent_command_42
        match_exit_code(127)
    }
}

The first verifies that true exits with code 0. The second verifies that a nonexistent command exits with 127 — the standard “command not found” code.

Why skip match_prompt() before checking the exit code? Because match_exit_code is a building block. The higher-level functions below combine prompt matching and exit code checking into a single call.

match_ok() is the idiomatic way to assert that a command succeeded. It combines the two functions above:

match_prompt()
match_exit_code(0)

That is it: match the prompt (consuming the command’s output and leaving the buffer clean), then verify the exit code is zero. One function call replaces two, and it reads naturally: “match that the command was OK.”

Here is an example:

test "shell retains state after switching away" {
    shell a {
        > export MY_MARKER=from_a
        match_ok()
    }

    shell b {
        > echo "in shell b"
        <= in shell b
        match_ok()
    }

    shell a {
        > echo $MY_MARKER
        <= from_a
        match_ok()
    }
}

The export command produces no interesting output — you just need to know it succeeded. match_ok() handles that in one call: consume whatever output there was, verify exit code 0, leave the buffer clean for the next shell block. The other two shell blocks first match a specific piece of output, then use match_ok() to consume the rest and verify success.

match_not_ok() is the opposite of match_ok(): it asserts that the previous command failed — that its exit code is anything other than zero. Like match_ok, it matches the prompt first:

<= relux>
> echo ::$?::
# verify the exit code is not ::0::
<= relux>

Use it when you expect a command to fail but don’t care about the specific exit code:

test "match_not_ok after failing command" {
    shell s {
        > false
        match_not_ok()
    }
}

test "match_not_ok after command-not-found" {
    shell s {
        > relux_nonexistent_command_42
        match_not_ok()
    }
}

The first test uses false, which always exits with code 1. The second uses a nonexistent command (exit code 127). In both cases, match_not_ok() passes because the exit code is not zero.

match_not_ok(exit_code) is the arity-1 variant. It asserts that the command failed with a specific non-zero exit code. It matches the prompt first, then verifies that the exit code equals the given value — and that the value is not zero:

<= relux>
> echo ::$?::
# verify the exit code equals the argument AND is not ::0::
<= relux>

This is stricter than match_not_ok/0. If the command exits with a different non-zero code, the test fails. If the command succeeds (exit code 0), the test also fails — even if you passed 0 as the argument.

Use it when the specific failure mode matters:

test "command not found gives 127" {
    shell s {
        > relux_nonexistent_command_42
        match_not_ok(127)
    }
}

Here is a summary of all five match functions:

| Function | Matches prompt first? | Then checks |
|---|---|---|
| match_prompt() | Yes | Nothing else — that’s all it does |
| match_exit_code(code) | No | Exit code equals code |
| match_ok() | Yes | Exit code is 0 |
| match_not_ok() | Yes | Exit code is not 0 |
| match_not_ok(code) | Yes | Exit code equals code and is not 0 |

Control character functions

The match functions deal with text — matching output and checking exit codes. But sometimes you need to send a keystroke that is not a printable character — interrupting a running process with Ctrl+C, closing a pipe with Ctrl+D, or suspending a job with Ctrl+Z. The control character functions send these signals to the shell:

| Function | Key | Signal / Effect |
|---|---|---|
| ctrl_c() | Ctrl+C | Sends SIGINT — interrupts the running foreground process |
| ctrl_d() | Ctrl+D | Sends EOF — signals end of input, closing stdin |
| ctrl_z() | Ctrl+Z | Sends SIGTSTP — suspends the foreground process |
| ctrl_l() | Ctrl+L | Sends form feed — typically clears the terminal screen |
| ctrl_backslash() | Ctrl+\ | Sends SIGQUIT — forcefully terminates the process |

These functions take no arguments and send a single control byte to the shell.

Here is an example that interrupts a long-running command:

test "ctrl_c interrupts a running command" {
    shell s {
        > sleep 60
        ctrl_c()
        match_prompt()
    }
}

The test sends sleep 60 — a command that would run for a minute. Then ctrl_c() interrupts it, just like pressing Ctrl+C in a terminal. Finally, match_prompt() verifies the shell returned to the prompt, confirming the interrupt worked and the shell is ready for the next command.

Another common pattern is using ctrl_d() to close stdin on an interactive program:

test "ctrl_d sends eof to interactive program" {
    shell s {
        > cat
        > hello
        ctrl_d()
        <= hello
        match_ok()
    }
}

This starts cat, which reads from stdin and echoes back. The test sends hello, then closes stdin with ctrl_d(). The cat process exits, and the match picks up the echoed output. match_ok() verifies cat exited cleanly and consumes the prompt.

And ctrl_z() to suspend a process:

test "ctrl_z suspends a process" {
    shell s {
        > sleep 60
        ctrl_z()
        match_prompt()
        > kill %%
        match_ok()
    }
}

The sleep 60 command is suspended by ctrl_z(), returning control to the shell. Then kill %% terminates the suspended job (%% is shell job-control syntax for the current job — the one most recently suspended or backgrounded). match_ok() confirms the kill succeeded.

Utility functions

Beyond matching and control characters, Relux provides three utility built-in functions: sleep, log, and annotate.

sleep(duration) pauses execution for the given duration. The argument is a string like "1s", "500ms", or "2m":

test "wait for service startup" {
    shell s {
        > start-my-service &
        match_prompt()
        sleep("2s")
        > curl http://localhost:8080/health
        <= healthy
        match_ok()
    }
}

The test starts a service in the background, waits two seconds for it to initialize, then checks its health endpoint.

log(message) writes a message to the test’s event log — the same log you see in the HTML report at relux/out/latest/. It appears as a log event row in the event timeline, timestamped alongside sends and matches. This is useful for marking phases of a complex test, recording diagnostic information, or leaving notes for whoever reads the report after a failure.

log("about to start the server")
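
For example, log calls can mark the phases of a longer test (a sketch; the commands and messages are illustrative):

test "log marks phases" {
    shell s {
        log("phase 1: prepare")
        > echo prepare
        <= prepare
        match_ok()

        log("phase 2: verify")
        > echo verify
        <= verify
        match_ok()
    }
}

After a failure, the log events in the HTML report show at a glance which phase the test reached.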

annotate(text) adds a label to the progress output — the compact |.... string you see in terminal output during a test run. Annotations appear inline as named markers, making it easier to see where a test is spending its time when watching a run in real time.

annotate("setup complete")

For example, a test with two annotations might produce progress output like this:

test my_test.relux/server-startup: |...[setup complete]....[server ready].. ok (2.1s)

The annotation text appears between the dots, marking the point in the test where it was called.

The distinction between log and annotate is where the output goes: log writes to the persistent HTML report, annotate writes to the live terminal progress line.


Next: Variables — store, transform, and reuse values with let and ${var}

Variables

Previous: Built-in Functions

So far, every value in the tests has been hardcoded — the command to send, the string to match, the exit code to check. That works for small examples, but real tests need to capture output, pass values between operations, and avoid repeating the same string in multiple places. Variables solve all of these problems.

All values in the Relux DSL are strings. There are no integers, booleans, lists, or other types — just strings. Every variable holds a string, every expression produces a string, every function argument is a string. This is a deliberate design choice that keeps the language simple: when your job is sending text to a shell and matching text coming back, strings are the only type you need.

Variable names must start with a lowercase letter or an underscore, followed by any combination of letters (upper or lower), digits, and underscores. Both snake_case and camelCase are valid names. Names starting with an uppercase letter are reserved for effects, which you will learn about in a later article.

Declaring variables with let

The let keyword declares a variable and optionally binds it to a value:

let name = "relux"
let count = 3
let empty

The first form, let name = "value", is the most common. It declares a variable and sets its value.

The second form, let count = 3, shows that literal numbers can go unquoted. Since all values are strings, 3 is stored as the string "3" — the quotes are optional for numbers. This is why built-in function calls like match_exit_code(1) work without quotes around the argument.

The third form, let empty, declares a variable with no value. It defaults to the empty string "". This is useful when you want to declare a variable early and assign it later.
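
A sketch of the declare-early pattern, using the assignment operator covered later in this article (the values are illustrative):

test "declare early, assign later" {
    shell s {
        let status

        > echo started
        <= started
        match_prompt()

        status = "started"
        > echo "status=${status}"
        <= status=started
        match_ok()
    }
}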

String interpolation with ${var}

Once declared, a variable is referenced with the ${var} syntax. Relux replaces each ${var} with the variable’s current value before the operation executes:

test "interpolation basics" {
    shell s {
        let greeting = "hello"
        let target = "world"
        > echo "${greeting} ${target}"
        <= hello world
        match_ok()
    }
}

Interpolation works everywhere — in send operators, in literal match patterns, and in string expressions passed to functions.
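
For instance, the same variable can drive a send, a match pattern, and a function argument (a sketch; the values are illustrative):

test "interpolation everywhere" {
    shell s {
        let word = "ready"
        > echo "state: ${word}"
        <= state: ${word}
        match_prompt()
        log("observed state ${word}")
    }
}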

If you reference a variable that has not been declared, it interpolates to the empty string — no error, no warning. The text simply disappears:

test "undefined variable is empty" {
    shell s {
        > echo "before${nonexistent}after"
        <= beforeafter
        match_ok()
    }
}

This is a deliberate design choice. It makes environment variable access seamless (you don’t know ahead of time whether a host variable exists), but it also means a typo in a variable name will silently produce an empty string rather than an error.

Everything has a value

Every expression in Relux produces a string value. This means let can capture the result of any expression — not just string literals.

Here is what each expression you have seen so far returns:

| Expression | Returns |
|---|---|
| "hello" | The string itself: hello |
| > command | The interpolated text that was sent |
| => text | The interpolated text that was sent |
| <= pattern | The pattern that was matched |
| match_prompt() | The prompt string |
| match_ok() | The prompt string |
| match_not_ok() | The prompt string |
| match_exit_code(code) | The prompt string |
| ctrl_c(), ctrl_d(), etc. | Empty string |
| log(message) | The message |
| annotate(text) | The annotation text |

Since every expression returns a value, you can capture any of them with let:

test "let from expressions" {
    shell s {
        > echo "status=ok"
        let matched = <= status=ok
        match_prompt()
        > echo "I matched: ${matched}"
        <= I matched: status=ok
        match_ok()
    }
}

The let matched = <= status=ok line does two things at once: it performs the literal match against the output buffer and stores the matched pattern text in the variable matched.

Reassignment

Once a variable is declared with let, you can change its value using the assignment operator = — without the let keyword:

test "reassignment" {
    shell s {
        let x = "before"
        > echo ${x}
        <= before
        match_prompt()

        x = "after"
        > echo ${x}
        <= after
        match_ok()
    }
}

The variable must have been declared with let first. Assigning to an undeclared variable is a runtime error:

test "assign without let fails" {
    shell s {
        # This line will cause a runtime error:
        # "assignment to undeclared variable `x`"
        x = "oops"
    }
}

You can reference the variable’s current value on the right-hand side of an assignment — the old value is read before the new one is written:

test "self-referencing assignment" {
    shell s {
        let x = "foo"
        x = "${x}bar"
        > echo ${x}
        <= foobar
        match_ok()
    }
}

Escaping: the $$ literal

Since ${...} triggers variable interpolation, you need an escape when you want a literal dollar sign. Relux uses $$ — two dollar signs produce one literal $ in the output.

This matters most when you need to send the literal text ${...} to the shell — for example, to reference a shell variable using brace syntax:

test "shell-side brace expansion via dollar escape" {
    shell s {
        > MY_SERVICE=api && echo "$${MY_SERVICE}_port"
        <= api_port
        match_ok()
    }
}

Without the $$, writing > echo "${MY_SERVICE}_port" would trigger Relux interpolation — Relux would look up a variable named MY_SERVICE, find nothing, and send echo "_port" to the shell. The shell would never see the $.

With $$, Relux produces the literal text ${MY_SERVICE}_port, sends it to the shell, and the shell performs its own variable expansion.

You can mix escapes with interpolation in the same expression:

test "dollar escape with variable interpolation" {
    shell s {
        let name = "USD"
        > echo "currency: $$${name}"
        <= currency: $USD
        match_ok()
    }
}

$$ produces $, and ${name} produces USD. The shell receives echo "currency: $USD".

Scoping

Variables in Relux exist at one of two levels: test scope and shell scope.

Test scope — variables declared outside any shell block, directly inside a test block. These are visible to all shells in the test:

test "test-level variable shared across shells" {
    let shared = "from-test"

    shell a {
        > echo "a=${shared}"
        <= a=from-test
        match_ok()
    }

    shell b {
        > echo "b=${shared}"
        <= b=from-test
        match_ok()
    }
}

Both a and b can see shared because it was declared at test level.

Shell scope — variables declared inside a shell block. These live in that shell’s scope and are not visible to other shells:

test "shell-scoped variable" {
    shell a {
        let local = "only-in-a"
        > echo ${local}
        <= only-in-a
        match_ok()
    }

    shell b {
        > echo "local='${local}'"
        <= local=''
        match_ok()
    }
}

The variable local is declared inside shell a. When shell b tries to reference it, ${local} interpolates to the empty string — it simply does not exist in b’s scope.

Shadowing

A shell-scoped variable with the same name as a test-scoped variable shadows it within that shell. The test-scoped value is unchanged and remains visible in other shells:

test "shadowing" {
    let x = "test-level"

    shell a {
        let x = "shadowed-in-a"
        > echo ${x}
        <= shadowed-in-a
        match_ok()
    }

    shell b {
        > echo ${x}
        <= test-level
        match_ok()
    }
}

Shell a declares its own x, which shadows the test-level x inside a. Shell b still sees the original test-level value.

Environment variables

Host environment variables — the ones you see with env or printenv in your terminal — are accessible through the same ${VAR} syntax as Relux variables:

test "access host environment variable" {
    shell s {
        > echo ${HOME}
        <= /
        match_ok()
    }
}

${HOME} is not a Relux variable — no let declared it. Relux checks its own variables first (shell scope, then test scope), and when it finds nothing, it falls through to the host process environment. This works for any environment variable set in the process that runs relux.

Environment variables are global — they are visible in every test, every shell block, every scope. And they are immutable — you cannot reassign them from within the Relux DSL.

A let with the same name creates a Relux variable that shadows the environment variable. However, Relux variable names must start with a lowercase letter or underscore, so uppercase environment variables like HOME or PATH cannot be shadowed — there is no valid Relux variable name that matches them. They are always readable and never obscured.

Environment variables that happen to use a compatible naming scheme (lowercase, snake_case) can be shadowed. In that case, the Relux variable takes priority within its scope, and the environment variable remains accessible in scopes where no shadow exists.
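
Demonstrating this requires a suitably named host variable. Assuming the environment exports deploy_env=staging before relux runs (a hypothetical setup), shadowing behaves like this:

test "shadow a lowercase environment variable" {
    shell a {
        let deploy_env = "override"
        > echo ${deploy_env}
        <= override
        match_ok()
    }

    shell b {
        > echo ${deploy_env}
        <= staging
        match_ok()
    }
}

Shell a sees its own shadow; shell b, where no shadow exists, falls through to the host environment value.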

Relux environment variables

Relux injects several variables into every test run. These are real environment variables — every spawned shell process inherits them, so they are accessible both through ${VAR} in the Relux DSL and through standard shell expansion (e.g., echo $__RELUX_RUN_ID) inside the shell itself. You can pass them to scripts, programs, or any command launched from within the test.

  • ${__RELUX_RUN_ID} — the unique identifier for the current test run
  • ${__RELUX_TEST_ARTIFACTS} — the path to the run’s artifacts/ subdirectory (inside the run directory under relux/out/). This is a good place to store files related to the test run: generated configs, temporary databases, downloaded fixtures, or any other artifacts that should be preserved alongside the test logs.
  • ${__RELUX_SHELL_PROMPT} — the configured shell prompt string
  • ${__RELUX_SUITE_ROOT} — the absolute path to the project root (where Relux.toml lives)
  • ${__RELUX_TEST_ROOT} — the absolute path to the directory containing the current test file
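
For example, ${__RELUX_TEST_ARTIFACTS} can hold files generated during a test (a sketch; the file name and content are illustrative — note that the leading > is the Relux send operator, while the inner > is ordinary shell redirection):

test "write an artifact" {
    shell s {
        > echo "generated-config" > "${__RELUX_TEST_ARTIFACTS}/app.conf"
        match_ok()

        > cat "${__RELUX_TEST_ARTIFACTS}/app.conf"
        <= generated-config
        match_ok()
    }
}

The file is preserved under relux/out/ alongside the run’s logs.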

Try it yourself

Write a test with two shells and the following behavior:

  1. Declare a test-level variable tag with a value like "build-42".
  2. In the first shell, use $$ to set a shell-side environment variable (with export) whose value comes from the Relux tag variable. Verify it was set by echoing it back through $$.
  3. In the second shell, verify that the shell-side export from the first shell is not visible (shells are independent processes), but the Relux tag variable is visible (test-scoped variables are shared).
  4. Back in the first shell, declare a shell-scoped let tag that shadows the test-level one. Verify the shadow is in effect, then switch to the second shell and verify the original test-level value is unchanged.

This exercise combines test-scoped variables, shell independence, $$ escaping, and shadowing — all the pieces from this article.


Next: Regex Matching — match output with regular expressions and extract captured values

Regex Matching

Previous: Variables

The previous articles introduced literal matching (<=) for checking that specific text appears in the output, and variables for storing and reusing values. But there is a gap between the two: how do you extract a part of the output and store it in a variable?

Consider a command that prints a version string like server v3.2.1 started on port 8080. With literal match, you can verify the whole string appeared — but you cannot pull out 3.2.1 or 8080 separately. You might need the port number to connect from another shell, or the version to include in a log message. Literal match gives you all-or-nothing: the entire matched text, or nothing at all.

Regex matching solves this. The <? operator matches output using a regular expression, and capture groups let you extract specific pieces of the matched text into numbered variables that you can use in subsequent operations.

Here is a test that extracts a date from command output and uses each part separately:

test "parse a date" {
    shell s {
        > echo "2026-03-08"
        <? ^(\d{4})-(\d{2})-(\d{2})$
        > echo "year=${1} month=${2} day=${3}"
        <? ^year=2026 month=03 day=08$
    }
}

This test comes from tests/relux/tests/variables/capture_groups.relux in the Relux source tree.

The <? operator matches the output against the regex pattern ^(\d{4})-(\d{2})-(\d{2})$. The three parenthesized groups capture the year, month, and day. After the match, $1, $2, and $3 hold those values — and they can be used in the next send, just like any other variable.

The <? operator

The regex match operator <? works like literal match (<=) in most ways: it scans forward from the cursor, waits up to the timeout for a match, and advances the cursor past the matched text when it succeeds. The difference is in how it interprets the pattern.

Where <= treats its payload as a plain substring to find, <? compiles it as a regular expression. The regex flavor is Rust’s regex crate — a Perl-style syntax without lookaround or backreferences, but with full support for character classes, quantifiers, anchors, alternation, and capture groups. Multi-line mode is enabled by default, so ^ and $ match the start and end of each line, not just the start and end of the entire buffer.

Like <=, the <? operator with an empty pattern acts as a buffer reset — it consumes everything currently in the buffer without matching anything specific.
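
A sketch of the reset form, shown only to illustrate the mechanics — the previous article’s advice against buffer resets applies to <? just as much as to <=:

test "regex buffer reset" {
    shell s {
        > echo noise
        <?

        > echo hello
        <= hello
        match_ok()
    }
}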

A simple regex match looks almost identical to a literal match:

test "basic regex match" {
    shell s {
        > echo hello-relux
        <? ^hello-relux$
    }
}

Capture groups

Parentheses in a regex pattern create capture groups. When the match succeeds, each group’s matched text becomes available through a numbered variable: $1 for the first group, $2 for the second, and so on. $0 holds the full match — everything the regex matched, not just the groups.

Here is a test that shows all three levels — full match, first group, second group:

test "full match via capture group zero" {
    shell s {
        > echo "hello world"
        <? (hello) (world)
        > echo "full='${0}' first='${1}' second='${2}'"
        <? ^full='hello world' first='hello' second='world'$
    }
}

$0 is hello world (the entire matched text), $1 is hello, and $2 is world.

If you access a capture group that does not exist — say $5 when the regex only has one group — it resolves to the empty string, just like an undefined variable:

test "missing capture group returns empty string" {
    shell s {
        > echo "one=1"
        <? ^one=(\d+)$
        > echo "five='${5}'"
        <? ^five=''$
    }
}

Captures are replaced on every match

Each <? match replaces all capture groups from the previous match. If the first match produced $1 and $2, and the second match has only one group, $2 becomes empty — it does not retain its old value:

test "captures overwritten by next match" {
    shell s {
        > echo "key=abc val=xyz"
        <? ^key=(\w+) val=(\w+)$
        > echo "g1=${1} g2=${2}"
        <? ^g1=abc g2=xyz$
        > echo "only=one"
        <? ^only=(\w+)$
        > echo "g1=${1} g2='${2}'"
        <? ^g1=one g2=''$
    }
}

After the second <?, $1 is one and $2 is gone. The captures from the first match are completely discarded.

Saving captures to named variables

Because captures are replaced on every match, you can save a captured value into a named variable with let to keep it around:

test "capture into variable" {
    shell s {
        > echo "key=alpha"
        <? ^key=(\w+)$
        let saved = $1
        > echo "other=beta"
        <? ^other=(\w+)$
        > echo "saved=${saved} current=${1}"
        <? ^saved=alpha current=beta$
    }
}

let saved = $1 reads the current value of $1 (which is alpha) and stores it in a named variable. When the second match replaces captures, $1 becomes beta — but saved still holds alpha.

let with a regex match expression

You can combine let and <? in a single statement. When you write let result = <? pattern, Relux performs the match and assigns the return value to the variable. The return value of a regex match is the full match text — the same as $0:

test "let from match expression captures full match" {
    shell s {
        > echo "code=42"
        let result = <? code=(\d+)
        > echo "result='${result}' group='${1}'"
        <? ^result='code=42' group='42'$
    }
}

result gets code=42 (the full match), while $1 gets 42 (the first capture group). This is the same behavior as the other expressions in the “Everything has a value” table — <? returns the full match text, and let stores it.

Variable interpolation in patterns

Like all operators in Relux, <? supports variable interpolation in its pattern. Variables are resolved before the pattern is compiled as a regex:

test "interpolation in regex pattern" {
    shell s {
        let key = "version"
        > echo "version=42"
        <? ^${key}=(\d+)$
        > echo "captured ${1}"
        <? ^captured 42$
    }
}

The pattern ^${key}=(\d+)$ becomes ^version=(\d+)$ after interpolation.

Best practices

Use regex only when you need it

You might default to <? everywhere since it is strictly more powerful than <= — any literal match can be written as a regex. But regex matches are harder to read, easier to get wrong, and can match more than you intended.

Literal match <= is a simple substring search. It does exactly one thing and it is obvious what it matches. When you do not need capture groups, anchors, or wildcards, <= is the better choice. Reserve <? for when you genuinely need regex capabilities: extracting values, matching variable output, or anchoring to line boundaries.

Always save captures to named variables

Capture groups like $1 are convenient — you match a pattern, and the extracted value is right there. It is tempting to use $1 directly in several places without saving it to a named variable first.

The problem is not with the code as you write it today. The problem is with the code as someone changes it five years from now. Test code is still code — it evolves, gets refactored, gets extended. Capture groups are silently replaced on every <? match. If someone inserts a new regex match between your capture and its use — a perfectly reasonable edit — $1 now refers to something completely different. No error, no warning, just a test that fails in a confusing way that takes hours to debug.

Save the capture to a named variable immediately after the match, before doing anything else. Then use the named variable everywhere:

# Fragile — $1 can be silently replaced by a later edit:
<? ^port=(\d+)$
> curl http://localhost:${1}/health

# Durable — the port is safe no matter what happens next:
<? ^port=(\d+)$
let port = $1
> curl http://localhost:${port}/health

The named variable survives any number of subsequent matches. It makes the code self-documenting (the name port says more than $1), and it insulates the test from future edits.

Anchor your patterns

A regex without anchors will match anywhere in the remaining buffer — the echoed command, a fragment of the prompt, leftover output from a previous step. This is the same problem as with literal match, but worse, because regex metacharacters like . and * match more broadly.

Use ^ and $ to pin your match to a specific line:

# Might match the echoed command or something unexpected:
<? version=\d+

# Matches exactly one complete line:
<? ^version=\d+$

This does not mean you should anchor every pattern — sometimes a substring regex is what you need. But when you have a choice, anchoring is safer: it documents your intent and prevents accidental matches.

Be careful with interpolated regex patterns

Variable interpolation in <? patterns lets you define reusable regex fragments — declare a pattern once at the test level and use it in multiple matches. This is handy for repeated patterns like timestamps, UUIDs, or version strings.

The catch is that after interpolation, the variable’s value becomes part of the regex. If the value contains regex metacharacters — ., *, +, (, [, and so on — they are interpreted as regex syntax, not as literal text. A pattern built from a variable holding 192.168.1.1 does not match only that literal IP address: each . matches any character, so the pattern also matches 192X168Y1Z1.

When the variable comes from your own let and you know the value, this is fine — just be aware of what you are putting into the pattern. When the variable comes from captured output or an environment variable, the content is unpredictable and the regex may compile into something you did not intend, or fail to compile entirely.
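
A sketch showing the hazard (the address and output are illustrative):

test "dot matches more than a literal dot" {
    shell s {
        let ip = "192.168.1.1"
        > echo "host 192X168Y1Z1 reachable"
        <? host ${ip}
        match_prompt()
    }
}

The match succeeds even though the output never contains the literal address: after interpolation the pattern is host 192.168.1.1, and each . matches any character — including the X, Y, and Z in the echoed text.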

Try it yourself

Write a test that does the following:

  1. Run a command that produces output with two key-value pairs on the same line — something like echo "host=db.local port=5432".
  2. Use a single <? with two capture groups to extract both values into $1 and $2.
  3. Immediately save both captures to named variables (let host = $1, let port = $2).
  4. Run another command that produces different output and match it with <? — this will overwrite the capture groups.
  5. Verify that the named variables still hold the original values by echoing them back and matching the result.

This exercise combines capture groups, the save-to-variable pattern, and the ephemeral nature of captures — all the pieces from this article.


Next: Functions — extract reusable test logic into named, parameterized functions

Functions

Previous: Regex Matching

The previous articles covered the core toolkit for interacting with a shell: sending commands, matching output, calling built-in functions, storing values in variables, and extracting data with regex. With these tools you can write any test — but you will quickly find yourself repeating the same sequences of operations across tests. A health check that sends an HTTP request and verifies the status code. A login sequence that types a username, a password, and waits for a prompt. A cleanup step that kills a background process.

Relux lets you extract these sequences into functions — named, reusable blocks of test logic that you define once and call from any test:

fn check_status(url) {
    check_status(url, "")
}

fn check_status(url, params) {
    check_status(url, params, 200)
}

fn check_status(url, params, expected) {
    > curl -s -o /dev/null -w "%{http_code}\n" "${url}${params}"
    <? ^${expected}$
    match_ok()
}

Three definitions of the same function with different numbers of parameters. You can call it three different ways:

check_status("http://localhost:8080/health")
check_status("http://localhost:8080/users", "?page=1")
check_status("http://localhost:8080/admin", "", 403)

One name, three levels of detail.

Defining a function

A function definition starts with the fn keyword, followed by a name, a parameter list in parentheses, and a body in braces:

fn greet() {
    > echo "hello from fn"
    <? ^hello from fn$
    match_ok()
}

This defines a function called greet that takes no arguments. Its body sends a command, matches the output, and consumes the prompt — the same operators you use directly in shell blocks.

Function names must be snake_case — lowercase letters, digits, and underscores. This is enforced by the parser. If you try to define a function with an uppercase name like CheckStatus, Relux will reject the file.
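
For instance (a minimal sketch — both definitions are invented, and a file containing the first would be rejected outright):

// Rejected by the parser — uppercase letters are not snake_case:
fn CheckStatus() {
    > echo unreachable
}

// Accepted — lowercase letters, digits, and underscores:
fn check_status_2() {
    > echo ok
    <? ^ok$
    match_ok()
}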

Parameters go inside the parentheses, separated by commas:

fn say(msg) {
    > echo "${msg}"
    <? ^${msg}$
    match_ok()
}

fn add_label(prefix, value) {
    > echo "${prefix}: ${value}"
    <? ^${prefix}: ${value}$
    match_ok()
}

Calling a function

If you have been following the series, you already know how to call functions — the syntax is the same as for built-in functions. Write the name, followed by arguments in parentheses:

test "call function with multiple arguments" {
    shell s {
        add_label("status", "ok")
    }
}

Function calls can only appear inside shell blocks. This makes sense once you understand the execution model: a function’s body contains shell operators like > and <? that need an active shell to operate on.

Arity-based dispatch

You saw arity with built-in functions — match_not_ok() and match_not_ok(code) are two separate functions that share a name. The same mechanism works for user-defined functions. Relux identifies every function by its (name, arity) pair, so you can define multiple versions with different parameter counts:

fn greet() {
    > echo "hello"
    <? ^hello$
    match_ok()
}

fn greet(name) {
    > echo "hello, ${name}"
    <? ^hello, ${name}$
    match_ok()
}

fn greet(name, title) {
    > echo "hello, ${title} ${name}"
    <? ^hello, ${title} ${name}$
    match_ok()
}

test "arity dispatch" {
    shell s {
        greet()
        greet("alice")
        greet("alice", "Dr.")
    }
}

Each call resolves to the definition with the matching number of parameters.

Arity dispatch is more powerful than it might first appear. Relux has no built-in branching or conditionals in function bodies. You cannot write if params == "" to check whether an argument was provided. Arity dispatch is the language’s answer to default parameters: instead of one function that checks for empty strings internally, you write multiple definitions at different arities and have the simpler ones delegate to the fuller one.

The check_status function from the opening is the canonical example. Each definition defaults exactly one parameter and delegates to the next arity up:

  • check_status/1 fills in an empty query string and delegates to check_status/2
  • check_status/2 fills in the default expected status code of 200 and delegates to check_status/3
  • check_status/3 does the actual work

This progressive chain means each default value appears exactly once. If the default expected code ever changes from 200 to something else, you update one line in check_status/2. No value is duplicated across definitions.

Notice that number literals like 200 and 403 are unquoted — since all values are strings, Relux accepts bare numbers and stores them as strings.

The caller’s shell

A function does not get its own shell. When you call a function from inside a shell block, the function’s body executes in the caller’s shell — the same PTY session that the call site is running in. Every > sends to that shell, every <? matches against that shell’s output buffer.

This means a function can see everything the caller has done to the shell:

fn check_shell_var() {
    > echo $$MY_VAR
    <? ^caller_state$
    match_ok()
}

test "function executes in caller shell" {
    shell s {
        > export MY_VAR=caller_state
        match_ok()
        check_shell_var()
    }
}

The test exports an environment variable in shell s, then calls check_shell_var(). The function runs in the same shell — it reads MY_VAR and finds caller_state. There is no argument passing or special plumbing; the function simply shares the caller’s PTY session.

Scoping

Functions share the caller’s shell, but they do not share the caller’s variables. The isolation goes both ways: the function cannot see the caller’s variables, and the caller cannot see the function’s variables. The only variables available inside a function are its own parameters, anything it declares with let, and environment variables.

fn try_read_secret() {
    > echo "secret='${secret}'"
    <? ^secret=''$
    match_ok()
}

test "function cannot see caller variables" {
    shell s {
        let secret = "caller-only"
        try_read_secret()
    }
}

The caller declares secret, but inside try_read_secret() the variable ${secret} resolves to the empty string. It does not exist in the function’s scope. If a function needs a value from the caller, it must be passed as an argument.

fn say(msg) {
    > echo "${msg}"
    <? ^${msg}$
    match_ok()
}

test "function variables do not leak to caller" {
    shell s {
        say("test")
        > echo "msg='${msg}'"
        <? ^msg=''$
    }
}

After say("test") returns, ${msg} in the caller’s scope resolves to the empty string. The parameter msg existed only inside the function.

fn shadow_x() {
    let x = "from-function"
    > echo "inside: x=${x}"
    <? ^inside: x=from-function$
    match_ok()
}

test "function let does not mutate outer variable" {
    shell s {
        let x = "outer"
        shadow_x()
        > echo "x=${x}"
        <? ^x=outer$
    }
}

The function’s let x creates a local variable within the function’s own scope. The caller’s x remains "outer" after the call.

The mental model is simple: scope isolation is bidirectional. The only shared state is the shell itself — the PTY session, the running processes, the shell-side environment variables.

Return values

The variables article introduced the idea that every expression produces a value. Functions follow the same principle: a function’s return value is the value of its last expression. If the caller does not capture it, the return value is silently discarded.

fn make_label(prefix, value) {
    "${prefix}:${value}"
}

test "expression as return value" {
    shell s {
        let label = make_label("key", "val")
        > echo "${label}"
        <? ^key:val$
    }
}

A bare string expression at the end of the body becomes the return value. A let statement also produces a value — the value it assigns — so a function ending with let returns that value too. An empty function returns the empty string.
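
Both cases can be sketched with syntax already covered (the function names are arbitrary):

fn ends_with_let(v) {
    // let produces the value it assigns, so this is also the return value
    let stored = v
}

fn does_nothing() {
}

test "let and empty-body returns" {
    shell s {
        let a = ends_with_let("payload")
        let b = does_nothing()
        > echo "a=${a} b='${b}'"
        <? ^a=payload b=''$
        match_ok()
    }
}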

A common pattern is a function that runs a command, matches the output with <?, and returns a captured value:

fn capture_version() {
    > echo "version=3.2.1"
    <? ^version=(.+)$
    let ver = $1
    match_ok()
    ver
}

test "capture return value from match" {
    shell s {
        let ver = capture_version()
        > echo "got=${ver}"
        <? ^got=3.2.1$
    }
}

The regex match sets $1 to 3.2.1. The function saves the capture, cleans up the shell with match_ok(), and returns the saved value as the last expression.

As later articles introduce new kinds of expressions, they will note what value each one returns — following the same pattern as the everything-has-a-value table.

Functions calling functions

Functions can call other functions. Return values chain naturally — each function captures the result of the one it called and builds on it:

fn make_prefix(tag) {
    "[${tag}]"
}

fn log_msg(tag, msg) {
    let pfx = make_prefix(tag)
    > echo "${pfx} ${msg}"
    <? ^\[${tag}\] ${msg}$
    match_ok()
}

test "nested function uses helper return value" {
    shell s {
        log_msg("WARN", "check this")
    }
}

log_msg calls make_prefix to build a formatted tag, then uses the result in a send. Each function has its own scope, so tag in make_prefix and tag in log_msg do not collide — even though they happen to have the same name.

Return values can chain through multiple levels:

fn depth_a(x) {
    let val = depth_b(x)
    "${val}-a"
}

fn depth_b(x) {
    let val = depth_c(x)
    "${val}-b"
}

fn depth_c(x) {
    "${x}-c"
}

test "nested function return value chains" {
    shell s {
        let result = depth_a("root")
        > echo "${result}"
        <? ^root-c-b-a$
    }
}

depth_a calls depth_b, which calls depth_c. Each function appends its suffix to the result. The final value, root-c-b-a, traces the entire call chain.

Best practices

Captures do not survive function calls

You might call a function that internally uses <? and expect the capture groups ($1, $2, …) to be available in the caller afterward. This seems reasonable — the function ran a regex match, and captures are normally available after <?.

But captures are part of the variable scope. When a function returns, its entire scope — including captures — is discarded. The caller’s captures are restored to whatever they were before the call:

fn extract_port() {
    > echo "port=8080"
    <? ^port=(\d+)$
    // The last expression is match_ok(), whose return value is the
    // prompt string — not the captured port number.
    match_ok()
}

test "captures do not survive function calls" {
    shell s {
        // Wrong — $1 holds the caller's capture state, not the function's:
        extract_port()
        > echo "port=${1}"
        <? ^port=8080          // $1 is empty

        // Also wrong — the return value is the prompt string, because
        // match_ok() is the last expression in extract_port():
        let result = extract_port()
        > echo "result=${result}"
        <? ^result=8080        // result is the prompt, not "8080"
    }
}

The fix is to design the function to explicitly return what you need. Save the capture to a local variable before calling match_ok(), then return that variable as the last expression:

fn extract_port() {
    > echo "port=8080"
    <? ^port=(\d+)$
    let port = $1
    match_ok()
    port
}

Now let port = extract_port() in the caller gives you "8080".

This is consistent with the scoping model: functions cannot modify the caller’s variable state. Return values are the explicit, reliable channel for passing data back.

Leave the shell clean

When a function interacts with the shell — sending commands and matching output — it should leave the shell in a known state before returning. That means: consume the prompt and verify the exit code with match_ok() (or the appropriate match_not_ok variant) after the last command.

# Leaves the shell in an unknown state — the caller must
# know what output is left in the buffer:
fn check_server() {
    > curl -s http://localhost:8080/health
    <= healthy
}

# Leaves the shell clean — prompt consumed, exit code verified:
fn check_server() {
    > curl -s http://localhost:8080/health
    <= healthy
    match_ok()
}

A function that leaves unconsumed output or an unchecked exit code forces every caller to clean up after it. That coupling is invisible and fragile — it works until someone adds a new caller that forgets, or the function’s output changes slightly. Close every shell interaction with a clean handoff.

Do not rely on a shared shell state

The caller and the function share a shell session. This means the function can read shell-side environment variables set by the caller, and the caller can read shell-side state left behind by the function. Both directions are tempting shortcuts — and both lead to brittle tests.

A function cannot predict the shell state of all its callers. Some callers have not been written yet. If a function depends on a shell-side variable that the caller must set beforehand, the requirement is invisible — nothing in the function signature or call site reveals it. Pass the value as a parameter instead.
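
A hedged before/after sketch (the API_HOST variable and /ping endpoint are invented; under arity dispatch both definitions could technically coexist, but treat them here as alternatives):

// Brittle — silently requires every caller to have exported API_HOST:
fn ping_api() {
    > curl -s "http://$$API_HOST/ping"
    <= pong
    match_ok()
}

// Explicit — the dependency is visible in the signature and at every call site:
fn ping_api(host) {
    > curl -s "http://${host}/ping"
    <= pong
    match_ok()
}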

In the other direction, a caller that depends on shell-side state set by a function is coupled to the function’s implementation details. If the function’s internals change — a different variable name, a different order of commands — the caller silently breaks.

If you genuinely cannot avoid relying on shared shell state, make it explicit with a comment at both the definition and call site explaining the dependency. But first, consider whether a parameter or return value would work instead.

Keep functions small

A function runs in the caller’s shell, so a long function body means a long sequence of sends and matches executing in someone else’s shell session. When something fails halfway through a 30-line function, the error points to a line inside the function — but understanding why it failed requires knowing what the caller’s shell looked like at the time of the call.

Prefer small functions that do one thing: check a status code, verify a service is running, send a login sequence. If you find a function growing beyond a handful of operations, consider splitting it into smaller pieces — so each has a clear, narrow purpose.

Try it yourself

Write a function run_and_capture with two arities:

  • run_and_capture(cmd) — runs a shell command and returns the first line of its output. Leaves the shell clean.
  • run_and_capture(cmd, pattern) — same, but uses a custom regex pattern instead of matching any line. The one-argument version delegates to this one.

Then write a test that exercises both arities and verifies the return values.


Next: Timeouts — control how long Relux waits for output, from individual matches to entire test suites

Timeouts

Previous: Functions

Every match operation in Relux has a timeout — a maximum duration to wait for the expected output to appear. If the output does not arrive in time, the test fails. So far, the tutorials have relied on the default timeout from Relux.toml without thinking about it. That works for simple cases, but real test suites need more control: some commands respond in milliseconds, others take seconds, and some tests must enforce strict time boundaries on the system under test.

Relux draws a sharp line between two kinds of timeout. A tolerance timeout (~) says “be patient for this long” — it absorbs environmental variability and scales with the --timeout-multiplier flag. An assertion timeout (@) says “the system must respond within this time” — it is a correctness check and never scales. The prefix determines the intent, not the position: both ~ and @ work at every level — config defaults, shell scope, inline overrides, and test definitions.

test "layered timeouts" @40s {
    shell s {
        ~10s
        > slow_startup_command
        <? ready

        @2s
        > fast_command
        <? done

        > very_slow_query --timeout 25
        <~28s? ^query complete$
    }
}

The config sets a default match timeout. Inside the shell, ~10s raises the tolerance timeout to 10 seconds for the startup command — if CI is slow, the multiplier can stretch this further. Then @2s switches to an assertion timeout: the fast_command must respond within 2 seconds regardless of environment. The final match uses <~28s? to set a one-shot tolerance override for just that operation. The test itself has @40s — an assertion that the entire test must complete within 40 seconds, multiplier or not.

Config defaults

The [timeout] section in Relux.toml controls three values:

[timeout]
match = "5s"
test = "5m"
suite = "10m"

match is the default timeout for every match operation — <=, <?, and their variants. When a match operator waits for output, this is how long it waits. Defaults to 5s if not specified.

test is the maximum duration for a single test. If a test exceeds this limit, Relux aborts it and reports a timeout failure. Defaults to 5m.

suite is the maximum duration for the entire test run. If the suite exceeds this limit, Relux aborts the remaining tests. Defaults to 10m.

All three config timeouts are tolerances — they are scaled by --timeout-multiplier.

--timeout-multiplier

Different environments run at different speeds. A test suite that passes in 2 seconds on a developer laptop might need 6 seconds on an overloaded CI server. Rather than hardcoding generous timeouts everywhere, Relux provides a multiplier:

relux run --timeout-multiplier 3.0
relux run -m 3.0

The multiplier scales every tolerance timeout (~) by the given factor. With -m 3.0 and a config of match = "5s", every match operation defaults to 15 seconds. A shell-scoped ~2s becomes 6 seconds. Config test and suite timeouts are scaled the same way.

Assertion timeouts (@) are never scaled. They express exact intent about the system under test — stretching them would weaken the assertion. If a test says @2s, the system must respond within 2 seconds whether you are running on a laptop or a loaded CI box.

The ~ operator

The ~ operator sets a tolerance timeout for the current shell, overriding the config default:

test "scoped timeout allows delayed output" {
    shell s {
        ~3s
        > sh -c 'sleep 1 && echo delayed'
        <? ^delayed$
    }
}

The ~3s sets the timeout to 3 seconds. Every match operation after it — <?, <=, and their variants — uses 3 seconds instead of the config default. The change persists until another timeout operator replaces it:

test "scoped timeout overrides previous timeout" {
    shell s {
        ~200ms
        ~3s
        > sh -c 'sleep 1 && echo delayed'
        <? ^delayed$
    }
}

The first ~200ms would be too short for the command, but the second ~3s replaces it before the match runs.

The ~ operator accepts milliseconds (~200ms), seconds (~3s), minutes (~2m), and compound durations (~1m30s).

Because ~ is a tolerance timeout, it is scaled by --timeout-multiplier. With -m 2.0, a ~3s becomes 6 seconds.

The @ operator

The @ operator sets an assertion timeout for the current shell. It works exactly like ~ in terms of scope and persistence, but it is never scaled:

test "assertion timeout in shell scope" {
    shell s {
        @2s
        > echo hello
        <? ^hello$
    }
}

The @2s sets a 2-second assertion timeout. Every match after it must complete within 2 seconds — no multiplier adjustment, no environmental slack. Use @ when the time boundary is part of what you are testing: “the system must respond within X.”

You can switch between ~ and @ freely within a shell. Each one replaces the previous timeout, regardless of kind:

test "mixing tolerance and assertion" {
    shell s {
        ~3s
        > startup_command
        <? ready

        @1s
        > echo fast
        <? ^fast$

        ~5s
        > slow_command
        <? ^done$
    }
}

The startup match uses a 3-second tolerance. The echo fast match uses a 1-second assertion. The final match switches back to a 5-second tolerance.

Inline overrides

Sometimes a single operation needs a different timeout without changing the shell’s default. The <~ and <@ prefixes add a one-shot timeout to any match operator:

test "inline timeout overrides scoped timeout for regex" {
    shell s {
        ~200ms
        > sh -c 'sleep 1 && echo delayed_regex'
        <~3s? ^delayed_regex$
    }
}

The shell timeout is 200ms — far too short for a command that sleeps a full second before producing output. But <~3s? overrides the timeout for just this one match. The next match after it reverts to the 200ms shell timeout:

test "inline timeout is one-shot" {
    shell s {
        ~200ms
        > sh -c 'sleep 1 && echo delayed'
        <~3s? ^delayed$
        > echo immediate
        <? ^immediate$
    }
}

The <~3s? match waits up to 3 seconds. The <? ^immediate$ that follows uses the shell’s 200ms timeout — the override did not persist.

Both prefixes work with both match operators:

Operator         Meaning
<~[duration]?    Regex match with tolerance override (scaled)
<~[duration]=    Literal match with tolerance override (scaled)
<@[duration]?    Regex match with assertion override (not scaled)
<@[duration]=    Literal match with assertion override (not scaled)

The prefix only changes the timeout. Everything else about the operator stays the same — you can use captures, variable interpolation, and all other features exactly as before:

test "inline timeout with variable interpolation" {
    shell s {
        ~200ms
        let word = "interp_val"
        > sh -c 'sleep 1 && echo interp_val'
        <~3s? ^${word}$
    }
}

Use <@ when a single match is an assertion about response time:

test "assertion timeout inline regex match" {
    shell s {
        ~200ms
        > sh -c 'sleep 1 && echo assert_regex'
        <@3s? ^assert_regex$
    }
}

The <@3s? asserts the system responds within 3 seconds. The multiplier will not stretch it.

Test-level timeout

A test can declare its own timeout directly in the definition, using either prefix:

test "tolerance on test" ~30s {
    shell s {
        > echo hello
        <? ^hello$
    }
}

test "assertion on test" @3s {
    shell s {
        > echo hello
        <? ^hello$
    }
}

The ~30s is a tolerance — scaled by the multiplier, it says “be patient for 30 seconds.” The @3s is an assertion — never scaled, it says, “this test must complete within 3 seconds or the system is broken.”

Consider testing Relux’s own timeout mechanism. You want to verify that a shell-level timeout of 1 second actually fires:

test "shell timeout fires within bound" @5s {
    shell s {
        ~1s
        > sleep 999
        <? ^this will never appear$
    }
}

The inner ~1s timeout should fire after 1 second when the match fails. The outer @5s test timeout is the assertion: if 5 seconds pass and the inner timeout somehow did not fire, the system is broken. Without the test-level assertion timeout, a bug in the timeout mechanism would cause the test to hang forever.

If neither prefix is used on the test definition, the config test timeout applies (default: 5m). A test-level timeout — whether ~ or @ — overrides the config value.

Timeout scoping across function calls

When you call a function, the function inherits the caller’s current timeout. When the function returns, the timeout reverts to what the caller had before the call:

fn slow_operation() {
    ~10s
    > long_running_command
    <? ^done$
    match_ok()
}

test "timeout reverts after function call" {
    shell s {
        ~2s
        slow_operation()
        # Back to 2s here — the function's ~10s did not persist
        > echo quick
        <? ^quick$
    }
}

The caller sets ~2s. Inside slow_operation(), ~10s changes the timeout — but only within the function’s scope. When the function returns, the caller’s 2-second timeout is restored.

The timeout lives on the shell — it is part of the shell’s own state, like the output buffer or the running processes. Reverting the timeout on function return prevents accidental side effects: a function can adjust the timeout for its own operations without forcing the caller to save and restore the previous value manually.

If a function does not set its own timeout, it uses whatever the caller had:

fn check_output() {
    > echo test
    <? ^test$
    match_ok()
}

test "function inherits caller timeout" {
    shell s {
        ~10s
        check_output()
        # check_output used the 10s timeout for its match
    }
}

This scoping applies equally to ~ and @ timeouts. A function that sets @1s does not change the caller’s timeout kind when it returns — the caller gets back exactly what it had, whether that was a tolerance or an assertion.

Precedence

When a match operation runs, Relux resolves the timeout using this precedence chain:

Priority      Source                  Example          Scaled by -m?
1 (highest)   Inline tolerance        <~3s? pattern    Yes
1 (highest)   Inline assertion        <@3s? pattern    No
2             Shell scope tolerance   ~2s              Yes
2             Shell scope assertion   @2s              No
3 (lowest)    Config default          match = "5s"     Yes

The first one that applies wins. If there is no inline override, the shell scope is used. If no ~ or @ has been set, the config default applies.

Separately, the test-level timeout (test "name" ~5s or test "name" @3s) and the config test/suite timeouts operate as outer boundaries — they cap the total duration of a test or run, independent of which match timeout is in effect.
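
The whole chain can be exercised in one shell (a sketch assuming the config default match = "5s"):

test "timeout precedence walk-through" {
    shell s {
        # No ~ or @ set yet — the 5s config default applies:
        > echo a
        <? ^a$

        # Shell scope replaces the config default:
        ~2s
        > echo b
        <? ^b$

        # Inline override wins, for exactly this one match:
        > echo c
        <@10s? ^c$

        # The one-shot override is gone — back to the 2s shell scope:
        > echo d
        <? ^d$
    }
}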

Best practices

Use the multiplier for CI flakiness, not longer timeouts

When tests start failing on CI but pass locally, the tempting fix is to increase the timeouts in the test files. A ~2s becomes ~5s, then ~10s, and soon every test has generous timeouts that mask real performance regressions.

The multiplier exists for this problem. Keep your timeouts tight — reflecting how fast the system should respond — and use -m 2.0 or -m 3.0 on slow environments. This way, timeouts still catch genuine slowdowns on the developer’s machine while tolerating CI variability.

Choose the prefix, not the position

The ~ vs @ prefix is what determines whether a timeout is environmental tolerance or a system assertion. Both prefixes work at every level — shell scope, inline override, and test definition. Ask yourself: “is this about the environment or about the system?”

  • The CI server is slow → use ~ (tolerance), let -m scale it
  • One specific command is slower than the rest → use ~ with a larger value, or <~ on the match
  • The system must respond within 2 seconds → use @2s or <@2s?
  • The entire test must complete within a bound → use test "name" @5s

Reserve @ for real assertions

If you put @ on everything, the multiplier becomes useless — nothing scales, and slow environments fail. Use @ only when the time boundary is genuinely part of what you are testing. Most timeouts in a typical test suite should be ~ tolerances, with @ reserved for the few cases where timing is the assertion.

Try it yourself

Write a test that exercises both kinds of timeout:

  1. Use ~ to set a shell-scoped tolerance timeout long enough for a sleep 0.5 && echo done command
  2. Add an @ assertion timeout on the test definition — the whole test must finish within a strict bound
  3. Add a second match using <@ with an inline assertion timeout for a fast command
  4. Run the test, then try adding -m 0.5 to halve the tolerance timeouts — notice which timeouts shrink and which stay fixed

Next: Fail Patterns — continuous monitoring for errors with !? and !=

Fail Patterns

Previous: Timeouts

So far, every check in a test has been explicit: you send a command, then match the output you expect. But what about output you don’t expect? An ERROR buried in a log stream, a Segfault from a crashing service, a PANIC from an unhandled exception — these can appear at any point, and you can’t predict exactly when. Writing a match for every line of output just to catch them would be impractical.

Fail patterns solve this. They set up a background monitor on a shell’s output: if the pattern ever appears, the test fails immediately — no matter where you are in the test. Think of them as a tripwire stretched across the output stream.

test "service stays healthy" {
    shell server {
        !? FATAL|ERROR|panic
        > start-my-service --foreground
        <? listening on port 8080
    }

    shell client {
        > curl http://localhost:8080/health
        <? 200 OK
        match_prompt()
    }
}

The !? on line 3 sets a regex fail pattern on the server shell. From that point forward, every piece of output from that shell is checked against FATAL|ERROR|panic. If any of those strings appear — in the service’s startup logs, in background output while the client shell runs its health check, anywhere — the test fails on the spot. The match operators check for output you expect; the fail pattern watches for output you don’t.

Regex fail patterns with !?

The !? operator sets a regex fail pattern:

shell s {
    !? [Ee][Rr][Rr][Oo][Rr]
    > echo "all good"
    <? ^all good$
    match_prompt()
}

The pattern [Ee][Rr][Rr][Oo][Rr] is a regular expression — the same regex syntax you use with <?. This one matches “error” in any mix of upper and lower case. As long as the shell’s output doesn’t contain a match, the test proceeds normally. The moment it does, the test fails.

Relux checks the fail pattern every time a new piece of shell output arrives in the output buffer. As the shell prints data — command output, log lines, error messages — each chunk is checked against the active fail pattern before anything else happens.

When a fail pattern matches, Relux reports exactly what triggered it — the pattern, the matched text, and the shell name — so you can diagnose the problem quickly.

Literal fail patterns with !=

If your error string doesn’t need regex, use != for a literal (substring) match:

shell s {
    != FATAL ERROR
    > echo "all good"
    <? ^all good$
    match_prompt()
}

This watches for the exact substring FATAL ERROR in the output. No regex interpretation — dots, brackets, and other special characters are matched literally. Use != when the string you’re watching for contains regex metacharacters and you don’t want to escape them, or when you simply don’t need pattern matching.

Both !? and != behave identically in every other way: same checking points, same single-slot rule, same scoping.

One pattern at a time

Each shell has a single fail pattern slot. Setting a new fail pattern — whether regex or literal — replaces whatever was there before:

shell s {
    !? first_pattern
    !? second_pattern
    > echo "first_pattern is fine now"
    <? ^first_pattern is fine now$
    match_prompt()
}

After line 3, only second_pattern is active. The first pattern is gone. This test passes because first_pattern in the output no longer triggers a failure.

The replacement works across types too. A != replaces a !?, and vice versa:

shell s {
    !? first_pattern
    != second_pattern
    > echo "first_pattern is fine now"
    <? ^first_pattern is fine now$
    match_prompt()
}

Fail patterns do not stack. There is always at most one active fail pattern per shell.
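
If you need to watch for several error strings at once, fold them into a single regex alternation, as the opening example did (run_checks stands in for any command):

shell s {
    # One slot, several tripwires — alternation keeps them all active:
    !? FATAL|ERROR|panic|Segfault
    > run_checks
    <? ^all checks passed$
    match_prompt()
}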

Immediate buffer rescan

When you set a fail pattern, Relux doesn’t just watch for future output — it immediately rescans the existing output buffer for the new pattern. If the buffer already contains a match, the test fails right then.

This matters for ordering. Consider:

shell s {
    > echo "ERROR: something went wrong"
    <= something went wrong
    match_prompt()
    !? ERROR
}

The <= on line 3 scans forward from the cursor and finds something went wrong in the echoed command — consuming everything up to and including that first occurrence. But the actual command output ERROR: something went wrong is still in the buffer, unconsumed. When !? is set on line 5, Relux rescans the buffer and finds ERROR in that remaining output. The test fails.

The takeaway: set your fail pattern before generating output that might match it. The natural place is at the top of a shell block.

Variable interpolation

Fail pattern payloads support variable interpolation, just like other operators:

shell s {
    let bad = "PANIC"
    !? ${bad}
    > echo "no panic here"
    <? ^no panic here$
    match_prompt()
}

The pattern is interpolated at the moment the !? statement executes. After interpolation, the resulting string is compiled as a regex (for !?) or used as a literal substring (for !=).

Watch out with !?: if the interpolated variable contains regex metacharacters like ., *, (, or [, they become part of the compiled pattern. A variable holding error (fatal) would be compiled as a regex where the parentheses create a capture group, not a literal match for (fatal). If the value might contain special characters, use != instead.
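
A short sketch of the difference (the error string is invented):

shell s {
    let err = "error (fatal)"

    # As a regex, ${err} would match "error fatal" — the parentheses
    # become a grouping, not literal characters:
    #   !? ${err}

    # As a literal substring, the parentheses match themselves:
    != ${err}

    > echo "system nominal"
    <? ^system nominal$
    match_prompt()
}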

Clearing fail patterns

A bare !? or != with no payload clears the active fail pattern:

shell s {
    !? BOOM
    > echo safe
    <? ^safe$
    match_prompt()
    !?
    > echo BOOM
    <? ^BOOM$
    match_prompt()
}

Line 2 sets the fail pattern. Lines 3–5 work normally under its protection. Line 6 clears it — from this point on, there is no active fail pattern. Lines 7–9 can safely produce BOOM without triggering a failure.

Either !? or != can clear the pattern, regardless of which type was used to set it. They both clear the same single slot.
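For instance, a pattern set with != can be cleared by a bare !? (a minimal sketch):

shell s {
    != BOOM
    > echo safe
    <? ^safe$
    match_prompt()
    !?
    > echo BOOM
    <? ^BOOM$
    match_prompt()
}

The literal pattern set on line 2 occupies the single slot; the bare !? on line 6 empties that same slot.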

Scoping across function calls

Fail patterns follow the same scoping rule as timeouts: a function inherits the caller’s fail pattern, but any changes the function makes are reverted when it returns.

fn set_fail_pattern_inside() {
    !? BOOM
    > echo "in fn"
    <? ^in fn$
    match_prompt()
}

test "fail pattern set inside function does not persist in caller" {
    shell s {
        set_fail_pattern_inside()
        > echo "BOOM is safe now"
        <? ^BOOM is safe now$
        match_prompt()
    }
}

Inside set_fail_pattern_inside, the fail pattern BOOM is active — if the function’s own echo had produced BOOM, the test would fail. But after the function returns on line 10, the caller’s original state is restored (no active fail pattern in this case). The echo "BOOM is safe now" on line 11 is safe.

This means functions can set up their own fail patterns for internal safety without polluting the caller’s monitoring. It also means a caller’s fail pattern protects the function’s execution — the function inherits it automatically.

Best practices

Set fail patterns early

Place your !? or != as the first statement in a shell block, before any commands. This maximizes coverage — the pattern is active from the very first command output. A fail pattern set after several commands has no protection over the output those commands already produced (the immediate rescan will catch it if it’s in the buffer, but that turns a background monitor into a retroactive check, which is harder to reason about).

Use fail patterns for long-running services

Fail patterns are at their most valuable when testing long-running services that produce logs you don’t exhaustively match on. A web server, a database, a background worker — these emit output continuously, and you only match the specific lines that tell you the service is ready or responding correctly. A fail pattern like !? FATAL|panic|Segfault acts as a safety net across all that unmatched output. You focus your <= and <? operators on expected behavior; the fail pattern catches unexpected crashes in the background.

Don’t use fail patterns as assertions

Fail patterns are background monitors, not replacements for match operators. If you expect specific output, use <= or <? to match it. If you want to ensure something doesn’t appear, that’s what fail patterns are for. The distinction matters: match operators advance the output buffer cursor and participate in the test’s flow; fail patterns operate silently in the background and only surface when something goes wrong.

Combine multiple error strings with regex alternation

Since each shell has only one fail pattern slot, setting a second !? replaces the first. If you need to watch for multiple error patterns, combine them into a single regex using alternation:

shell s {
    !? ERROR|PANIC|FATAL|Segfault
    > start-my-service
    <? ready
    match_prompt()
}

Do not write:

shell s {
    !? ERROR
    !? PANIC
    !? FATAL
    > start-my-service
    <? ready
    match_prompt()
}

Only FATAL is active after line 4 — the first two patterns are gone.

Try it yourself

Write a test that starts a simulated service and monitors it for errors:

  1. Create a shell block and set a fail pattern that watches for ERROR, FATAL, and PANIC using a single regex alternation
  2. Use echo to simulate several lines of normal service output (startup messages, connection logs) and match key lines with <= or <?
  3. Clear the fail pattern, then echo a line containing ERROR — verify the test still passes because the pattern was cleared
  4. As a bonus: extract the fail pattern setup into a function. Verify that the pattern is active inside the function but does not persist after the function returns

Next: Effects and Dependencies — reusable test infrastructure with dependency graphs and overlay variables

Effects and Dependencies

Previous: Fail Patterns

The previous articles covered everything you need to test a single program in a single shell: sending commands, matching output, reusable functions, timeouts, and fail patterns. For a self-contained CLI tool, that is enough. But most real systems do not run in isolation.

Consider a web service that depends on a database and a message queue. Before you can test the service, the database needs to be running and migrated, the queue needs to be up, and maybe you want to tail the service’s logs in a separate shell with a fail pattern watching for crashes. Every test that exercises this service needs all of that infrastructure in place.

Without effects, you would set up everything manually in each test:

fn start_db() {
    > start-db --data-dir /tmp/test-db
    <? listening on port 5432
    match_prompt()
}

fn run_migrations() {
    > migrate --db localhost:5432
    <? migrations complete
    match_prompt()
}

test "user signup" {
    shell db {
        start_db()
        run_migrations()
    }
    shell svc {
        > start-my-service --db localhost:5432
        <? ready on :8080
    }
    shell client {
        > curl -s http://localhost:8080/signup -d 'user=alice'
        <? 201 Created
        match_prompt()
    }
}

test "user login" {
    shell db {
        start_db()
        run_migrations()
    }
    shell svc {
        > start-my-service --db localhost:5432
        <? ready on :8080
    }
    shell client {
        > curl -s http://localhost:8080/login -d 'user=alice'
        <? 200 OK
        match_prompt()
    }
}

Two tests, and the database and service startup is already duplicated. Functions reduce some repetition, but they run in the caller’s shell — they cannot spin up separate, independent services declaratively. And there is no way to share a running database across tests or control the teardown order when things go wrong.

Effects solve this. An effect is a named, reusable piece of test infrastructure. You define it once — what to start, how to verify it is ready — and each test declares what it needs. Relux resolves the dependency graph, starts everything in the right order, and tears it down when the test is done:

effect Db {
    expose service

    shell service {
        > start-db --data-dir /tmp/test-db
        <? listening on port 5432
        match_prompt()
    }
}

effect MigratedDb {
    start Db as db
    expose db.service as service

    shell migrations {
        > migrate --db localhost:5432
        <? migrations complete
        match_prompt()
    }
}

test "user signup" {
    start MigratedDb
    shell svc {
        > start-my-service --db localhost:5432
        <? ready on :8080
    }
    shell client {
        > curl -s http://localhost:8080/signup -d 'user=alice'
        <? 201 Created
        match_prompt()
    }
}

test "user login" {
    start MigratedDb
    shell svc {
        > start-my-service --db localhost:5432
        <? ready on :8080
    }
    shell client {
        > curl -s http://localhost:8080/login -d 'user=alice'
        <? 200 OK
        match_prompt()
    }
}

The infrastructure is defined once. Each test says start MigratedDb — Relux figures out that MigratedDb depends on Db, starts both in order, and hands the test a shell with a fully migrated database.

A particularly common pattern is monitoring: tail a log file with a fail pattern so any crash in the background aborts the test immediately. This combination is useful enough to deserve its own alias — call it a fail tail:

effect FailTail {
    expect FAILTAIL_TRIGGER, FAILTAIL_LOG
    expose tail

    shell tail {
        !? ${FAILTAIL_TRIGGER}
        > tail -f ${FAILTAIL_LOG}
    }
}

test "service handles load" {
    start FailTail {
        FAILTAIL_TRIGGER = "panic|error"
        FAILTAIL_LOG = "/var/log/service.log"
    }
    start Service as svc
    shell client {
        > curl http://localhost:8080/heavy-endpoint
        <? 200 OK
        match_prompt()
    }
}

The FailTail effect declares two required variables with expect and exposes its tail shell. It starts tailing the log and sets a fail pattern. If anything fatal appears in the log while the test runs its requests, the test fails on the spot. The { FAILTAIL_TRIGGER = ... } syntax passes configuration into the effect — we will cover these overlay variables later in this article. Without effects, you would duplicate this tail-and-fail-pattern setup in every test that exercises the service.

Defining an effect

An effect definition starts with the effect keyword, followed by a CamelCase name and a body in braces. Inside, expose declares which shells are part of the effect’s public interface:

effect Service {
    expose service

    shell service {
        > echo "service ready"
        <? ^service ready$
        match_prompt()
    }
}

The name must be CamelCase — this is how Relux distinguishes effects from functions, which are always snake_case. The expose service declaration means the service shell is available to whoever starts this effect. The shell block inside the body runs the setup: whatever commands are needed to get the service into a ready state.

The exposed shell is the bridge between the effect and the test. When a test starts this effect with an alias, it can access the service shell via dot-access — the same PTY session, in the same state it was left after setup. Environment variables set during setup, working directory changes, running processes — all persist into the test.

An effect body can contain expect declarations, expose declarations, let declarations, start statements, shell blocks, and a cleanup block. The shell blocks execute in order, and the shells named in expose declarations are accessible to callers. Shells that are not exposed are internal — they run during setup and are terminated when setup completes. Only exposed shells survive into the test body.
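As an illustration (the shell names here are hypothetical), an effect can use an internal shell for one-off setup work while exposing only the shell callers need:

effect Workspace {
    expose main

    shell prep {
        > mkdir -p /tmp/workspace
        match_ok()
    }

    shell main {
        > cd /tmp/workspace
        match_ok()
    }
}

The prep shell runs during setup and is terminated when setup completes; only main survives into the test body.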

Starting an effect

A test declares its infrastructure requirements with the start keyword:

test "effect sets up shell before test runs" {
    start Service as svc
    shell svc.service {
        > echo "test using effect shell"
        <? ^test using effect shell$
    }
}

The start Service as svc does two things: it ensures the Service effect runs before the test body, and it makes the effect’s exposed shells available under the alias svc. Inside the test, shell svc.service { ... } accesses the shell that the effect exposed — the same PTY session that the effect’s shell service block used during setup, with all the state from setup intact.

The as alias names the effect instance within your test. You access its exposed shells via dot-access: shell alias.shell_name { ... }. This is useful when you start multiple effects:

effect ServiceA {
    expose service

    shell service {
        > export SVC_ID=A
        match_ok()
        > echo "service A ready"
        <? ^service A ready$
    }
}

effect ServiceB {
    expose service

    shell service {
        > export SVC_ID=B
        match_ok()
        > echo "service B ready"
        <? ^service B ready$
    }
}

test "two effects both accessible via alias" {
    start ServiceA as a
    start ServiceB as b
    shell a.service {
        > echo $$SVC_ID
        <? ^A$
    }
    shell b.service {
        > echo $$SVC_ID
        <? ^B$
    }
}

Both effects expose a shell called service, but the test accesses them through different aliases — a.service and b.service. The alias disambiguates which effect instance you mean.

Bare start

Sometimes you need an effect for its side effects — creating files, setting up external state, or just running the service in the background — but do not need access to its shells. Use start without as:

effect Scaffold {
    expose setup

    shell setup {
        > touch /tmp/side-effect-marker
        match_ok()
    }
}

test "bare start runs effect but does not expose shell" {
    start Scaffold
    shell s {
        > test -f /tmp/side-effect-marker && echo "effect ran"
        <? ^effect ran$
    }
}

The effect runs — the file gets created — but the test cannot access the effect’s shell because there is no alias to qualify with. shell s creates a fresh local shell. Use bare start when you care about what the effect does, not the shells it leaves behind.

Dependencies between effects

Effects can depend on other effects using start inside the effect body. This lets you build layered infrastructure where each piece builds on what came before:

effect Db {
    expose service

    shell service {
        > export DB_STATUS=started
        match_ok()
    }
}

effect MigratedDb {
    start Db as db
    expose db.service as service

    shell service {
        > export MIG_STATUS=applied
        match_ok()
    }
}

effect SeededDb {
    start MigratedDb as db
    expose db.service as service

    shell service {
        > export SEED_STATUS=seeded
        match_ok()
    }
}

This creates a dependency chain: SeededDb starts MigratedDb, which starts Db. When a test starts SeededDb, Relux resolves the full chain and executes in topological order — dependencies first:

  1. Db runs, exposes its service shell
  2. MigratedDb runs in that same shell (via start Db as db), adds migration state
  3. SeededDb runs in the same shell again, adds seed data

Each effect re-exposes the service shell from its dependency using the qualified expose syntax expose db.service as service. This means whoever starts SeededDb can access the same shell that was built up through the entire chain.

test "transitive dependencies execute in order" {
    start SeededDb as db
    shell db.service {
        > echo $$DB_STATUS
        <? ^started$
        > echo $$MIG_STATUS
        <? ^applied$
        > echo $$SEED_STATUS
        <? ^seeded$
    }
}

The test only says start SeededDb — it does not need to know about Db or MigratedDb. Relux resolves the transitive dependencies automatically. All three environment variables are present because all three effects ran, in order, on the same shell.

Circular dependencies are caught at check time. If effect A needs B and B needs A, relux check reports the cycle before any test runs.
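A minimal cycle, for illustration (all sections of an effect body are optional, so these bodies contain only the start declarations):

effect A {
    start B as b
}

effect B {
    start A as a
}

Neither effect can start first: A waits for B, which waits for A. relux check reports this as a definition error before any test runs.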

Effect identity and deduplication

What happens when two tests — or two start statements in the same test — request the same effect? Relux does not run it twice. Each effect instance has an identity, and Relux deduplicates on it: when two start statements resolve to the same identity, the effect runs once and all references share the same instance.

For effects without overlay variables (covered in the next section), the identity is simply the effect name. Two start statements for the same effect share one instance — the effect runs once, and both aliases point to the same shell:

effect Counter {
    expose counter

    shell counter {
        > export COUNT=0
        match_ok()
        > COUNT=$$(($$COUNT + 1)) && echo $$COUNT
        <? ^1$
    }
}

test "same effect started twice shares one instance" {
    start Counter as c1
    start Counter as c2
    shell c1.counter {
        > echo $$COUNT
        <? ^1$
    }
    shell c2.counter {
        > echo $$COUNT
        <? ^1$
    }
}

Both c1 and c2 are aliases for the same effect instance. The Counter effect ran once — the counter was incremented to 1. If it had run twice, the count would be 2.

Overlay variables

So far, every effect has been a fixed recipe — the same setup every time. But what if you need two databases with different names, or the same service on different ports? The FailTail example in the introduction hinted at the answer: expect declares what the effect requires, and the { FAILTAIL_TRIGGER = ... } syntax at the start site provides it. These are overlay variables — key-value pairs passed at the start site that parameterize the effect:

effect Labeled {
    expect LABEL
    expose service

    shell service {
        > export SVC_LABEL=${LABEL}
        match_ok()
    }
}

test "different overlays create separate instances" {
    start Labeled as a {
        LABEL = "alpha"
    }
    start Labeled as b {
        LABEL = "beta"
    }
    shell a.service {
        > echo $$SVC_LABEL
        <? ^alpha$
    }
    shell b.service {
        > echo $$SVC_LABEL
        <? ^beta$
    }
}

The Labeled effect declares expect LABEL — a required variable that must be provided by the caller. Each start site provides its own value for LABEL, and Relux creates separate instances of the effect — one with LABEL = "alpha", another with LABEL = "beta". Each instance gets its own shell, its own setup run, its own state. If a caller forgets to pass a required variable, relux check reports the error before any test runs.

expect is a contract, not a sandbox. It declares which variables the effect requires — the ones the resolver validates. It does not prevent the effect from reading other variables. An effect always inherits the full parent environment: the base system environment, plus any variables set in the caller’s scope. The overlay adds to or overrides specific entries in that inherited environment. This means most configuration flows through naturally, and only the values that vary per-instance need to be listed in expect and passed via overlays.

This is where overlays connect to deduplication. The full identity of an effect instance is (effect name, evaluated overlay values). Same name with same overlay = shared instance. Same name with different overlay = separate instances. Two start Labeled as x { LABEL = "alpha" } with the same overlay value would share one instance, regardless of the alias name.
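To make the identity rule concrete, here is a sketch reusing the Labeled effect from above: two starts with identical overlay values share one instance, while a third with a different value gets its own.

test "identity is name plus overlay values" {
    start Labeled as x { LABEL = "alpha" }
    start Labeled as y { LABEL = "alpha" }   // same identity: shares x's instance
    start Labeled as z { LABEL = "beta" }    // different identity: separate instance
    shell x.service {
        > echo $$SVC_LABEL
        <? ^alpha$
    }
    shell z.service {
        > echo $$SVC_LABEL
        <? ^beta$
    }
}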

When the overlay key and the variable being passed have the same name, you can use the shorthand syntax — a bare key without = value:

let LABEL = "alpha"
start Labeled as a { LABEL }   // desugars to LABEL = LABEL

Overlay variables are the mechanism for reusing a single effect definition across different configurations — like the FailTail example from the introduction, where the trigger pattern and log path are passed in as overlays.

Section ordering

The parser enforces a fixed ordering of sections inside an effect body:

  1. expect — required overlay variables
  2. let — local bindings (can reference expected vars)
  3. start — sub-dependencies (overlay expressions can reference let-bound vars)
  4. expose — which shells are visible to callers
  5. shell blocks — setup logic
  6. cleanup — teardown (optional, at most one)

Each section is optional, but they must appear in this order. Writing a start before a let, or an expose before a start, is a parse error. Comments and blank lines are allowed anywhere between sections.

This ordering reflects the data flow: expects declare what is available, lets compute derived values, starts wire those values into sub-dependencies, and exposes declare the public interface after all shells and dependencies are established.
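Putting the ordering together, here is a sketch with hypothetical names that shows all six sections in their required order, reusing the Db effect from earlier:

effect Ordered {
    expect PORT
    let data_dir = "/tmp/svc-${PORT}"
    start Db as db
    expose db.service as service

    shell service {
        > start-my-service --port ${PORT} --data ${data_dir}
        <? listening
    }
    cleanup {
        > rm -rf ${data_dir}
    }
}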

Best practices

Set fail patterns early in effects

Effects that start long-running services should set a fail pattern before the startup command, just like in a regular shell block. This maximizes coverage — any crash output during startup or during the test body triggers an immediate failure:

effect Service {
    expose service

    shell service {
        !? FATAL|ERROR|panic
        > start-my-service --foreground
        <? listening on port 8080
    }
}

The fail pattern is active from the first line. If the service crashes during startup, the fail pattern catches it before the readiness match even runs.

Deduplication and shared state

Because deduplication means two aliases can point to the same shell, mutations through one alias are visible through the other. This is by design — it is how effects like the database chain work, where each layer builds on the state left by the previous one. But it means you should be aware: if two unrelated parts of a test both alias the same effect instance, they share a single PTY session. Commands sent through one alias affect the shell the other alias sees.

If you need truly independent instances, give them different overlay values — even a dummy key is enough to create separate identities:

start MyEffect as a { INSTANCE = "1" }
start MyEffect as b { INSTANCE = "2" }

Try it yourself

Write a two-effect dependency chain that simulates a database setup:

  1. Define an effect Db that exposes a shell service, sets an environment variable DB_STATUS=running, and echoes a readiness message
  2. Define an effect MigratedDb that starts Db, re-exposes its shell, and sets MIG_STATUS=done
  3. Write a test that starts MigratedDb and verifies both variables are present via dot-access
  4. As a bonus: use overlay variables to create two database instances with different DB_NAME values, and verify each instance has its own name

Next: Pure Functions — functions that compute values without touching a shell

Pure Functions

Previous: Effects and Dependencies

In the functions article, you learned to extract reusable test logic into named functions. Those functions work well for sequences of shell operations — sending commands, matching output, consuming prompts. But they have a limitation: they can only be called inside a shell block, because their bodies contain shell operators that need an active PTY session to run in.

That restriction becomes frustrating when you need to compute a value before a shell block exists. Suppose you write a helper that builds a URL:

fn format_url(host, port) {
    "${host}:${port}/api"
}

You want to use it in a test-scope let to prepare a configuration value before any shell starts:

test "connect to API" {
    let url = format_url("localhost", "8080")
    shell s {
        > curl ${url}
        <? ^200 OK$
        match_prompt()
    }
}

This does not work. format_url is a regular function, and regular functions require a shell context. The let on line 2 sits outside any shell block, so Relux has no shell to execute the function in.

The same problem appears in other places. You cannot call a regular function in an effect-scope let, and you cannot use one in an overlay value for a start declaration. Anywhere outside a shell block, regular functions are off limits.

Pure functions solve this. Add the pure keyword before fn, and the function becomes shell-independent — callable from anywhere:

pure fn format_url(host, port) {
    "${host}:${port}/api"
}

test "connect to API" {
    let url = format_url("localhost", "8080")
    shell s {
        > curl ${url}
        <? ^200 OK$
        match_prompt()
    }
}

The test-scope let now works. The pure keyword tells Relux that this function operates on strings only and never touches a shell. In exchange for that restriction, it can be called from any expression context in the language.

Several built-in functions are also available in pure context:

Function                    Description
trim(s)                     Strip leading/trailing whitespace
upper(s)                    Convert to uppercase
lower(s)                    Convert to lowercase
replace(s, from, to)        Replace all occurrences
split(s, sep, idx)          Split and return the Nth element
len(s)                      String length
uuid()                      Generate a UUID
rand(n) / rand(n, mode)     Generate random values
available_port()            Find a free TCP port
which(cmd)                  Locate a command on PATH
default(a, b)               Return a if non-empty, else b
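A small sketch combining a few of these built-ins (the helper name is hypothetical):

pure fn db_host(raw) {
    let host = default(trim(raw), "localhost")
    lower(host)
}

db_host("  DB.Example.COM ") returns "db.example.com"; db_host("") falls back through default() and returns "localhost".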

The pure fn syntax

A pure function definition looks like a regular function with the pure keyword prepended:

pure fn tag(key, value) {
    "${key}:${value}"
}

The body can contain:

  • String literals with variable interpolation: "${key}:${value}"
  • Variable references: a bare variable name as an expression
  • let declarations: let full = "${first} ${last}"
  • Variable reassignment: x = upper(x)
  • Calls to other pure functions and pure built-in functions

The return value is the last expression in the body, the same rule as regular functions. A function ending with a let returns the assigned value. A function ending with a string literal returns that string.

Here is a pure function that uses let for an intermediate value:

pure fn build_greeting(first, last) {
    let full = "${first} ${last}"
    upper(full)
}

build_greeting("jane", "doe") returns "JANE DOE". The let binds the concatenated name, then upper() — a pure built-in function — transforms it to uppercase. The result of upper(full) is the last expression, so it becomes the return value.

What pure functions cannot do

The trade-off for calling pure functions anywhere is that their bodies cannot interact with a shell. Every shell operator is forbidden:

  • Send operators: >, =>
  • Match operators: <=, <?
  • Timeout operators: ~, @
  • Fail pattern operators: !?, !=

If you try to use a shell operator inside a pure function, relux check reports the error:

pure fn bad() {
    > echo "side effect"
}
error: shell operator cannot be used in a pure function

Pure functions also cannot call regular (impure) functions or impure built-in functions. Calling a function that needs a shell from inside a function that has no shell makes no sense, so Relux rejects it:

fn impure_helper() {
    > echo "side effect"
    <? ^side effect$
}

pure fn bad() {
    impure_helper()
}
error: impure_helper/0 cannot be used in a pure function

The same applies to impure built-in functions like match_prompt(), match_ok(), sleep(), log(), and the ctrl_* family — they all require a shell:

pure fn bad() {
    match_prompt()
}
error: match_prompt/0 cannot be used in a pure function

These checks happen at compile time. You do not need to run the test to discover the mistake — relux check catches it before anything executes.

Where you can call pure functions

The key advantage of pure functions is that they work in every expression context, not just inside shell blocks. Here is a summary of all the places you can call them:

Inside a shell block, just like regular functions:

pure fn greet(name) {
    "hello ${name}"
}

test "call pure function in shell" {
    shell s {
        let result = greet("world")
        > echo ${result}
        <? ^hello world$
    }
}

In a test-scope let, before any shell block:

pure fn tag(key, value) {
    "${key}:${value}"
}

test "pure function in test-scope let" {
    let label = tag("env", "test")
    shell s {
        > echo ${label}
        <? ^env:test$
    }
}

In an effect-scope let, to compute values during effect setup. The let sits outside the shell block, so only pure functions can be called here. Using the same tag function from above:

effect Config {
    expose service

    let label = tag("env", "production")
    shell service {
        > echo ${label}
        <? ^env:production$
        match_ok()
    }
}

In overlay values for start declarations — overlays are evaluated outside any shell, so pure functions are the only way to compute them dynamically:

pure fn make_label(name) {
    "label-${name}"
}

effect Labeled {
    expect LABEL
    expose service

    shell service {
        > echo $LABEL
        <? ^${LABEL}$
        match_ok()
    }
}

test "pure function in overlay" {
    start Labeled as l {
        LABEL = make_label("production")
    }
    shell l.service {
        > echo $LABEL
        <? ^label-production$
    }
}

In other pure function bodies:

pure fn wrap(s) {
    "[${s}]"
}

pure fn double_wrap(s) {
    wrap(wrap(s))
}

double_wrap("hi") returns "[[hi]]". Pure functions compose naturally — each call evaluates and returns a string, which becomes the argument to the next call.

Condition markers can also call pure functions, but that is covered in a later article.

What “pure” means in Relux

If you are familiar with functional programming, the word “pure” might suggest a function that is deterministic and free of side effects — calling it with the same arguments always produces the same result.

Relux uses a narrower definition. Several pure built-in functions violate the functional programming definition: uuid() and rand() return different values on each call, and available_port() depends on system state. They are non-deterministic, yet Relux considers them pure.

In Relux, “pure” means shell-independent. A pure function does not read from or write to any PTY session. It operates on string values only and does not require an output buffer. This is a narrower guarantee than functional purity, but it is the guarantee that matters — it determines where a function can be called.

If a function does not use shell operators, it can be pure fn. If it sends commands or matches output, it must be a regular fn. That is the only distinction.

Best practices

Prefer pure fn when a function has no shell operators

You might write a helper as a regular function out of habit, because you first use it inside a shell block:

fn format_url(host, port) {
    "${host}:${port}/api"
}

This works fine in shell context. But later, when you want to use the same helper in a test-scope let or an overlay value, you discover it does not work — regular functions require a shell. You then have to go back and add the pure keyword.

Save yourself the trip: if a function body contains no shell operators, define it as pure fn from the start. It works in all the same places a regular function works, plus everywhere else.

Extract complex interpolation into a pure function

When string interpolation gets deeply nested, the intent can become hard to read:

test "nested interpolation" {
    let host = "localhost"
    let port = "5432"
    let db = "myapp"
    shell s {
        > psql "postgres://${host}:${port}/${db}?sslmode=disable"
        <? ^connected$
        match_prompt()
    }
}

This is manageable, but as the string grows — multiple parameters, conditional segments, repeated patterns — readability suffers. A pure function gives the construction a name and keeps the test body focused on intent:

pure fn pg_url(host, port, db) {
    "postgres://${host}:${port}/${db}?sslmode=disable"
}

test "extracted into pure function" {
    let host = "localhost"
    let port = "5432"
    let db = "myapp"
    shell s {
        let url = pg_url(host, port, db)
        > psql "${url}"
        <? ^connected$
        match_prompt()
    }
}

Try it yourself

Write a pure function format_config(app, env, port) that returns a structured string like "app=myapp env=prod port=8080".

  1. Call it from a test-scope let and verify the result by echoing it in a shell block
  2. Call it directly inside a shell block and use the return value in a send
  3. Write a second pure function format_config_upper(app, env, port) that calls format_config and passes the result through upper(). Verify it returns "APP=MYAPP ENV=PROD PORT=8080".

Next: Cleanup — teardown blocks for effects and tests

Cleanup

Previous: Pure Functions

The effects article introduced effects as reusable infrastructure — start a database, launch a service, tail a log file. Relux handles the lifecycle of those services automatically: when a test ends, it terminates all effect shells, which kills any processes running in them. You do not need to stop services yourself.

But services are not the only thing effects and tests create. A database effect might generate a data directory. A build effect might produce temporary files. A test might create artifacts that should not survive past the run. These leftovers are not tied to any shell — killing the shell does not clean them up.

Cleanup blocks solve this. They let you attach teardown commands to an effect or a test — commands that run after the test completes, regardless of whether it passed or failed. Their job is to remove temporary files, collect logs into an artifacts directory, or undo any filesystem side effects that setup left behind.

Here is an effect that creates a temporary working directory during setup and removes it during cleanup, using ${__RELUX_RUN_ID} to ensure the directory is unique per test run:

effect TempWorkspace {
    expose workspace

    shell workspace {
        > mkdir -p /tmp/relux-${__RELUX_RUN_ID}
        match_ok()
        > cd /tmp/relux-${__RELUX_RUN_ID}
        match_ok()
    }
    cleanup {
        > rm -rf /tmp/relux-${__RELUX_RUN_ID}
    }
}

And here is a test with its own cleanup block:

test "test-level cleanup removes artifacts" {
    shell s {
        > touch /tmp/test-artifact-${__RELUX_RUN_ID}
        match_ok()
        > test -f /tmp/test-artifact-${__RELUX_RUN_ID} && echo "exists"
        <? ^exists$
    }
    cleanup {
        > rm -f /tmp/test-artifact-${__RELUX_RUN_ID}
    }
}

When the test finishes, Relux terminates all test and effect shells first — stopping any running processes — then spawns fresh cleanup shells to run the teardown commands. The syntax and behavior are the same in both cases.

The cleanup block

A cleanup block goes inside an effect or test definition, after the shell blocks. It starts with the cleanup keyword followed by a body in braces. Each effect or test can have at most one cleanup block.

effect WithCleanup {
    expose service

    shell service {
        > touch /tmp/cleanup-test-marker
        match_ok()
    }
    cleanup {
        > rm -f /tmp/cleanup-test-marker
    }
}

Here, the effect creates a marker file during setup and removes it during cleanup.

A fresh shell

Cleanup does not run in the effect’s shells or the test’s shells. Relux spawns a new, implicit shell dedicated to cleanup. This is a deliberate design choice: by the time cleanup runs, the original shells have already been terminated. Even if they were still around, they might be in an unpredictable state — a command may have crashed, a prompt may be missing, the buffer may contain unexpected output. A fresh shell sidesteps all of that. Cleanup starts from a clean slate every time.

This means you cannot rely on working directory changes or any shell-level state from the original shells. However, cleanup does have access to variables declared at the effect or test level with let, overlay variables (for effects), and environment variables. If cleanup needs to know a path or a port number, declare it as a top-level let variable so both the shell blocks and the cleanup block can reference it.

Allowed operations

Cleanup blocks support a restricted set of operations:

  • Send (>) — send a command to the cleanup shell
  • Raw send (=>) — send input without a trailing newline
  • Let (let) — declare a variable
  • Assignment — reassign an existing variable

That is the complete list. These operations are enough to run teardown commands and organize them with local variables.
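
Put together, a cleanup body exercising these operations might look like this (a sketch; the paths are illustrative):

```
cleanup {
    let scratch = "/tmp/relux-scratch-${__RELUX_RUN_ID}"
    > rm -rf ${scratch}
    scratch = "/tmp/relux-cache-${__RELUX_RUN_ID}"
    > rm -rf ${scratch}
}
```

A variable is declared with let, used in a send, reassigned, and used again — nothing here can block on a match, which is exactly the point.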

What you cannot do

Cleanup blocks do not support match operators (<=, <?), function calls, timeouts, fail patterns, or buffer resets. This applies to both effect and test cleanup blocks. Relux enforces these restrictions at parse time — relux check rejects any cleanup block that uses a disallowed operation.

The reason is pragmatic: cleanup exists to run teardown after something has already gone wrong — a test failed, a timeout fired, a match never arrived. If cleanup itself could fail on a match, Relux would need to handle a failure during failure recovery. That is the classic panic-on-unwind problem: the teardown path must not introduce new failures, or the system becomes unpredictable. Restricting cleanup to fire-and-forget operations keeps the teardown path simple and reliable.

Best-effort execution

Cleanup always runs, whether the test passed, failed, or timed out. And cleanup errors never change the test result. If a cleanup command fails — the file does not exist, the directory is already gone, the shell cannot start — Relux logs the issue but reports the test result based on the test body alone.

This means you do not need to worry about cleanup failures masking real test results or causing false negatives. A flaky teardown command will not turn a passing test red.

Execution order

The full lifecycle of a test run is:

  1. Effect setup (topological order)
  2. Test body
  3. Shell termination — all test shells, then all effect shells
  4. Test cleanup (if present)
  5. Effect cleanup (reverse topological order)

All shells are terminated before any cleanup runs. This guarantees that every process started during setup or the test body is dead before cleanup begins — cleanup only deals with what those processes left behind on the filesystem.

Test cleanup runs before effect cleanup. This way, if the test’s teardown depends on files that effects created, those files are still present. Effect cleanup then unwinds in reverse topological order — dependents first, dependencies last.

Consider a chain of effects where each layer creates files that later layers depend on:

effect BuildApp {
    expose artifact

    shell artifact {
        > mkdir -p /tmp/build && echo "compiled" > /tmp/build/app.bin
        match_ok()
    }
    cleanup {
        > rm -rf /tmp/build
    }
}

effect GenerateConfig {
    start BuildApp
    expose configuration

    shell configuration {
        > echo "db=localhost" > /tmp/build/config.ini
        match_ok()
    }
    cleanup {
        > rm -f /tmp/build/config.ini
    }
}

effect DeployLocal {
    start GenerateConfig
    expose deployment

    shell deployment {
        > mkdir -p /tmp/deploy && cp /tmp/build/app.bin /tmp/deploy/ && cp /tmp/build/config.ini /tmp/deploy/
        match_ok()
    }
    cleanup {
        > rm -rf /tmp/deploy
    }
}

Setup runs in dependency order:

  1. BuildApp — create the build directory and binary
  2. GenerateConfig — write a config file into the build directory
  3. DeployLocal — copy artifacts to the deploy directory

Cleanup runs in the opposite direction:

  1. DeployLocal cleanup — remove the deploy directory
  2. GenerateConfig cleanup — remove the config file from the build directory
  3. BuildApp cleanup — remove the entire build directory

This ordering matters. If BuildApp cleaned up first, it would delete /tmp/build — including the config file that GenerateConfig’s cleanup is about to target. Reverse topological order guarantees that each cleanup step runs while its dependencies’ files still exist.

Overlay variables in cleanup

As described above, cleanup can see top-level let variables and environment variables. For effects, there is an additional mechanism: overlay variables from the start site are also available in cleanup. This is useful when the cleanup needs to act on configuration that varies per instance:

effect TempDir {
    expect DIR
    expose workspace

    shell workspace {
        > mkdir -p ${DIR}
        match_ok()
        > cd ${DIR}
        match_ok()
    }
    cleanup {
        > rm -rf ${DIR}
    }
}

test "temporary directory is cleaned up" {
    start TempDir as t {
        DIR = "/tmp/relux-test-workspace"
    }
    shell t.workspace {
        > touch testfile.txt
        match_ok()
    }
}

The TempDir effect declares expect DIR and uses ${DIR} in both setup and cleanup. The value comes from the overlay at the start site. During cleanup, Relux interpolates ${DIR} to /tmp/relux-test-workspace, so the rm -rf targets the right directory.

Overlays are the mechanism for making a single effect definition work across different configurations — the same TempDir effect with different DIR values creates and cleans up different directories. Test-level cleanup does not have overlay variables, but it can use top-level let variables and environment variables instead.

Best practices

Do not use cleanup to stop services

It is natural to think of cleanup as the place to stop a database or kill a service you started during setup. But Relux already handles this: when a test ends, it terminates all effect and test shells, which kills any processes running in them. Services started in a shell block die automatically with the shell — they are children of the PTY, so when Relux terminates the shell, the process goes with it. Even if Relux itself is killed, the OS cleans up the PTY and its children.

Using cleanup to stop services is actually worse than relying on shell termination. Cleanup runs in a separate shell — it has no connection to the process running in the effect’s shell. If Relux crashes or is killed, cleanup never runs, and any service you expected cleanup to stop is left orphaned.

For the same reason, avoid starting daemonized or background services (processes that detach from the shell) during setup. A daemonized process is no longer a child of the PTY — it survives shell termination. If Relux is killed or terminated abnormally, neither shell termination nor cleanup can reach it, and it stays running indefinitely. Always run services in the foreground so they remain tied to the shell’s lifecycle.

Reserve cleanup for things that shell termination does not handle: removing files, cleaning up directories, collecting logs, or any other filesystem side effects that outlive the shell.
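
For example, here is an effect whose cleanup collects the service log into a per-run artifacts directory rather than stopping the service (a sketch; run-service and the paths are hypothetical):

```
effect LoggedService {
    expose service

    shell service {
        > ./run-service --log /tmp/service-${__RELUX_RUN_ID}.log
        <? ^listening
    }
    cleanup {
        > mkdir -p /tmp/artifacts-${__RELUX_RUN_ID}
        > mv /tmp/service-${__RELUX_RUN_ID}.log /tmp/artifacts-${__RELUX_RUN_ID}/
    }
}
```

The service itself dies with the shell; cleanup only deals with the log file it left behind.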

Keep cleanup self-contained

Cleanup can see top-level let variables, overlay variables (for effects), and environment variables — but it cannot see variables declared inside shell blocks or call functions. Shell-level let bindings and regex captures from the test body are not available.

Plan your cleanup around top-level variables. If a path or identifier is needed in both setup and cleanup, declare it with let at the effect or test level rather than inside a shell block.

Make cleanup idempotent

Cleanup runs regardless of whether setup completed successfully. If an effect’s shell block fails halfway through — the database started but the migration crashed — cleanup still runs. This means cleanup commands may encounter a partially initialized state: a file that was never created, a process that was never started, a directory that is already empty.

Write cleanup commands defensively. Assume nothing about what actually happened during setup — cleanup should be safe to run in any state, including when setup did nothing at all.
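
A cleanup block written in this defensive style (a sketch using standard rm flags that tolerate missing targets; the trailing shell comments are sent to the shell along with the commands):

```
cleanup {
    > rm -f /tmp/migration-${__RELUX_RUN_ID}.log   # -f: no error if setup never wrote the log
    > rm -rf /tmp/data-${__RELUX_RUN_ID}           # -rf: no error if the directory is already gone
}
```

Running this twice in a row succeeds both times — the definition of idempotent teardown.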

Try it yourself

Take the two-effect dependency chain from the previous article’s challenge and add cleanup:

  1. Add a cleanup block to StartDb that removes the data directory it created during setup
  2. Add a cleanup block to Migrate that removes any migration log files
  3. Add a test-level cleanup block that removes any test-specific temporary files
  4. Think about the execution order: which cleanup runs first? Verify your understanding matches the reverse topological rule

Next: Modules and Imports — organizing a multi-file test suite with shared effects and functions

Modules and Imports

Previous: Cleanup

The previous articles built up everything you need to test programs thoroughly, but every example so far has lived in a single file. As a test suite grows, you end up with the same helper functions and effect definitions duplicated across test files. Change the startup sequence for a service, and you are editing the same code in five different places.

Relux solves this with modules and imports. Every .relux file is a module. You put shared code in the lib/ directory, and test files import what they need.

Here is a library module at lib/utils/greeter.relux:

fn greet(name) {
    > echo "hello ${name}"
    <? ^hello ${name}$
    match_prompt()
}

fn farewell(name) {
    > echo "goodbye ${name}"
    <? ^goodbye ${name}$
    match_prompt()
}

effect Greeter {
    expose service

    shell service {
        > export GREETER_STATUS=running
        match_ok()
    }
}

Two functions and an effect. Now a test file can pull in what it needs:

import utils/greeter

test "say hello" {
    shell s {
        greet("alice")
    }
}

The import line brings everything from the library module into scope. Change greet in one place, and every test that imports it picks up the change.

Every file is a module

A .relux file is automatically a module. There is no special declaration — the file’s path relative to the project root determines its module identity.

A file at lib/utils/greeter.relux has the module path utils/greeter. The .relux extension and the lib/ prefix are stripped; what remains is the module path you use in import statements.
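
A few concrete mappings (the second file is hypothetical, for illustration):

```
lib/utils/greeter.relux      →  import utils/greeter
lib/services/database.relux  →  import services/database
```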

Project structure

A Relux project has two top-level directories under the project root (where Relux.toml lives):

  • lib/ — shared modules containing functions, pure functions, and effects. These are never run directly as tests.
  • tests/ — test files. Each .relux file here is discovered and executed by relux run.

Import paths always resolve from lib/. When you write import utils/greeter, Relux looks for lib/utils/greeter.relux. It does not matter where the importing file is — a test at tests/deep/nested/test.relux still imports utils/greeter the same way. This keeps import statements consistent across the entire project: the same module path always means the same file.

Selective imports

The most explicit form of import names exactly which items you want from a module:

import utils/greeter { greet }

This brings the greet function into scope. The module utils/greeter may export other things — in this case it also defines farewell and the Greeter effect — but only greet is available in this file. Calling farewell would be an error.

You can import multiple items from the same module by listing them — both functions and effects:

import utils/greeter { greet, Greeter }

This pulls in the greet function and the Greeter effect. Functions use snake_case names, effects use CamelCase — the naming convention is how Relux (and the reader) can tell them apart at a glance.

Trailing commas are allowed:

import utils/greeter {
    greet,
    Greeter,
}

Wildcard imports

If you want everything a module exports, leave out the braces:

import utils/greeter

This brings all exported names into scope — both greet and farewell, as well as the Greeter effect in this case:

import utils/greeter

test "wildcard import makes all functions available" {
    start Greeter
    shell s {
        greet("world")
        farewell("world")
    }
}

Wildcard imports are convenient for small, focused modules where you know you want everything. For larger modules, selective imports make the dependencies clearer.

Aliases

Sometimes an imported name collides with something in your file, or you simply want a shorter or more descriptive name. The as keyword renames an import:

import utils/greeter { greet as hello, farewell as bye }

Now hello and bye are the callable names — the originals greet and farewell are not in scope. Aliases work for effects too:

import utils/greeter { Greeter as Svc }

test "aliased effect" {
    start Svc as svc
    shell svc.service {
        > echo $$GREETER_STATUS
        <? ^running$
    }
}

There is one rule: aliases must preserve casing kind. A snake_case function must be aliased to another snake_case name. A CamelCase effect must be aliased to another CamelCase name. Aliasing greet as Hello or Greeter as greeter is a compile error — the casing convention is structural, not cosmetic.

What gets exported

A module exports everything it defines:

  • All fn definitions
  • All pure fn definitions
  • All effect definitions

Test definitions are not exported. A test block is local to the file it appears in — you cannot import a test from another module.

There is no visibility modifier. If a function exists in a module, it is exported. If you do not want something exported, the only option is to not put it in a shared lib/ module — though in practice this is rarely a concern. Functions in library modules are there to be shared.

Try it yourself

  1. Create a library module at lib/helpers.relux with two functions: check_running() that echoes “running” and matches it, and check_stopped() that echoes “stopped” and matches it. Add an effect StartWorker that exposes a shell and sets an environment variable WORKER_STATUS=active.

  2. Write a test file tests/selective_test.relux that selectively imports only check_running and StartWorker. Write one test that calls check_running() in a shell, and another that needs StartWorker and verifies the environment variable.

  3. Write a second test file tests/wildcard_test.relux that uses a wildcard import from the same module. Write a test that uses both check_running() and check_stopped().

  4. In a third test file, import check_running as verify_up and StartWorker as Worker. Write a test that uses both under their aliased names.


Next: Condition Markers — conditionally skipping or running tests based on environment

Condition Markers

Previous: Modules and Imports

Integration tests exercise real systems, and real systems have prerequisites. Some tests only make sense on a particular operating system. Some require a tool like docker or psql to be installed. Some are too slow to run locally on every test run, and belong exclusively to CI.

Without a way to express these assumptions, a missing prerequisite looks the same as a broken test. If a test needs docker and docker is not installed, the test fails — and the person reading the results cannot tell whether the system under test is broken or the machine simply was not set up for that test. The failure is ambiguous and unhelpful.

Condition markers solve this in two ways. First, they let you categorize tests by environment — this group runs on macOS, that group runs on Linux, these long-running tests only run in CI. Second, they let you guard against missing preconditions — if the required tool is not available, the test is skipped with an informative reason instead of failing with a confusing error.

Here is a test that only runs when docker is available:

# skip unless which("docker")
test "build container image" {
    shell s {
        > docker build -t myapp .
        <? ^Successfully built
        match_prompt()
    }
}

When docker is in PATH, the test runs normally. When it is not, Relux skips the test and reports exactly why — no shell is spawned, no confusing failure appears.

And here is a test that only runs in CI:

# run if "${CI}"
test "full regression suite" {
    shell s {
        > ./run-all-benchmarks.sh
        <? ^All benchmarks passed$
        match_prompt()
    }
}

Locally, where CI is not set, this test is skipped and reported as such. On the build server, it runs.

Unconditional markers

The simplest form of a marker has no condition at all. There are three kinds:

# skip unconditionally skips the test. This is useful for temporarily disabling a test without deleting or commenting it out:

# skip
test "work in progress" {
    shell s {
        > echo hello
        <? ^hello$
    }
}

The test appears in the results as skipped. When you are ready to re-enable it, remove the marker.

# flaky marks a test as known-unstable. When [flaky].max_retries is set in Relux.toml, a failing flaky test is retried from scratch with exponentially increasing tolerance timeouts. With the default max_retries = 0, the marker is documentary only and the test runs normally:

# flaky
test "timing sensitive" {
    shell s {
        > echo hello
        <? ^hello$
    }
}

Configure retry behavior in Relux.toml:

[flaky]
max_retries = 3           # retry up to 3 times on failure
timeout_multiplier = 1.5   # tolerance timeouts scale by 1.5^retry

Or override from the command line:

relux run --flaky-retries 3 --flaky-multiplier 2.0

Each retry runs the test from scratch — fresh shell, fresh effects. Tolerance timeouts (~) are scaled by multiplier^(retry-1); assertion timeouts (@) are never scaled. If any retry passes, the test is reported as passed. If all retries are exhausted, it is reported as failed.
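
As a worked example of the scaling rule, assume a test with a shell-scoped ~10s tolerance timeout and this configuration:

```
[flaky]
max_retries = 3
timeout_multiplier = 1.5

# Tolerance timeout per attempt, scaled by multiplier^(retry-1):
#   retry 1: 10s × 1.5^0 = 10s
#   retry 2: 10s × 1.5^1 = 15s
#   retry 3: 10s × 1.5^2 = 22.5s
# Assertion (@) timeouts are never scaled, on any retry.
```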

# run without a condition is a no-op — the test runs as it normally would. On its own it has no effect, but it becomes useful with a condition attached, as shown below.

Conditional markers

A condition adds an if or unless modifier and an expression to the marker. The expression is evaluated before any shells are spawned.

Truthiness checks

The simplest conditional form checks whether an environment variable is set and non-empty:

# skip if "${MY_VAR}"
test "skipped when MY_VAR is set" {
    shell s {
        > echo hello
        <? hello
    }
}

The truthiness rule is straightforward: an empty string or an unset variable is false (falsy). Any non-empty string is true (truthy).

The unless modifier inverts the check:

# skip unless "${CI}"
test "only runs in CI" {
    shell s {
        > echo hello
        <? hello
    }
}

This skips the test unless CI is set — the common pattern for CI-only tests.

The run kind works the other way around. Where skip says “do not run this test when the condition is met”, run says “only run this test when the condition is met”:

# run if "${MY_VAR}"
test "only runs when MY_VAR is set" {
    shell s {
        > echo hello
        <? hello
    }
}

And its inverse:

# run unless "${MY_VAR}"
test "runs when MY_VAR is not set" {
    shell s {
        > echo hello
        <? hello
    }
}

Note that # run if "${X}" and # skip unless "${X}" are logically equivalent — both skip the test when X is unset. The choice between them is about readability, which the best practices section below discusses.

Equality comparisons

When truthiness is not enough, you can compare a variable against a specific value using =:

# skip if "${MY_VAR}" = "yes"
test "skipped when MY_VAR is exactly yes" {
    shell s {
        > echo hello
        <? ^hello$
    }
}

Both sides of the = support variable interpolation. You can build compound values:

# run if "${HOST}:${PORT}" = "localhost:8080"
test "only on local dev server" {
    shell s {
        > curl localhost:8080/health
        <? ^ok$
        match_prompt()
    }
}

Numbers are allowed too — they are compared as strings:

# run if "${COUNT}" = 0
test "only when count is zero" {
    shell s {
        > echo "starting fresh"
        <? ^starting fresh$
    }
}

Regex matching

For more flexible matching, the ? operator tests a value against a regex pattern:

# skip unless "${MY_VAR}" ? ^(yes|true)$
test "requires MY_VAR to be yes or true" {
    shell s {
        > echo hello
        <? ^hello$
    }
}

The regex pattern supports variable interpolation as well:

# skip unless "${ARCH}" ? ^(x86_64|aarch64)$
test "only on 64-bit architectures" {
    shell s {
        > echo hello
        <? ^hello$
    }
}

Pure function calls in markers

Marker expressions are not limited to variable interpolation. You can call pure functions to compute values or perform checks. This is where markers become truly powerful for asserting environment preconditions.

The built-in function which() checks whether an executable exists in PATH — it returns the path if found, or an empty string (falsy) if not:

# skip unless which("docker")
test "needs docker" {
    shell s {
        > docker ps
        <? ^CONTAINER ID
        match_prompt()
    }
}

You can also define your own pure functions for more complex checks:

pure fn always_true() {
    "yes"
}

# skip if always_true()
test "always skipped by custom function" {
    shell s {
        > echo hello
        <? ^hello$
    }
}

Pure functions combine naturally with regex matching. Here, normalize lowercases the value before the comparison:

pure fn normalize(val) {
    lower(val)
}

# skip unless normalize("${TARGET_OS}") ? ^(linux|darwin)$
test "only on Linux or macOS" {
    shell s {
        > echo hello
        <? ^hello$
    }
}

The function argument uses variable interpolation, and the regex tests the lowercased result. This handles cases where the environment variable might be "Linux", "LINUX", or "linux".

Multiple markers

A test or effect can carry more than one marker:

# skip unless "${CI}"
# skip if "${SKIP_ME}"
test "CI only, unless explicitly skipped" {
    shell s {
        > echo hello
        <? ^hello$
    }
}

The exact combination semantics for multiple markers are not yet established and are the subject of an upcoming RFC. For now, keep things simple: use a single marker per test or effect when possible, and use regex patterns to express complex conditions within one marker.

Markers on functions

Markers work on functions and pure functions too:

# skip unless which("jq")
fn parse_json(input) {
    > echo '${input}' | jq -r '.name'
    <? ^.+$
    let name = $0
    match_prompt()
    name
}

test "extract name from JSON" {
    shell s {
        let name = parse_json('{"name": "alice"}')
        > echo "${name}"
        <? ^alice$
        match_prompt()
    }
}

The key behavior: when a function is skipped, all tests that call it are also skipped. In the example above, if jq is not installed, the parse_json function is skipped, which propagates to every test that calls it. The test is reported as skipped — no shell is spawned, no confusing failure appears. This works the same way for both fn and pure fn.

Markers on effects

Markers work on effects too. This is particularly useful for effects that provision heavy infrastructure:

# skip if "${SKIP_EFFECT}"
effect Guarded {
    expose service

    shell service {
        > echo "effect ran"
        <? ^effect ran$
    }
}

test "depends on conditionally skipped effect" {
    start Guarded as g
    shell g.service {
        > echo "test body ran"
        <? ^test body ran$
    }
}

There is one important rule: when an effect is skipped, all tests that depend on it are also skipped. This cascades through the dependency graph. If effect A is skipped and test X needs A, test X is skipped too — even if test X has no markers of its own. The reasoning is straightforward: if the effect could not set up the infrastructure the test requires, running the test would be meaningless.

Evaluation timing and scope

Markers evaluate before any shells are spawned. For test-level markers, this happens before the test’s effects are even set up. For effect-level markers, it happens before the effect’s own shells are created.

Because of this early evaluation, marker expressions can only see environment variables — the base environment that Relux inherits from the system plus any variables set in Relux.toml. Variables declared with let inside tests or effects do not exist yet at marker evaluation time. This is why marker syntax uses "${VAR}" to reference the environment, the same interpolation syntax you already know.

Best practices

Markers assert, effects provision

The distinction is:

  • Markers assert what the environment already has — an installed binary, a particular OS, a running CI server. These are things outside the test’s control.
  • Effects provision what the test needs — starting a service, creating a temp directory, seeding a database. These are things the test can set up and tear down.

If you can set it up, use an effect. If you can only check for it, use a marker. A test that needs a PostgreSQL database running should have an effect that starts one. A test that needs psql to be installed should have a marker that checks for it.
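
The two mechanisms combine naturally on one test — a marker asserts that the client tool exists, while an effect provisions the database (a sketch; StartDb is the effect from the earlier effects challenge, assumed here to start a local PostgreSQL):

```
# skip unless which("psql")
test "query the database" {
    start StartDb
    shell s {
        > psql -c "select 1"
        <? \(1 row\)
        match_prompt()
    }
}
```

If psql is missing, the test is skipped with a clear reason; if it is present but the database fails to start, that is a real failure worth reporting.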

Choose the marker that reads like intent

# run if "${CI}" and # skip unless "${CI}" are logically identical — both skip the test when CI is not set. The difference is how they communicate intent to someone reading the test file.

Use # run if ... when the condition describes the target environment: “this test runs in CI.” Use # skip unless ... when the condition describes a requirement: “skip this test unless docker is available.” The marker should read like a sentence that explains why the test might not run.

Understand effect skip propagation

Putting a marker on an effect skips every test that depends on it. This is powerful but can be surprising. If an effect is shared by many tests, a single marker on that effect gates a large part of the suite. Before adding a marker to a widely-used effect, consider whether the marker belongs on the individual tests instead.

Try it yourself

  1. Write a test that only runs on macOS. Use a pure function that calls which("sw_vers") (a macOS-specific binary) to detect the platform, and a # skip unless ... marker.

  2. Write an effect DockerReady that guards itself with # skip unless which("docker"). Have it start a container in its shell block. Then write a test that starts DockerReady — verify that the test is skipped when docker is not available, without needing its own marker.

  3. Write a test with two markers: one that restricts it to CI (# run if "${CI}") and one that skips it when a feature flag is disabled (# skip unless "${ENABLE_SLOW_TESTS}"). Think about what happens in each combination of those two variables.


Next: The CLI — complete coverage of relux new, check, run, and history

The CLI

Previous: Condition Markers

This is the final article in the tutorial series. You have come a long way — from your first test through send and match, variables, functions, effects, modules, and condition markers. You now know the entire Relux DSL. Congratulations — that is a real achievement.

This article covers the tool that drives everything: the relux binary itself. You have already used relux new, relux check, and relux run throughout the series. Here we go deeper into every subcommand, every flag, and the workflows they enable.

Here is a typical development cycle, end to end:

relux new                           # scaffold a project
relux new --test smoke/login        # create a test module
# ... write the test ...
relux check                         # validate without running
relux run                           # execute the full suite
relux run --rerun                   # re-run only the failures
relux history --flaky               # spot intermittent tests

Each of these commands has options that give you precise control over what runs, how it runs, and what output you get.

relux new

The new subcommand scaffolds projects and modules. Without any flags, it initializes a new Relux project in the current directory:

relux new

This creates:

Relux.toml
relux/
  .gitignore
  tests/
  lib/

The generated Relux.toml has all values commented out, showing the defaults. The .gitignore excludes out/ — the directory where test run output goes.

Running relux new in a directory that already has a Relux.toml is an error. The command will not overwrite an existing project.

Scaffolding modules

To create a test module:

relux new --test auth/login

This creates relux/tests/auth/login.relux with a starter test you can run immediately. The path you provide maps directly to the filesystem under relux/tests/. Intermediate directories are created automatically.

To create an effect module:

relux new --effect services/database

This creates relux/lib/services/database.relux with a skeleton effect definition. Effect modules go under relux/lib/, matching the module resolution rules you already know.

Module paths must follow snake_case rules: lowercase letters, digits, and underscores. Each segment must start with a letter or underscore. The .relux extension is added automatically — you do not need to include it.

The --test and --effect flags are mutually exclusive. You can create one or the other per invocation.

relux check

The check subcommand validates test files without executing them. It runs the full front end of the pipeline — lexer, parser, and resolver — catching syntax errors, unresolved names, invalid imports, and circular dependencies. No shells are spawned.

relux check

Without arguments, it checks everything under relux/tests/. You can also target specific files or directories:

relux check relux/tests/auth/
relux check relux/tests/smoke/login.relux

On success, it prints check passed to stderr. On failure, it prints diagnostic errors with source locations and exits with status 1.

Flags:

  • --manifest PATH — use a specific Relux.toml instead of auto-discovering one

relux run

The run subcommand executes tests. This is the main event — everything else in the CLI exists to support it.

relux run

Without arguments, it runs all tests under relux/tests/. Use -f (or --file) to target specific files or directories:

relux run -f relux/tests/smoke/
relux run -f relux/tests/auth/login.relux -f relux/tests/auth/signup.relux

Use -t (or --test) to run specific tests by name within a single file:

relux run -f relux/tests/auth/login.relux -t "login with valid credentials"

The --test flag requires exactly one --file and can be repeated to select multiple tests.

Parallel execution

By default tests run sequentially. The -j (or --jobs) flag sets the number of parallel workers:

relux run -j 4

You can also set the default in Relux.toml:

[run]
jobs = 4

The CLI flag overrides the config value. Each test gets its own isolated set of effects — parallel tests never share state.

When running in parallel, the final summary reports both wall-clock time and cumulative (sum of all workers) time.

Progress and strategy

Two flags control the experience during a run:

relux run --progress tui --strategy fail-fast

--progress controls the progress display mode:

  • auto — show a live TUI when connected to a TTY, plain output otherwise (default)
  • plain — print only result lines as tests finish, no live progress
  • tui — force the live TUI even when not connected to a TTY

--strategy controls what happens when a test fails:

  • all — run every test regardless of failures (default)
  • fail-fast — stop at the first failure

Timeout multiplier

The -m (or --timeout-multiplier) flag scales tolerance timeouts:

relux run -m 2.0

This doubles every tolerance (~) timeout in the suite. If a shell-scoped ~10s would normally wait 10 seconds, with -m 2.0 it waits 20 seconds.

Critically, assertion (@) timeouts are never scaled. An @2s timeout means “the system must respond within 2 seconds” — that is a correctness check, and stretching it would defeat its purpose.

The default multiplier is 1.0. It must be a positive finite number.
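
The scaling rule can be summarized in a few lines. This is a sketch of the behavior described above, not Relux's code; `kind` stands for the timeout prefix, `"~"` or `"@"`:

```python
import math

def effective_timeout(seconds: float, kind: str, multiplier: float = 1.0) -> float:
    """Scale tolerance ('~') timeouts by the multiplier; assertion ('@') timeouts are untouched."""
    if not (math.isfinite(multiplier) and multiplier > 0):
        raise ValueError("timeout multiplier must be a positive finite number")
    return seconds * multiplier if kind == "~" else seconds
```

With -m 2.0, a ~10s tolerance waits 20 seconds, while an @2s assertion still fails after 2 seconds.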

Re-running failures

After a run with failures, you can re-run only the failed tests:

relux run --rerun

This loads the latest run summary from relux/out/latest, identifies which tests failed, and runs only those. It ignores any --file flags you provide — the filter comes entirely from the previous run.

If there are no previous runs or no failed tests, the command exits cleanly with status 0.

Output artifacts

Every run creates a timestamped directory under relux/out/:

relux/out/
  run-2026-03-16-14-30-00-a1b2c3d4e5/
    artifacts/
    run_summary.toml
    index.html
    ...
  latest -> run-2026-03-16-14-30-00-a1b2c3d4e5/

The latest symlink always points to the most recent run. The run_summary.toml file stores the run summary — this is what --rerun and relux history read.

Two flags generate additional artifacts in the artifacts/ subdirectory:

relux run --tap --junit

--tap generates a TAP (Test Anything Protocol) file — a plain-text format understood by many CI systems.

--junit generates a JUnit XML file — the de facto standard for CI test result ingestion. Most CI platforms (Jenkins, GitHub Actions, GitLab CI) can parse JUnit XML to display test results in their UI.

Both flags can be used together. They are independent of each other and of the console output.

Flaky retries

Tests marked with the @flaky condition marker can be automatically retried on failure. The --flaky-retries flag sets the maximum retry count:

relux run --flaky-retries 3

By default, each retry applies an exponential timeout multiplier so that tolerance timeouts grow across attempts. The --flaky-multiplier flag controls the base of that multiplier (default: 1.5):

relux run --flaky-retries 3 --flaky-multiplier 2.0
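
One plausible reading of the exponential scheme (the exact schedule is internal to Relux and assumed here): the first run uses multiplier 1.0, and each retry multiplies tolerance timeouts by another factor of the base.

```python
def retry_multiplier(base: float, attempt: int) -> float:
    """Tolerance-timeout multiplier on a given attempt (attempt 0 = first run)."""
    return base ** attempt
```

Under this reading, with --flaky-multiplier 2.0 a ~5s tolerance would wait 5, 10, then 20 seconds across three attempts.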

Timeout overrides

You can override the per-test and suite timeouts from the command line, without editing Relux.toml:

relux run --test-timeout 2m --suite-timeout 1h

These accept the same humantime format as the config file (5s, 1m30s, 2h).
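
A rough parser for the subset of the humantime format shown above (the real format supports more units and combinations; this sketch handles only h, m, and s):

```python
import re

UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}

def parse_duration(text: str) -> int:
    """Parse strings like '5s', '1m30s', or '2h' into seconds."""
    parts = re.findall(r"(\d+)([hms])", text)
    # Reject inputs with leftover characters, e.g. '90' or '5x'.
    if not parts or "".join(n + u for n, u in parts) != text:
        raise ValueError(f"unrecognized duration: {text!r}")
    return sum(int(n) * UNIT_SECONDS[u] for n, u in parts)
```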

All run flags

Flag                  Short  Default        Purpose
--file                -f     relux/tests/   Test file or directory to run (repeatable)
--test                -t                    Run only tests with this name (repeatable; requires one --file)
--jobs                -j     1              Number of parallel test workers
--progress                   auto           Display mode: auto, plain, tui
--strategy                   all            Run strategy: all or fail-fast
--timeout-multiplier  -m     1.0            Scale tolerance (~) timeouts by this factor
--rerun                                     Re-run only failed tests from the latest run
--tap                                       Generate TAP artifact file
--junit                                     Generate JUnit XML artifact file
--flaky-retries                             Max retries for @flaky-marked tests
--flaky-multiplier           1.5            Exponential timeout multiplier base for retries
--test-timeout               from config    Override per-test timeout (humantime string)
--suite-timeout              from config    Override suite timeout (humantime string)
--manifest                   auto-discover  Path to Relux.toml

relux history

The history subcommand analyzes data from previous runs. It reads the run_summary.toml files stored in each run directory under relux/out/ and computes statistics across them.

You must specify exactly one analysis type:

--flaky

Shows the flakiness rate per test — how often each test alternates between passing and failing:

relux history --flaky

This is your first stop when a test starts intermittently failing.

--failures

Shows failure frequency and distribution by failure mode (timeout, assertion, etc.):

relux history --failures

This helps you spot patterns. If most failures are timeouts, you may need to adjust your timeout strategy. If they cluster around a specific assertion, there is a targeted bug.

--first-fail

Shows the most recent pass-to-fail regression per test:

relux history --first-fail

Useful for pinpointing when a test started breaking. Combined with your version control history, this helps trace failures back to specific changes.

--durations

Shows duration trends and statistics — min, max, mean, and trend across runs:

relux history --durations

Use this to catch tests that are getting progressively slower, or to identify outliers that might benefit from tighter timeouts.

Filters

All four analysis types support the same set of filters:

relux history --flaky --tests relux/tests/auth/ --last 10 --top 5

Flag             Purpose
--tests PATH...  Filter to specific test files or directories
--last N         Limit to the N most recent runs
--top N          Show only the top N results
--format         Output format: human (default, formatted tables) or toml (structured, machine-readable)
--manifest       Path to Relux.toml

The --format toml option is particularly useful for scripting — pipe the output into another tool or parse it programmatically.

relux completions

The completions subcommand installs shell completions for bash, zsh, or fish. Relux uses dynamic completions — the shell calls back into the relux binary at tab-press time, so completions stay up to date as the CLI evolves.

relux completions

Without any flags, it autodetects your shell from $SHELL and prints what it would do. To actually install:

relux completions --install

For bash and fish, completions are written to standard locations automatically. For zsh, you need to specify a directory in your fpath:

relux completions --shell zsh --install --path ~/.zsh/completions

You can override the shell and install path for any shell:

relux completions --shell fish --install --path ~/my-completions/relux.fish

Once installed, tab completion provides:

  • Subcommands and flags with help descriptions
  • .relux file paths for run, check, and dump commands
  • Relux.toml files for --manifest
  • Timeout presets for --test-timeout and --suite-timeout (multiplied from configured values)
  • Enum values like --progress auto|plain|tui and --strategy all|fail-fast

Flag             Purpose
--shell <shell>  Override shell detection: bash, zsh, or fish
--install        Write the completion script (dry-run without this flag)
--path <path>    Override the install path

Best practices

Use --rerun after fixing a failure

When a run has failures and you think you have fixed the issue, use relux run --rerun instead of re-running the full suite. This targets only the tests that failed last time, giving you faster feedback. Once the reruns pass, do a full relux run to confirm nothing else broke.

Match strategy to context

Use --strategy fail-fast during local development — you want to know about the first failure quickly so you can fix it. Use --strategy all in CI — you want a complete picture of the suite’s health, not just the first problem.

Start flakiness investigation with history

When a test starts failing intermittently, run relux history --flaky before digging into the test code. The flakiness rate tells you whether you are dealing with an environment issue (sporadic) or a logic bug (consistent). If the test passes 95% of the time, you are probably looking at a timing issue. If it passes 50% of the time, there may be a race condition or uncontrolled dependency.



Appendix A1: Best Practices


This appendix collects every best-practices guideline from the tutorial series into a single reference, grouped by topic. Each item links back to the article where it was originally introduced.

Project setup

From Getting Started:

Keep /bin/sh as the default shell

You might be tempted to configure your favorite shell — zsh, fish, bash — as the default in Relux.toml. After all, you use it every day and know its features well.

Resist the temptation. A custom shell means every developer on the team and every CI machine needs that shell installed and configured. /bin/sh is available everywhere, and the operations you need in integration tests — running commands, checking output, setting environment variables — work the same across POSIX shells. The interactive niceties of fancier shells (tab completion, syntax highlighting, advanced globbing) don’t matter when Relux is driving the terminal.

Only switch away from /bin/sh if your system under test genuinely requires a specific shell to function.

Leave timeouts at their defaults

The default match timeout of 5 seconds is generous for most commands. You might think “I’ll set timeout to 500 ms to speed up failure detection”. Don’t — not yet.

Timeout tuning is one of those things that should be driven by actual pain, not preemptive optimization. Tight timeouts cause flaky tests on slower machines or under CI load. The defaults are deliberately conservative. When you encounter a specific situation where the default is genuinely wrong — a command that reliably takes 30 seconds — that’s the time to tune. Relux provides fine-grained timeout control at the operator, shell, and test level, as covered in Timeouts.

The shell prompt must be static

The prompt configured in Relux.toml (default: relux> ) must be a fixed, unchanging string. Do not include dynamic elements like timestamps, git branch names, hostnames, or user-specific paths.

Relux uses the prompt as a reliable marker in the shell output stream. A prompt that changes between commands — or between machines — makes that marker unpredictable, which leads to flaky or outright broken tests. The default relux> is a good choice: short, distinctive, and the same everywhere.

The output buffer

From The Output Buffer:

Always match the prompt

If you are done examining a command’s output, match the prompt. Every time. This is the single most effective habit for avoiding flaky tests.

Without a prompt match, the cursor floats somewhere in the middle of the output. When the buffer contains output from a previous command that was not fully consumed, any pattern that appears in that leftover output will match there first. The cursor advances to an unexpected position, and subsequent matches silently verify stale data.

Matching the prompt anchors the cursor at a command boundary. It is cheap, it is predictable, and it eliminates an entire class of timing-related failures.

Check the exit code

A command can produce the expected output and still fail. Or it can fail silently, producing no output at all, while the match picks up something else entirely. Checking the exit code after a command catches these problems early:

test "verify success" {
    shell s {
        > mkdir -p /tmp/test-dir
        <= relux>
        > echo ==$?==
        <= ==0==
        <= relux>
    }
}

The echo ==$?== / <= ==0== pair is a cheap way to verify the previous command succeeded. The == delimiters are distinctive enough to avoid accidental substring matches. Without this check, a failing mkdir would go unnoticed — the test would continue with a missing directory and fail later with a confusing, unrelated error.

Buffer reset does not respect causality

The buffer reset operator (<= with no pattern) consumes everything currently in the buffer. That “currently” depends on timing — how much output the shell has printed by the instant Relux executes the reset.

If output is still arriving — a command is running, a log line is being flushed — the cursor might land in the middle of a line, or before a line that is about to appear. This creates a race condition: the test might pass on your machine and fail in CI, or pass nine times and fail on the tenth.

In almost every case, there is a better anchor than a buffer reset. Match the prompt. Match a specific log line. Match any known text that marks the boundary you actually care about. These anchors are causal — they mean “this specific thing happened” — rather than temporal — “this is where the buffer happened to be at this moment.”

Only use buffer reset when you are certain all relevant output has already arrived and there is no meaningful boundary to match against.

Regex matching

From Regex Matching:

Use regex only when you need it

You might default to <? everywhere since it is strictly more powerful than <= — any literal match can be written as a regex. But regex matches are harder to read, easier to get wrong, and can match more than you intended.

Literal match <= is a simple substring search. It does exactly one thing and it is obvious what it matches. When you do not need capture groups, anchors, or wildcards, <= is the better choice. Reserve <? for when you genuinely need regex capabilities: extracting values, matching variable output, or anchoring to line boundaries.

Always save captures to named variables

Capture groups like $1 are convenient — you match a pattern, and the extracted value is right there. It is tempting to use $1 directly in several places without saving it to a named variable first.

The problem is not with the code as you write it today. The problem is with the code as someone changes it five years from now. Test code is still code — it evolves, gets refactored, gets extended. Capture groups are silently replaced on every <? match. If someone inserts a new regex match between your capture and its use — a perfectly reasonable edit — $1 now refers to something completely different. No error, no warning, just a test that fails in a confusing way that takes hours to debug.

Save the capture to a named variable immediately after the match, before doing anything else. Then use the named variable everywhere:

// Fragile — $1 can be silently replaced by a later edit:
<? ^port=(\d+)$
> curl http://localhost:${1}/health

// Durable — the port is safe no matter what happens next:
<? ^port=(\d+)$
let port = $1
> curl http://localhost:${port}/health

The named variable survives any number of subsequent matches. It makes the code self-documenting (the name port says more than $1), and it insulates the test from future edits.

Anchor your patterns

A regex without anchors will match anywhere in the remaining buffer — the echoed command, a fragment of the prompt, leftover output from a previous step. This is the same problem as with literal match, but worse, because regex metacharacters like . and * match more broadly.

Use ^ and $ to pin your match to a specific line:

// Might match the echoed command or something unexpected:
<? version=\d+

// Matches exactly one complete line:
<? ^version=\d+$

This does not mean you should anchor every pattern — sometimes a substring regex is what you need. But when you have a choice, anchoring is safer: it documents your intent and prevents accidental matches.

Be careful with interpolated regex patterns

Variable interpolation in <? patterns lets you define reusable regex fragments — declare a pattern once at the test level and use it in multiple matches. This is handy for repeated patterns like timestamps, UUIDs, or version strings.

The catch is that after interpolation, the variable’s value becomes part of the regex. If the value contains regex metacharacters — ., *, +, (, [, and so on — they are interpreted as regex syntax, not as literal text. A pattern built from a variable holding 192.168.1.1 does not match only that literal IP address: each . matches any character, so the pattern also matches 192X168Y1Z1.

When the variable comes from your own let and you know the value, this is fine — just be aware of what you are putting into the pattern. When the variable comes from captured output or an environment variable, the content is unpredictable and the regex may compile into something you did not intend, or fail to compile entirely.
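
The same pitfall can be reproduced in Python's regex engine (whether Relux offers an escaping facility is not covered here; re.escape illustrates the general remedy of escaping metacharacters before interpolation):

```python
import re

ip = "192.168.1.1"

# Used directly as a pattern, each '.' matches any character,
# so the match is broader than the literal string:
assert re.search(ip, "192X168Y1Z1") is not None

# Escaping the metacharacters restores a literal match:
assert re.search(re.escape(ip), "192X168Y1Z1") is None
assert re.search(re.escape(ip), "host=192.168.1.1") is not None
```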

Functions

From Functions:

Captures do not survive function calls

You might call a function that internally uses <? and expect the capture groups ($1, $2, …) to be available in the caller afterward. This seems reasonable — the function ran a regex match, and captures are normally available after <?.

But captures are part of the variable scope. When a function returns, its entire scope — including captures — is discarded. The caller’s captures are restored to whatever they were before the call:

fn extract_port() {
    > echo "port=8080"
    <? ^port=(\d+)$
    // The last expression is match_ok(), whose return value is the
    // prompt string — not the captured port number.
    match_ok()
}

test "captures do not survive function calls" {
    shell s {
        // Wrong — $1 holds the caller's capture state, not the function's:
        extract_port()
        > echo "port=${1}"
        <? ^port=8080          // $1 is empty

        // Also wrong — the return value is the prompt string, because
        // match_ok() is the last expression in extract_port():
        let result = extract_port()
        > echo "result=${result}"
        <? ^result=8080        // result is the prompt, not "8080"
    }
}

The fix is to design the function to explicitly return what you need. Save the capture to a local variable before calling match_ok(), then return that variable as the last expression:

fn extract_port() {
    > echo "port=8080"
    <? ^port=(\d+)$
    let port = $1
    match_ok()
    port
}

Now let port = extract_port() in the caller gives you "8080".

This is consistent with the scoping model: functions cannot modify the caller’s variable state. Return values are the explicit, reliable channel for passing data back.

Leave the shell clean

When a function interacts with the shell — sending commands and matching output — it should leave the shell in a known state before returning. That means: consume the prompt and verify the exit code with match_ok() (or the appropriate match_not_ok variant) after the last command.

// Leaves the shell in an unknown state — the caller must
// know what output is left in the buffer:
fn check_server() {
    > curl -s http://localhost:8080/health
    <= healthy
}

// Leaves the shell clean — prompt consumed, exit code verified:
fn check_server() {
    > curl -s http://localhost:8080/health
    <= healthy
    match_ok()
}

A function that leaves unconsumed output or an unchecked exit code forces every caller to clean up after it. That coupling is invisible and fragile — it works until someone adds a new caller that forgets, or the function’s output changes slightly. Close every shell interaction with a clean handoff.

Do not rely on shared shell state

The caller and the function share a shell session. This means the function can read shell-side environment variables set by the caller, and the caller can read shell-side state left behind by the function. Both directions are tempting shortcuts — and both lead to brittle tests.

A function cannot predict the shell state of all its callers. Some callers have not been written yet. If a function depends on a shell-side variable that the caller must set beforehand, the requirement is invisible — nothing in the function signature or call site reveals it. Pass the value as a parameter instead.

In the other direction, a caller that depends on shell-side state set by a function is coupled to the function’s implementation details. If the function’s internals change — a different variable name, a different order of commands — the caller silently breaks.

If you genuinely cannot avoid relying on shared shell state, make it explicit with a comment at both the definition and call site explaining the dependency. But first, consider whether a parameter or return value would work instead.

Keep functions small

A function runs in the caller’s shell, so a long function body means a long sequence of sends and matches executing in someone else’s shell session. When something fails halfway through a 30-line function, the error points to a line inside the function — but understanding why it failed requires knowing what the caller’s shell looked like at the time of the call.

Prefer small functions that do one thing: check a status code, verify a service is running, send a login sequence. If you find a function growing beyond a handful of operations, consider splitting it into smaller pieces — so each has a clear, narrow purpose.

Pure functions

From Pure Functions:

Prefer pure fn when a function has no shell operators

You might write a helper as a regular function out of habit, because you first use it inside a shell block:

fn format_url(host, port) {
    "${host}:${port}/api"
}

This works fine in shell context. But later, when you want to use the same helper in a test-scope let or an overlay value, you discover it does not work — regular functions require a shell. You then have to go back and add the pure keyword.

Save yourself the trip: if a function body contains no shell operators, define it as pure fn from the start. It works in all the same places a regular function works, plus everywhere else.

Extract complex interpolation into a pure function

When string interpolation gets deeply nested, the intent can become hard to read:

test "nested interpolation" {
    let host = "localhost"
    let port = "5432"
    let db = "myapp"
    shell s {
        > psql "postgres://${host}:${port}/${db}?sslmode=disable"
        <? ^connected$
        match_prompt()
    }
}

This is manageable, but as the string grows — multiple parameters, conditional segments, repeated patterns — readability suffers. A pure function gives the construction a name and keeps the test body focused on intent:

pure fn pg_url(host, port, db) {
    "postgres://${host}:${port}/${db}?sslmode=disable"
}

test "extracted into pure function" {
    let host = "localhost"
    let port = "5432"
    let db = "myapp"
    shell s {
        let url = pg_url(host, port, db)
        > psql "${url}"
        <? ^connected$
        match_prompt()
    }
}

Timeouts

From Timeouts:

Use the multiplier for CI flakiness, not longer timeouts

When tests start failing on CI but pass locally, the tempting fix is to increase the timeouts in the test files. A ~2s becomes ~5s, then ~10s, and soon every test has generous timeouts that mask real performance regressions.

The multiplier exists for this problem. Keep your timeouts tight — reflecting how fast the system should respond — and use -m 2.0 or -m 3.0 on slow environments. This way, timeouts still catch genuine slowdowns on the developer’s machine while tolerating CI variability.

Choose the prefix, not the position

The ~ vs @ prefix is what determines whether a timeout is environmental tolerance or a system assertion. Both prefixes work at every level — shell scope, inline override, and test definition. Ask yourself: “is this about the environment or about the system?”

  • The CI server is slow → use ~ (tolerance), let -m scale it
  • One specific command is slower than the rest → use ~ with a larger value, or <~ on the match
  • The system must respond within 2 seconds → use @2s or <@2s?
  • The entire test must complete within a bound → use test "name" @5s

Reserve @ for real assertions

If you put @ on everything, the multiplier becomes useless — nothing scales, and slow environments fail. Use @ only when the time boundary is genuinely part of what you are testing. Most timeouts in a typical test suite should be ~ tolerances, with @ reserved for the few cases where timing is the assertion.

Fail patterns

From Fail Patterns:

Set fail patterns early

Place your !? or != as the first statement in a shell block, before any commands. This maximizes coverage — the pattern is active from the very first command output. A fail pattern set after several commands has no protection over the output those commands already produced (the immediate rescan will catch it if it’s in the buffer, but that turns a background monitor into a retroactive check, which is harder to reason about).

Use fail patterns for long-running services

Fail patterns are at their most valuable when testing long-running services that produce logs you don’t exhaustively match on. A web server, a database, a background worker — these emit output continuously, and you only match the specific lines that tell you the service is ready or responding correctly. A fail pattern like !? FATAL|panic|Segfault acts as a safety net across all that unmatched output. You focus your <= and <? operators on expected behavior; the fail pattern catches unexpected crashes in the background.

Don’t use fail patterns as assertions

Fail patterns are background monitors, not replacements for match operators. If you expect specific output, use <= or <? to match it. If you want to ensure something doesn’t appear, that’s what fail patterns are for. The distinction matters: match operators advance the output buffer cursor and participate in the test’s flow; fail patterns operate silently in the background and only surface when something goes wrong.

Combine multiple error strings with regex alternation

Since each shell has only one fail pattern slot, setting a second !? replaces the first. If you need to watch for multiple error patterns, combine them into a single regex using alternation:

shell s {
    !? ERROR|PANIC|FATAL|Segfault
    > start-my-service
    <? ready
    match_prompt()
}

Do not write:

shell s {
    !? ERROR
    !? PANIC
    !? FATAL
    > start-my-service
    <? ready
    match_prompt()
}

Only FATAL is active by the time start-my-service runs: each !? replaced the one before it, so the first two patterns are gone.

Effects and dependencies

From Effects and Dependencies:

Set fail patterns early in effects

Effects that start long-running services should set a fail pattern before the startup command, just like in a regular shell block. This maximizes coverage — any crash output during startup or during the test body triggers an immediate failure:

effect Service {
    expose service

    shell service {
        !? FATAL|ERROR|panic
        > start-my-service --foreground
        <? listening on port 8080
    }
}

The fail pattern is active from the first line. If the service crashes during startup, the fail pattern catches it before the readiness match even runs.

Deduplication and shared state

Because deduplication means two aliases can point to the same shell, mutations through one alias are visible through the other. This is by design — it is how effects like the database chain work, where each layer builds on the state left by the previous one. But it means you should be aware: if two unrelated parts of a test both alias the same effect instance, they share a single PTY session. Commands sent through one alias affect the shell the other alias sees.

If you need truly independent instances, give them different overlay values — even a dummy key is enough to create separate identities:

start MyEffect as a { INSTANCE = "1" }
start MyEffect as b { INSTANCE = "2" }

Cleanup

From Cleanup:

Do not use cleanup to stop services

It is natural to think of cleanup as the place to stop a database or kill a service you started during setup. But Relux already handles this: when a test ends, it terminates all effect and test shells, which kills any processes running in them. Services started in a shell block die automatically with the shell — they are children of the PTY, so when Relux terminates the shell, the process goes with it. Even if Relux itself is killed, the OS cleans up the PTY and its children.

Using cleanup to stop services is actually worse than relying on shell termination. Cleanup runs in a separate shell — it has no connection to the process running in the effect’s shell. If Relux crashes or is killed, cleanup never runs, and any service you expected cleanup to stop is left orphaned.

For the same reason, avoid starting daemonized or background services (processes that detach from the shell) during setup. A daemonized process is no longer a child of the PTY — it survives shell termination. If Relux is killed or terminated abnormally, neither shell termination nor cleanup can reach it, and it stays running indefinitely. Always run services in the foreground so they remain tied to the shell’s lifecycle.

Reserve cleanup for things that shell termination does not handle: removing files, cleaning up directories, collecting logs, or any other filesystem side effects that outlive the shell.

Keep cleanup self-contained

Cleanup can see top-level let variables, overlay variables (for effects), and environment variables — but it cannot see variables declared inside shell blocks or call functions. Shell-level let bindings and regex captures from the test body are not available.

Plan your cleanup around top-level variables. If a path or identifier is needed in both setup and cleanup, declare it with let at the effect or test level rather than inside a shell block.

Make cleanup idempotent

Cleanup runs regardless of whether setup completed successfully. If an effect’s shell block fails halfway through — the database started but the migration crashed — cleanup still runs. This means cleanup commands may encounter a partially initialized state: a file that was never created, a process that was never started, a directory that is already empty.

Write cleanup commands defensively. Assume nothing about what actually happened during setup — cleanup should be safe to run in any state, including when setup did nothing at all.

Condition markers

From Condition Markers:

Markers assert, effects provision

The distinction is:

  • Markers assert what the environment already has — an installed binary, a particular OS, a running CI server. These are things outside the test’s control.
  • Effects provision what the test needs — starting a service, creating a temp directory, seeding a database. These are things the test can set up and tear down.

If you can set it up, use an effect. If you can only check for it, use a marker. A test that needs a PostgreSQL database running should have an effect that starts one. A test that needs psql to be installed should have a marker that checks for it.

Choose the marker that reads like intent

# run if "${CI}" and # skip unless "${CI}" are logically identical — both skip the test when CI is not set. The difference is how they communicate intent to someone reading the test file.

Use # run if ... when the condition describes the target environment: “this test runs in CI.” Use # skip unless ... when the condition describes a requirement: “skip this test unless docker is available.” The marker should read like a sentence that explains why the test might not run.

Understand effect skip propagation

Putting a marker on an effect skips every test that depends on it. This is powerful but can be surprising. If an effect is shared by many tests, a single marker on that effect gates a large part of the suite. Before adding a marker to a widely-used effect, consider whether the marker belongs on the individual tests instead.

The CLI

From The CLI:

Use --rerun after fixing a failure

When a run has failures and you think you have fixed the issue, use relux run --rerun instead of re-running the full suite. This targets only the tests that failed last time, giving you faster feedback. Once the reruns pass, do a full relux run to confirm nothing else broke.

Match strategy to context

Use --strategy fail-fast during local development — you want to know about the first failure quickly so you can fix it. Use --strategy all in CI — you want a complete picture of the suite’s health, not just the first problem.

Start flakiness investigation with history

When a test starts failing intermittently, run relux history --flaky before digging into the test code. The flakiness rate tells you whether you are dealing with an environment issue (sporadic) or a logic bug (consistent). If the test passes 95% of the time, you are probably looking at a timing issue. If it passes 50% of the time, there may be a race condition or uncontrolled dependency.