---
title: Radicle native CI
subtitle: Requirements and architecture
---
# Introduction
This document explains the purpose of the Radicle native CI component,
the requirements put on it, and its software architecture.
# Overview
CI support in Radicle consists of several components. For native CI
they are:
* the Radicle node
* the CI broker
* the native CI executable
These all have to run on the same host: the node and broker
communicate via a Unix domain socket, and the broker spawns the native
CI executable.
See the CI broker architecture documentation for a more in-depth
description of CI in Radicle.
The child process that the broker spawns, here the native CI
executable, is called "the CI adapter" in this document.
Native CI works like this:
* reads a request message from its standard input
* writes a response message saying it starts a run, to its standard
output
* clones the git repository in the request
* switches to the commit in the request
* reads the `.radicle/native.yaml` file in the repository
* executes the shell snippet in the `.radicle/native.yaml` file
* writes a response message with the result of the run
* writes a log file based on what it did
* updates the `index.html` page that lists all CI runs and their
results
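
The message exchange in the steps above can be sketched as a single
function. This is a minimal illustration in Python, not the actual
implementation. The message field names (`request`, `response`,
`result`, `repo`, `commit`) and the `run_ci` callback are assumptions
made for this sketch; the real message format is described in the CI
broker documentation. Writing the log file and updating `index.html`
are left out.

```python
import io
import json


def handle_request(stdin, stdout, run_ci):
    """Drive one CI run: read a request, announce the run, report the result.

    `run_ci(repo, commit)` is a hypothetical callback that stands in for
    cloning the repository, switching to the commit, reading
    `.radicle/native.yaml`, and executing its shell snippet. It returns
    True if the run succeeded.
    """
    # The request is a single line of JSON on standard input.
    request = json.loads(stdin.readline())

    # Tell the broker that a run has started, before doing any work.
    stdout.write(json.dumps({"response": "triggered"}) + "\n")
    stdout.flush()

    # Clone, check out, read the YAML file, run the snippet.
    ok = run_ci(request["repo"], request["commit"])

    # Report the outcome as a second response message.
    result = "success" if ok else "failure"
    stdout.write(json.dumps({"response": "finished", "result": result}) + "\n")
    stdout.flush()
    return ok


# Example with in-memory streams instead of real stdin and stdout.
# The repository ID and commit are made-up placeholder values.
request_line = json.dumps(
    {"request": "trigger", "repo": "rad:example", "commit": "abc123"}
)
out = io.StringIO()
handle_request(io.StringIO(request_line + "\n"), out, lambda repo, commit: True)
responses = [json.loads(line) for line in out.getvalue().splitlines()]
```

Passing the streams and the callback as parameters keeps the message
flow testable without a node or a real repository.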
## Native CI

The steps above show the happy path. Various things can go wrong
after the native CI executable has started. (In this document we don't
need to consider other possible failures.) The test suite for native
CI verifies that all of the failures listed below are handled
correctly, either by explicitly testing each case, or by relying on
analysis that generic error handling copes with the case. See the
`test-suite` program in the source tree.
* the environment variable specifying the configuration file is not
set
* can't read or parse the configuration file
* the configuration file does not specify all mandatory fields
* the configuration file specifies values that are wrong in some way
* stdin is empty
* stdin does not contain a newline
* the first line of stdin can't be parsed as a message serialized as
JSON
* the message is not a trigger message
* the repository triggered does not exist
* the repository can't be cloned
* the repository does not have the requested commit
* the repository does not contain `.radicle/native.yaml`
* `native.yaml` can't be read or parsed as YAML
* `native.yaml` does not contain a text field `shell`
* writing first response to stdout fails
* there is any problem executing the contents of the `shell` field
using `bash`
* executing the shell snippet takes too long
* generating or writing a "run metadata" file fails
* writing second response to stdout fails
* finding or parsing all run metadata files fails
* generating or writing the static web pages listing all runs fails
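
The four failures in the list above that concern standard input can
be told apart with a few explicit checks. The following Python sketch
shows one way to do that. The `RequestError` type, the `parse_trigger`
and `describe` helpers, and the `request` key with the value
`trigger` are assumptions for illustration, not the adapter's actual
code or message format.

```python
import json


class RequestError(Exception):
    """Raised when standard input does not hold a usable trigger message."""


def parse_trigger(data):
    """Return the trigger message in `data`, or raise RequestError."""
    if not data:
        raise RequestError("stdin is empty")
    if "\n" not in data:
        raise RequestError("stdin does not contain a newline")
    # Only the first line is considered to be the message.
    first_line = data.split("\n", 1)[0]
    try:
        message = json.loads(first_line)
    except json.JSONDecodeError as err:
        raise RequestError(f"first line is not valid JSON: {err}") from err
    # The "request" key and the "trigger" value are assumed names.
    if not isinstance(message, dict) or message.get("request") != "trigger":
        raise RequestError("message is not a trigger message")
    return message


def describe(data):
    """Return a short label for how `data` is handled, for demonstration."""
    try:
        parse_trigger(data)
        return "ok"
    except RequestError as err:
        return str(err)
```

Giving each case its own error message means the node owner can tell
from the log which of these failures actually happened.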
# Requirements
Overall, the native CI engine, or adapter, is very simple. However, it
must be robust, which makes things more difficult. Here, robust means
that whatever happens, the node owner finds out what it was: if a run
fails for whatever reason, the node owner can figure out why. Ideally,
anyone else watching CI on the node can figure it out as well.
In the descriptions of the requirements we use the following roles:
* "developer" makes changes to the repository on which CI is run
* "node owner" runs the node itself
The native CI engine has several ways to report what it does:
* its standard error output
  - in a systemd setup this is captured to the system log or journal
* a per-node log file for native CI
  - this is for things that interest only the node owner, not the
    developer
  - e.g., finding configuration errors that the developer can't fix,
    such as a missing configuration file
* a per-run log file
  - this is of interest to both the node owner and the developer
  - this is the primary tool for the developer to figure out what went
    wrong in their CI run, so that they can change their repository to
    fix it
## Developer can see the status of each CI run on a node
_Requirement:_ The developer can see what CI runs a node has
triggered, and what the current status of each is.
_Justification:_ This lets them be reassured that CI is working.
_Implementation:_ Native CI maintains one or more web pages that list
every run. For each run, the following is recorded:
## Developer gets a useful run log
_Requirement:_ The developer can fetch a useful log of a run that
helps them find problems in their code.
_Justification:_ This is crucial for the developer to have any hope of
fixing a problem found in CI.
_Implementation:_ The run log is a static file that can be fetched via
HTTP from the node, or viewed in a web browser. The run log contains
at least the following information:
* the repository ID
* the repository alias, if one is known to the local node
* the commit ID that triggered the run
* the commit diff (`git show`)
* when the run was triggered
* when the run finished
* the environment variables of the native CI process
* every command or other action that was taken during the run
* the standard output and standard error output, and the exit code, of
every command
* whether the run was considered successful or not
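
The per-command entries in the list above (output, exit code,
success) can be collected at the point where each command is run. The
sketch below uses Python's `subprocess` module. The `run_logged`
function and the layout of the record are assumptions for
illustration, not the format the native CI engine actually writes.
For a shell snippet, `argv` would be something like
`["bash", "-c", snippet]`; the example uses the Python interpreter
instead so that it does not depend on `bash` being installed.

```python
import subprocess
import sys


def _as_text(value):
    """Normalize captured output, which may be None, bytes, or str."""
    if value is None:
        return ""
    if isinstance(value, bytes):
        return value.decode(errors="replace")
    return value


def run_logged(argv, timeout):
    """Run `argv` and return a record suitable for a per-run log."""
    record = {"argv": list(argv), "timed_out": False, "exit_code": None}
    try:
        done = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired as err:
        # subprocess.run kills the child process before raising.
        record["timed_out"] = True
        record["stdout"] = _as_text(err.stdout)
        record["stderr"] = _as_text(err.stderr)
    else:
        record["exit_code"] = done.returncode
        record["stdout"] = done.stdout
        record["stderr"] = done.stderr
    # A command that timed out has no exit code and so never counts as success.
    record["success"] = record["exit_code"] == 0
    return record


# Example: run a trivial command and keep its record.
example = run_logged([sys.executable, "-c", "print('hello')"], timeout=30)
```

Recording a timeout as its own field, rather than as a fake exit
code, lets the run log say plainly that the snippet took too long.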
## Node owner is informed via system log if CI fails early
_Requirement:_ If a native CI run fails early, it writes a message to
its standard error output.
_Justification:_ The standard error is captured by systemd, and
written to the system log or journal, from where the node owner can be
expected to find it. This gives them a chance to find out what's wrong
and hopefully fix it.
"Early" here means any time before all of the following have
happened: the broker has been given a "result" response message, a
per-run log file has been created, and the web page of all CI runs has
been updated.
_Implementation:_ Use a suitable Rust logging library, with the
default log level allowing only error messages, and only logging an
error if something goes wrong early.
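
The policy can be illustrated with Python's standard `logging`
module. This is only a sketch of the idea: the implementation calls
for a Rust logging library, and the `Progress` class, the
`make_logger` and `report_failure` helpers, and the logger name are
all hypothetical.

```python
import io
import logging
from dataclasses import dataclass


@dataclass
class Progress:
    """Tracks the three milestones that end the "early" phase of a run."""

    result_sent: bool = False
    run_log_written: bool = False
    index_updated: bool = False

    def is_early(self):
        # The run is early until all three milestones have been reached.
        return not (self.result_sent and self.run_log_written and self.index_updated)


def make_logger(stream):
    """Return a logger that writes only ERROR and above to `stream`."""
    logger = logging.getLogger("native-ci-sketch")
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(logging.ERROR)
    return logger


def report_failure(logger, progress, message):
    """Log `message` as an error only if the run is still in its early phase."""
    if progress.is_early():
        logger.error(message)
        return True
    return False


# Example: an in-memory stream stands in for standard error.
stream = io.StringIO()
logger = make_logger(stream)
logger.info("run started")  # below the ERROR level, so not written
report_failure(logger, Progress(), "configuration file is missing")
```

After the example runs, `stream` holds only the error line; the
informational message is dropped by the level filter.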
## Only early failures are logged to the system log
_Requirement:_ The native CI engine only writes to its standard error
output when it fails early. Otherwise it only updates its per-run and
per-node log files.
_Justification:_ It's easy to spam the system log with many useless
messages, which makes it harder to find important information in the
log.
## The per-node log is updated when an early error occurs
_Requirement:_ If native CI writes an error message to the standard
error output, it is also written to the per-node log, with more
detail.
_Justification:_ The system log is a bad place to report detailed
information, as it's quite constrained. A per-node log provides more
flexibility.
_Implementation:_ Append to a per-node log, and if that fails, report
that, too, to the standard error output.
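
The append-with-fallback behaviour can be sketched in a few lines of
Python. The `report_early_error` function, its parameters, and the
message layout are assumptions for illustration only.

```python
import io
import os
import tempfile


def report_early_error(summary, detail, node_log_path, stderr):
    """Write `summary` to stderr, and `summary` plus `detail` to the per-node log.

    Returns True if the per-node log was updated, False otherwise.
    """
    stderr.write(f"ERROR: {summary}\n")
    try:
        # Mode "a" appends, and creates the file if it does not exist yet.
        with open(node_log_path, "a", encoding="utf-8") as log:
            log.write(f"ERROR: {summary}\n{detail}\n")
    except OSError as err:
        # The per-node log itself is broken: say so where the node owner looks.
        stderr.write(f"ERROR: cannot append to per-node log {node_log_path}: {err}\n")
        return False
    return True


# Example: a temporary directory and an in-memory stream stand in for
# the real log location and standard error.
demo_dir = tempfile.mkdtemp()
demo_log = os.path.join(demo_dir, "node.log")
demo_stderr = io.StringIO()
report_early_error(
    "cannot parse configuration file",
    "a longer explanation that would be too much for the system log",
    demo_log,
    demo_stderr,
)
```

The short summary goes to both places, so the node owner can match a
line in the system log to the detailed entry in the per-node log.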
# Test architecture
In order to test the native CI engine, we invoke it in various ways,
and examine its outputs.

In order for the native CI engine to work, it needs to clone from a
node. This is awkward for testing. Using a real node is possible, but
introduces more moving parts that can fail during tests. Using a test
double, or mock, as the node would be possible, but it would be more
work and would need somewhat tricky logic, which is likely to
introduce bugs.
We implement the test suite to use a specially set up local node, with
a repository with contents for tests. We will create the node as part
of the test suite so that it has exactly the content we need for the
tests.
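
Invoking the engine and examining its outputs can be sketched as a
small helper. This is a hypothetical illustration in Python, not the
actual `test-suite` program. The `invoke_adapter` function is an
assumption, and `FAKE_ADAPTER` is a stand-in script with made-up
message fields, used here so the example runs without the real
executable or a node.

```python
import json
import subprocess
import sys


def invoke_adapter(argv, stdin_text, env=None, timeout=60):
    """Run the adapter once and return (exit_code, responses, stderr_text).

    `env` can carry the environment for the child process, for example
    the variable that names the configuration file. None means the
    child inherits the current environment.
    """
    done = subprocess.run(
        argv,
        input=stdin_text,
        capture_output=True,
        text=True,
        env=env,
        timeout=timeout,
    )
    # Each non-empty line of standard output is expected to be one JSON message.
    responses = [json.loads(line) for line in done.stdout.splitlines() if line.strip()]
    return done.returncode, responses, done.stderr


# A stand-in adapter: reads one line, then writes two response messages.
FAKE_ADAPTER = (
    "import sys, json\n"
    "sys.stdin.readline()\n"
    "print(json.dumps({'response': 'triggered'}))\n"
    "print(json.dumps({'response': 'finished', 'result': 'success'}))\n"
)

exit_code, responses, stderr_text = invoke_adapter(
    [sys.executable, "-c", FAKE_ADAPTER], '{"request": "trigger"}\n'
)
```

A test for one of the failure cases would call the same helper with
different input or environment and then check the exit code, the
responses, and the standard error output.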