rad:z3trNYnLWS11cJWC6BbxDs5niGo82
Radicle Improvement Proposals (RIPs)
RIP 4: Canonical References
Draft fintohaps opened 1 year ago
1 file changed +485 -0 329dee9a 6a41c3a8
added 0004-canonical-references.md
@@ -0,0 +1,485 @@
+
---
+
RIP: 4
+
Title: Canonical References
+
Author: '@fintohaps <fintan.halpenny@gmail.com>'
+
Status: Draft
+
Created: 2024-06-20
+
License: CC0-1.0
+
---
+

+
RIP #4: Canonical References
+
===========================
+

+
Overview
+
--------
+
This RIP aims to extend the functionality of creating and updating canonical
+
references in the Radicle storage layer[^0]. The current system is limited to
+
only computing a single canonical reference, the "default branch" of a
+
project[^1]. This proposal aims to define a way to generalise this system to
+
apply to all references. It defines a schema for rules that can be used to
+
compute canonical references. It will recommend a way to extend the existing
+
identity system to incorporate the new rules mechanism, in a
+
backwards-compatible fashion. Note that it will not define how to compute the
+
tip for the canonical reference.
+

+
Table of Contents
+
-----------------
+

+

+
1. [Overview](#overview)
+
2. [Background](#background)
+
   - [Default Branch](#default-branch)
+
3. [Canonical Reference Schema](#canonical-reference-schema)
+
   - [Validation](#validation)
+
4. [Setting Canonical References](#setting-canonical-references)
+
   - [Identity Reference](#identity-reference)
+
5. [Introducing the Schema](#introducing-the-schema)
+
   - [Threshold](#threshold)
+
   - [Version](#version)
+
   - [Canonical Reference Rules](#canonical-reference-rules)
+
   - [Identity Validation](#identity-validation)
+
6. [Future Work](#future-work)
+
7. [Appendix](#appendix)
+
   - [JSON Schema](#json-schema)
+
   - [Examples](#examples)
+

+
Background
+
--------------
+

+
### Default Branch
+
To provide some context for this proposal, let's first discuss the existing
+
mechanism for calculating the "default branch" for a project repository. A
+
project repository is one that defines a payload in its identity document, under
+
the key identifier `xyz.radicle.project`. It contains a field `defaultBranch`,
+
which is the mainline branch that delegates are expected to reach quorum on
+
while developing. This is usually a `master` or `main` branch.
+

+
In the Heartwood protocol, each node pushes to their own fork in the Radicle
+
storage -- under its own `refs/namespaces`. In our own terms, we say that a
+
reference is canonical if it exists under the top-level `refs`, as opposed to
+
`refs/namespaces`. For example, a delegate would push to their `main` branch
+
under `refs/namespaces/deadbeef/refs/heads/main`, then the canonical version of
+
this would live under `refs/heads/main`.
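
As a minimal sketch of the naming relationship described above, the following Python helper derives the canonical name from a namespaced reference. The function name `canonical_name` is hypothetical and not part of Heartwood; it only illustrates the prefix relationship between the two names and does no quorum calculation.

```python
NAMESPACE_PREFIX = "refs/namespaces/"


def canonical_name(namespaced_ref: str) -> str:
    """Strip the `refs/namespaces/<namespace>/` prefix from a reference name.

    Hypothetical helper for illustration only.
    """
    if not namespaced_ref.startswith(NAMESPACE_PREFIX):
        raise ValueError(f"not a namespaced reference: {namespaced_ref}")
    # Drop the prefix, then split off the namespace component.
    remainder = namespaced_ref[len(NAMESPACE_PREFIX):]
    _namespace, sep, rest = remainder.partition("/")
    if not sep or not rest.startswith("refs/"):
        raise ValueError(f"malformed namespaced reference: {namespaced_ref}")
    return rest
```

For the example in the text, `canonical_name("refs/namespaces/deadbeef/refs/heads/main")` yields `refs/heads/main`.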
+

+
The commit that the canonical reference points to is calculated based on the
+
set of delegates and some threshold. Both of these values are currently defined
+
at the top-level of the identity document under `delegates` and `threshold`,
+
respectively.
+

+
This proposal takes this approach and generalises it for any reference to be
+
made canonical. For example, projects often want `refs/tags` to exist at the
+
top-level of the reference storage for distributing release versions.
+

+
Canonical Reference Schema
+
--------------------------
+
In this section we introduce the definition of a single rule for canonical
+
references. Using the intuition we established from the [Default
+
Branch](#default-branch) section, three components are needed. The first is the
+
reference name – and to generalise this further we will use a `refspec-pattern`
+
so that we can capture whole categories of references. The second is the set of
+
allowed DIDs that are used to determine the quorum for the commit for the
+
canonical reference. The third, and final, component is the threshold required
+
to reach a quorum amongst the allowed set of DIDs.
+

+
Thus, we define a rule as a JSON object consisting of the following two keys:
+

+
- `threshold`: the number of "votes" required for promoting a reference tip to
+
  be canonical -- based on the `allow` tips.
+
- `allow`: the DIDs whose tips can be inspected for computing the canonical tip
+
  for the given reference. **Note** that these do not need to be the same as the
+
  identity document's delegates.
+

+
The key to this object is the reference name pattern that this rule applies to.
+
The format must follow `git-check-ref-format`[^2] using the
+
`--refspec-pattern`[^4] option. This allows for references to be specified using
+
a fully-qualified path, e.g. `refs/heads/main`, or a set of references under a
+
hierarchy using a glob star (`*`), e.g. `refs/tags/release/*`.
+

+
To make the mechanism more configurable, each rule has its own
+
`threshold` and `allow`. This is to ensure that we do not lock developers into
+
specific workflows and allow them to define how their project's canonical
+
references are set. To make it easier to configure, we also use a special symbol
+
`delegates` which means that the `delegates` of the identity payload are used
+
for the `allow` set of DIDs. This will not apply to `threshold`, which will be
+
discussed further in [Introducing the Schema](#introducing-the-schema).
+

+
### Validation
+
The JSON Schema[^3] for canonical references, which can be used for
+
validation, is provided below in [JSON Schema](#json-schema), and in this
+
section we will describe the validation in prose.
+

+
The key to a rule must be a reference name pattern and must be a string that
+
conforms to `git-check-ref-format --refspec-pattern`, whose length is in the
+
range `[1, 255]`.
+

+
The fields `threshold` and `allow` are mandatory.
+

+
- `threshold`: must be an integer in the range `[1, 255]`.
+
- `allow`: must either be an array of strings, where the length of the array is
+
  in the range `[1, 255]`, or the constant `delegates`. If the value is an
+
  array, then the length of the array must be at least `threshold`.
+
  For each delegate in the array the string must be a valid DID in the Radicle
+
  system, e.g. `did:key`.
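
The checks above can be sketched in Python as follows. This is an illustrative validator, not the Heartwood implementation: the function name `validate_rule` is hypothetical, DID validity is approximated by a `did:key:` prefix check, and the pattern is only length-checked (a real implementation would also apply `git-check-ref-format --refspec-pattern`). The bound between `threshold` and the size of `allow` follows the JSON Schema in the appendix, where `threshold` may not exceed the number of allowed DIDs.

```python
def validate_rule(pattern: str, rule: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the rule passed.

    Hypothetical helper for illustration only.
    """
    errors = []
    if not 1 <= len(pattern) <= 255:
        errors.append("pattern length must be in [1, 255]")

    threshold = rule.get("threshold")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if (isinstance(threshold, bool) or not isinstance(threshold, int)
            or not 1 <= threshold <= 255):
        errors.append("threshold must be an integer in [1, 255]")
        threshold = None

    allow = rule.get("allow")
    if allow == "delegates":
        pass  # resolved later against the identity document's delegates
    elif isinstance(allow, list) and all(isinstance(d, str) for d in allow):
        if not 1 <= len(allow) <= 255:
            errors.append("allow must contain between 1 and 255 DIDs")
        # Assumption: a prefix check stands in for full DID validation.
        if any(not d.startswith("did:key:") for d in allow):
            errors.append("allow must contain only did:key DIDs")
        if threshold is not None and threshold > len(allow):
            errors.append("threshold must not exceed the number of allowed DIDs")
    else:
        errors.append("allow must be an array of strings or 'delegates'")
    return errors
```

Unknown extra fields in `rule` are ignored by this sketch, in line with the forward-compatibility requirement that follows.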
+

+
To ensure that future additions to the rules are compatible, any extra fields
+
must be parsed but ignored. The implementation may want to restrict the length
+
of this data when parsing, to ensure that large amounts of data are not stored.
+

+
#### Priority of Applying Rules
+

+
When adding the set of rules to the repository, it is important to note that
+
each rule, for a given reference, will be unique – due to the reference name
+
pattern being the key to the rule object. We note, however, that since we allow
+
the `*` component in the reference name pattern, there may be an overlap in
+
which rule applies to a given reference name. The decision for what rule to
+
apply in these cases is determined by the *specificity* of the reference pattern
+
name.
+

+
For example, if there are three rules, `refs/tags/v1.0`, `refs/tags/*`, and
+
`refs/*`, we say that the rule is chosen in the following order:
+

+
    refs/tags/v1.0 < refs/tags/* < refs/*
+

+
Where `<` means "is more specific than", so that when rules are sorted, more
+
specific patterns appear earlier. To illustrate further, if the reference we are
+
checking is `refs/tags/v1.0`, then the first rule will apply. If the reference
+
is `refs/tags/v2.0`, then the `refs/tags/*` rule will apply. Finally, if the
+
reference is `refs/heads/trunk`, then the `refs/*` will apply.
+

+
A more complicated example is when the `*` component is in the middle of the
+
reference name pattern, e.g. `refs/tags/*/v1.0`. Given rules for
+
`refs/tags/*/v1.0`, `refs/tags/releases/*`, and `refs/tags/*` we should expect
+
the following cases to match the given rule:
+

+
| Reference Name            | Matching Rule          |
+
| ------------------------- | ---------------------- |
+
| `refs/tags/v1.0`          | `refs/tags/*`          |
+
| `refs/tags/releases/v1.0` | `refs/tags/releases/*` |
+
| `refs/tags/qa/v1.0`       | `refs/tags/*/v1.0`     |
+

+
##### Priority Specification
+

+
First we define the number of components for a pattern as the number of occurrences
+
of the symbol `/` in the given pattern, represented as `components(φ)`
+
where `φ` is the pattern.
+

+
We also define `φ[i]` as the path component of `φ` at index `i`.
+

+
For two patterns `φ` and `ψ`, we say that "`φ` is more specific than `ψ`",
+
denoted `φ < ψ` if:
+

+
1. `components(φ) > components(ψ)`
+
   The justification is that references might be interpreted as a hierarchy where
+
   a match on more components would mean a match at a lower level in the
+
   hierarchy, thus being more specific. Imagine a hierarchy that maps to
+
   a corporate hierarchy. The pattern "department-1" matches all references that
+
   are administered by a particular department, and thus is not very specific.
+
   To contrast, the pattern "department-1/team-a/project-i/nice-feature" is very
+
   specific as it matches all refnames that relate to the development of a
+
   particular feature for a particular project by a particular team. Note that
+
   this would also apply when the connection between the `φ` and `ψ` is not as
+
   obvious, e.g. `a/b/c/d/* < */x` also holds.
+
2. We say that `φ[i] < ψ[i]` if:
+
     a. `φ[i]` does not contain an asterisk and `ψ[i]` contains an asterisk,
+
        i.e. the symbol `*`, e.g. `a < *` and `abc < a*`.
+
        Note that this is important to capture specificity across
+
        components, i.e. to conclude that `a/b/* < a/*/c`.
+
     b. Both `φ[i]` and `ψ[i]` contain an asterisk.
+
         A. The asterisk in `φ[i]` is further right than the asterisk in `ψ[i]`,
+
            e.g. `aa* < a*`.
+
         B. The asterisk in `φ[i]` and `ψ[i]` is equally far to the right,
+
            and `φ[i]` is longer than `ψ[i]`, e.g. `a*b < a*`.
+
3. Otherwise, fall back to a lexicographic ordering.
+

+
Note that for items 2. and 3., one may assume that `components(φ) = components(ψ)`.
+

+
Here are examples, with the justification rule in parentheses:
+

+
    refs/tags/release/candidates/* <(1.)   refs/tags/release/* <(1.) refs/tags/*
+
    refs/tags/v1.0                 <(2.a.) refs/tags/*
+
    refs/heads/*                   <(3.)   refs/tags/*
+
    refs/heads/main                <(3.)   refs/tags/v1.0
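
One possible reading of this specification, sketched in Python. The names `component_cmp`, `specificity_cmp`, `pattern_matches`, and `select_rule` are hypothetical. Two points are assumptions rather than statements of the specification: components are compared left to right and the first differing component decides, and a `*` is taken to match any run of characters, including `/`.

```python
from functools import cmp_to_key


def component_cmp(a: str, b: str) -> int:
    """Compare two path components (item 2.); negative means `a` is more specific."""
    a_star, b_star = a.find("*"), b.find("*")
    if a_star == -1 and b_star == -1:
        return 0
    if a_star == -1:  # 2.a: only `b` contains an asterisk
        return -1
    if b_star == -1:
        return 1
    if a_star != b_star:  # 2.b.A: asterisk further to the right wins
        return -1 if a_star > b_star else 1
    if len(a) != len(b):  # 2.b.B: same asterisk position, longer wins
        return -1 if len(a) > len(b) else 1
    return 0


def specificity_cmp(phi: str, psi: str) -> int:
    """Negative if `phi` is more specific than `psi`, positive if less."""
    n_phi, n_psi = phi.count("/"), psi.count("/")
    if n_phi != n_psi:  # item 1.: more components is more specific
        return -1 if n_phi > n_psi else 1
    for a, b in zip(phi.split("/"), psi.split("/")):
        result = component_cmp(a, b)
        if result:
            return result
    # item 3.: fall back to lexicographic ordering
    return (phi > psi) - (phi < psi)


def pattern_matches(pattern: str, refname: str) -> bool:
    """Match a pattern with at most one `*` against a reference name."""
    if "*" not in pattern:
        return pattern == refname
    prefix, suffix = pattern.split("*", 1)
    return (len(refname) >= len(prefix) + len(suffix)
            and refname.startswith(prefix) and refname.endswith(suffix))


def select_rule(patterns, refname):
    """Return the most specific matching pattern, or None if none matches."""
    for pattern in sorted(patterns, key=cmp_to_key(specificity_cmp)):
        if pattern_matches(pattern, refname):
            return pattern
    return None
```

Under these assumptions, sorting `refs/*`, `refs/tags/*`, and `refs/tags/v1.0` with `specificity_cmp` places `refs/tags/v1.0` first and `refs/*` last, and `select_rule` reproduces the matching table given earlier for `refs/tags/*/v1.0`, `refs/tags/releases/*`, and `refs/tags/*`.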
+

+
Setting Canonical References
+
---------------------------
+
In this section we will recommend how the canonical reference rules should be
+
used, without defining how to compute the commit SHA that the reference should
+
point to.
+

+
When the commit SHA is to be calculated for a given reference, the name of the
+
reference is matched against the set of rules, where the correct priority is
+
used (see [Priority of Applying Rules](#priority-of-applying-rules)). If there
+
is a corresponding rule, the `allow` set is consulted for the set of DIDs to
+
use. The set of references to consult for the final SHA can thus be calculated
+
by combining the given set of DIDs and the reference name being matched – as per
+
the storage layout[^0].
+

+
Once all the commits are found for each DID, the agreed commit must pass the
+
specified `threshold` in the rule. Only then can the canonical reference be set
+
to the agreed upon commit and promoted to the top-level `refs` namespace[^5].
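
To make the flow concrete, here is a deliberately naive Python sketch. This RIP does not define how the tip is computed, so the tally below (counting identical commit SHAs) is only a stand-in and not the Heartwood algorithm. The names `namespaced_ref` and `agreed_commit` are hypothetical, and the mapping from a `did:key` DID to its namespace (dropping the `did:key:` prefix) is an assumption about the storage layout.

```python
from collections import Counter


def namespaced_ref(did: str, refname: str) -> str:
    """Build the per-DID reference to consult for a canonical reference name.

    Assumption: the namespace is the DID without its `did:key:` prefix.
    """
    return f"refs/namespaces/{did.removeprefix('did:key:')}/{refname}"


def agreed_commit(tips: dict, threshold: int):
    """Return the commit that reaches `threshold` votes, or None.

    `tips` maps each allowed DID to the commit SHA its reference points to;
    DIDs that do not have the reference are simply absent from the mapping.
    Naive stand-in: only identical SHAs count as agreement.
    """
    votes = Counter(tips.values())
    winners = [sha for sha, count in votes.items() if count >= threshold]
    # Refuse to pick when no commit, or more than one commit, passes.
    return winners[0] if len(winners) == 1 else None
```

Returning `None` when several commits pass the threshold is a design choice of this sketch: it avoids promoting a reference when the allowed DIDs disagree.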
+

+
---
+

+
**Note**: this works under the assumption of the current layout of the storage
+
where a single DID, `did:key`, corresponds to a single namespace. This is
+
subject to change in the future when the protocol begins to support multiple
+
devices and unifying them under a single DID. This would mean that commits on
+
multiple devices may contribute only as a single vote, rather than multiple
+
votes. For example, this may be a recursive calculation of canonical references,
+
where the DID needs to calculate its commit, to then contribute to the vote of
+
the overall project.
+

+
---
+

+
### Handling `rad` References
+

+
There is currently a set of references, `refs/rad/id`, `refs/rad/root`, and
+
`refs/rad/sigrefs` that are special references. These references must not be
+
allowed to have rules, thus any rule whose pattern is prefixed with `refs/rad` should be
+
rejected from being persisted to the final set of rules.
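
A small Python sketch of this filtering step. The function name `reject_rad_rules` is hypothetical, and the prefix test is interpreted per path component (so `refs/rad/id` is rejected, while a pattern such as `refs/radish/x` is not); that interpretation is an assumption.

```python
def reject_rad_rules(rules: dict):
    """Split rules into those that may be persisted and the rejected patterns.

    Hypothetical helper for illustration only.
    """
    accepted, rejected = {}, []
    for pattern, rule in rules.items():
        # Assumption: "prefixed `refs/rad`" means under the `refs/rad` hierarchy.
        if pattern == "refs/rad" or pattern.startswith("refs/rad/"):
            rejected.append(pattern)
        else:
            accepted[pattern] = rule
    return accepted, rejected
```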
+

+
This may be subject to change in a future RIP – or edit of this RIP – if it becomes
+
the case that using the canonical references mechanism becomes an appropriate
+
way of setting certain special references.
+

+
Introducing the Schema
+
----------------------
+
In this section, we will recommend a path for introducing canonical reference
+
rules to the existing identity document. We opt to introduce the rules to the
+
top-level of the identity document. We do this, as opposed to introducing it via
+
`xyz.radicle.project` or a new payload, since these rules are general enough to
+
apply to any kind of repository.
+

+
They will be introduced underneath the key `canonicalRefs` which will, for now,
+
consist of an object with a single key `rules`. The associated value for the
+
`rules` key is an object consisting of all the rules (defined in [Canonical
+
Reference Schema](#canonical-reference-schema)).
+

+
### Threshold
+
Since the `threshold` field, in the identity document, was originally defined to
+
be used more generally, but ended up being used specifically for the default
+
branch, we opt to remove this when `canonicalRefs` is present in the document.
+

+
### Version
+
To preserve backwards-compatibility, we have decided to introduce a `version`
+
field to the identity document. Since the initial version of the document does
+
not contain this, any document missing this field is assumed to be
+
`version: 1`. Otherwise, this change introduces `version: 2`. The `threshold`
+
field may be parsed in the case of a `version: 1` document, and to preserve its
+
functionality, it is interpreted as the `threshold` of the canonical reference rule
+
for the `defaultBranch` – where the identity's `delegates` are used as the
+
`allow` of the rule.
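
The interpretation of a `version: 1` document can be sketched in Python as follows. The function name `rules_from_v1` is hypothetical, and mapping `defaultBranch` to a reference under `refs/heads/` is an assumption of this sketch.

```python
def rules_from_v1(doc: dict) -> dict:
    """Derive the implied canonical reference rule from a `version: 1` document.

    Hypothetical helper for illustration only.
    """
    if "version" in doc:
        raise ValueError("expected a document without a `version` field")
    project = doc["payload"]["xyz.radicle.project"]
    # Assumption: `defaultBranch` is a branch name under `refs/heads/`.
    refname = f"refs/heads/{project['defaultBranch']}"
    # The identity's delegates act as the `allow` set of the rule.
    return {refname: {"threshold": doc["threshold"], "allow": "delegates"}}
```

For a document with `threshold: 2` and `defaultBranch: "main"`, this yields a single rule keyed by `refs/heads/main` with `threshold` 2 and `allow` set to `delegates`.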
+

+
### Canonical Reference Rules
+
In a `version: 2` document, we introduce the `canonicalRefs` object. It is
+
optional; its absence implies that there are no rules. It is recommended to set a rule for
+
the default branch when the repository is a project. The absence of a canonical
+
default branch rule will mean that errors and warnings may occur in the protocol
+
and the surrounding tooling.
+

+
Note that since the rules are optional, this means that a `version: 1` document
+
may still be parsed into the format for `version: 2`.
+

+
### Identity Validation
+
With the above, we need to redefine the validation of the identity document. The
+
validation schema in RIP-2[^7] is extended to optionally include the
+
`canonicalRefs` and `version` field, and make `threshold` mutually exclusive
+
with `version` and `canonicalRefs`.
+

+
```json
+
{
+
  "$schema": "https://json-schema.org/draft/2020-12/schema",
+
  "type": "object",
+
  "properties": {
+
    "delegates": {
+
      "type": "array",
+
      "items": { "type": "string" },
+
      "minItems": 1,
+
      "maxItems": 255,
+
      "uniqueItems": true
+
    },
+
    "threshold": {
+
      "type": "integer",
+
      "minimum": 1,
+
      "maximum": 255
+
    },
+
    "payload": {
+
      "type": "object",
+
      "additionalProperties": { "type": "object" },
+
      "minProperties": 1
+
    },
+
    "version": {
+
      "type": "integer",
+
      "minimum": 2
+
    },
+
    "canonicalRefs": { "$ref": "/rules", "$comment": "Rules schema defined in the JSON Schema appendix" }
+
  },
+
  "required": ["delegates", "payload"],
+
  "allOf": [
+
    {
+
      "if": { "required": ["canonicalRefs"] },
+
      "then": { "required": ["version"] }
+
    }
+
  ]
+
}
+
```
+

+
Future Work
+
-----------
+
We may want to extend the permissions of this system in the future to provide
+
more fine-grained control of repository artifacts. For example, we may want to
+
explore the use of UCANs[^6] and override mechanisms that do not require the
+
`threshold` to be met (in case someone meets the terrible fate of a real bus
+
factor accident).
+

+
Something we considered but did not specify here is being able to define groups
+
for delegate sets, so that group identifiers can be used in place of explicit
+
sets.
+

+
Another aspect discussed was to consider more complex `threshold` schemes that
+
can support fractional voting, inspired by the KERI specification[^8].
+

+
Finally, as mentioned in [Setting Canonical
+
References](#setting-canonical-references), we will have to consider what
+
canonical references mean in the case of more sophisticated DIDs, in comparison
+
to `did:key`. This will likely be discussed in a RIP that discusses multi-device
+
support within the protocol.
+

+
Appendix
+
--------
+

+
### JSON Schema
+
```json
+
{
+
  "$schema": "http://json-schema.org/draft-07/schema#",
+
  "title": "Canonical References",
+
  "description": "A mapping of canonical references with specific thresholds and allowed delegates.",
+
  "type": "object",
+
  "patternProperties": {
+
    "^refs/.*$": {
+
      "type": "object",
+
      "properties": {
+
        "threshold": {
+
          "oneOf": [
+
            {
+
              "type": "integer",
+
              "minimum": 1,
+
              "maximum": 255
+
            },
+
            {
+
              "type": "string",
+
              "const": "delegates"
+
            }
+
          ]
+
        },
+
        "allow": {
+
          "oneOf": [
+
            {
+
              "type": "array",
+
              "minItems": 1,
+
              "maxItems": 255,
+
              "uniqueItems": true,
+
              "items": {
+
                "type": "string"
+
              }
+
            },
+
            {
+
              "type": "string",
+
              "const": "delegates"
+
            }
+
          ]
+
        }
+
      },
+
      "required": ["threshold", "allow"],
+
      "allOf": [
+
        {
+
          "if": {
+
            "properties": {
+
              "threshold": {
+
                "type": "integer"
+
              },
+
              "allow": {
+
                "type": "array"
+
              }
+
            }
+
          },
+
          "then": {
+
            "properties": {
+
              "threshold": {
+
                "type": "integer",
+
                "maximum": {
+
                  "$data": "1/allow/length"
+
                }
+
              }
+
            }
+
          }
+
        }
+
      ]
+
    }
+
  },
+
  "additionalProperties": false
+
}
+
```
+

+
### Examples
+
```json
+
{
+
  "refs/heads/main": {
+
    "threshold": 2,
+
    "allow": [
+
      "did:key:z6MkpQTLwr8QyADGmBGAMsGttvWzP4PojUMs4hREZW5T5E3K",
+
      "did:key:z6MknG1nYDftMYUQ7eTBSGgqB2PL1xK5Pif33J3sRym3e8ye"
+
    ]
+
  },
+
  "refs/tags/releases/*": {
+
    "threshold": 3,
+
    "allow": [
+
      "did:key:z6MknLWe8A7UJxvTfY36JcB8XrP1KTLb5HFTX38hEmdY3b56",
+
      "did:key:z6Mkq2E5Se5H9gk1DsL1EMwR2t4CqSg3GFkNN2UeG4FNqXoP",
+
      "did:key:z6MkqRmXW5fbP9hJ1Y8j2N4CgVdJ2XJ6TsyXYf3FQ2NJgXax"
+
    ]
+
  }
+
}
+
```
+

+
```json
+
{
+
  "refs/heads/development": {
+
    "threshold": "delegates",
+
    "allow": "delegates"
+
  },
+
  "refs/heads/release/*": {
+
    "threshold": 1,
+
    "allow": [
+
      "did:key:z6MkhH7ENYE62JAjTiRZPU71MGZ6xCwnbyHHWfrBu3fr6PVG"
+
    ]
+
  }
+
}
+
```
+

+
```json
+
{
+
  "refs/tags/v1.0": {
+
    "threshold": 2,
+
    "allow": [
+
      "did:key:z6Mkn3kFsaHYZtMBWh4Fs1ZbW8KwF4xnGFeaY2R7YK4vMQLx",
+
      "did:key:z6Mknq7FM5F4QMb56nLZ4YTChcHfA1fQg3qRAABv8mE8H4fK"
+
    ]
+
  },
+
  "refs/tags/v2.0": {
+
    "threshold": "delegates",
+
    "allow": "delegates"
+
  }
+
}
+
```
+

+
[^0]: https://app.radicle.xyz/nodes/seed.radicle.garden/rad:z3trNYnLWS11cJWC6BbxDs5niGo82/tree/0003-storage-layout.md
+
[^1]: https://app.radicle.xyz/nodes/seed.radicle.garden/rad:z3trNYnLWS11cJWC6BbxDs5niGo82/tree/0002-identity.md#repository-identity
+
[^2]: https://git-scm.com/docs/git-check-ref-format
+
[^3]: https://json-schema.org
+
[^4]: https://git-scm.com/docs/git-check-ref-format#Documentation/git-check-ref-format.txt---refspec-pattern
+
[^5]: https://app.radicle.xyz/nodes/seed.radicle.garden/rad:z3trNYnLWS11cJWC6BbxDs5niGo82/tree/0003-storage-layout.md#layout
+
[^6]: https://github.com/ucan-wg
+
[^7]: https://app.radicle.xyz/nodes/seed.radicle.garden/rad:z3trNYnLWS11cJWC6BbxDs5niGo82/tree/0002-identity.md#validation
+
[^8]: https://trustoverip.github.io/tswg-keri-specification/#fractionally-weighted-threshold