Add RIP #1 draft
Alexis Sellier committed 3 years ago
commit 8cd1561688462e2fcb234956b6f5916c42bdf4fa
---
RIP: 1
Title: Heartwood
Author: '@cloudhead <cloudhead@radicle.xyz>'
Status: Draft
Created: 2022-09-06
License: CC0-1.0
---

RIP #1: Heartwood
=================
In this RIP, we define a major iteration of the Radicle network protocol and
the various related sub-systems. We call it "Heartwood".

The intent of this proposal is not to define a complete specification of the
protocol, but to serve as a foundation for subsequent RIPs. Various aspects
of the protocol, in particular privacy, censorship resistance, peer
misbehavior and denial of service, are left for future RIPs to expand on,
and won't be tackled here. Additionally, details and specifics of the wire
protocol, message formats, hash functions and encodings are deliberately
left out, to focus this proposal on the big ideas.

> This document requires little to no prior knowledge of existing Radicle
> protocols; it is written with the intent of being complete and self-contained.

Overview
--------
The Radicle network protocol can be defined through the primary use-case of a
peer-to-peer code hosting network:

*Alice publishes a repository on the network under a unique identifier, and
Bob, using that identifier, is able to retrieve it and verify its authenticity.*

The above must hold true independent of the network topology and the number of
nodes hosting the project, as long as at least one node does. We can therefore
say that the primary function of the protocol is to locate repositories on
the network and serve them to users, all in a timely and resource-efficient
manner.

This functionality can be broken down into three components:

1. Locating repositories, i.e. finding which nodes host a given repository
2. Replicating a given repository between two nodes, such that they both
   hold a copy
3. Verifying the authenticity of all the data retrieved from the network, such
   that any node can serve any data without the need for trust

To achieve (1), nodes need to exchange information about which repositories
they host, so that they can point users to the right locations, as well as
notify each other when there are updates to the repositories. This in turn
requires *peer* discovery: nodes need a way to find each other on the network.

To achieve (2), Git is used, for its excellence as a replication protocol.

To achieve (3), given the nature of peer-to-peer networks, i.e. that any node
can join the network, Git alone is not enough. Replication through Git needs to
be paired with a way of verifying the authenticity of the data being
replicated. While Git checks for data *integrity*, our protocol has to
make sure that the data Bob downloads is the data Alice published on the
network. Without such verification, an intermediary node on the network could
easily tamper with the data before serving it to Bob. We also can't require
that Alice serve her data directly to Bob, as that would require them to be
online at the same time and would introduce a single point of failure.

Table of Contents
-----------------
* [Repository Identity](#repository-identity)
* [Repository Discovery](#repository-discovery)
    * [Topology](#topology)
    * [Routing](#routing)
* [Node Identity](#node-identity)
* [Gossip](#gossip)
    * [Inventory Announcements](#inventory-announcements)
        * [Pruning](#pruning)
    * [Reference Announcements](#reference-announcements)
    * [Node Announcements](#node-announcements)
        * [Bootstrap Nodes](#bootstrap-nodes)
* [Replication](#replication)
    * [Project Tracking and Branches](#project-tracking-and-branches)
    * [Unintentional Forks and Conflicts](#unintentional-forks-and-conflicts)
* [Storage](#storage)
    * [Layout](#layout)
        * [Special References](#special-references)
* [Canonicity](#canonicity)
* [Closing Thoughts](#closing-thoughts)
* [Credits](#credits)
* [Copyright](#copyright)

Repository Identity
-------------------
To locate, or even "talk" about, repositories on a peer-to-peer network, we
require a stable, unique identifier that can be verifiably associated with a
repository. Without this, there is no way for a user to request a specific
repository and verify its authenticity. Unlike centralized forges such as
GitHub, where repositories are deemed authentic based on their *location*,
e.g. `https://github.com/bitcoin/bitcoin`, in an *untrusted* network location
is not enough: we need a way to automatically verify the data we get from any
given location. Therefore, before we talk about networking, we must make a
little detour into repository identity.

It's important to understand that although Git repositories use content
addressing for their objects, repositories are *mutable* data-structures.
Therefore, the identity of a repository *cannot* be derived solely from its
contents. Instead, the identity must be determined by some other authority.
In Radicle, this is none other than the *maintainers* of the repository, since
it is their mandate to decide what gets merged into a codebase.

We can then define a repository's identity as the set of all branches and tags
that the maintainers of the repository have agreed upon.

For anyone to be able to verify this, we require maintainers to provide a
cryptographic signature over the repository's heads, tags, and other relevant
Git references. We call these the *signed refs*. Signed refs can be updated
whenever there are changes to a repository that are accepted by maintainers.

The last question to answer is *who* the maintainers are, and how they are
determined.

Before a repository can be published to the network, it needs to be converted
into a Radicle *project*. A project is simply a repository with an associated
*identity document*. In this document, the public keys of the repository's
current maintainers are stored. When a project is initialized from an existing
Git repository for the first time, the user initializing it becomes the
de-facto initial maintainer of the project, and their key is included in the
new identity document. We call this set of trusted keys the project's
*delegation*, and each key is called a *delegate*. Though these will often map
one-to-one with maintainers, this is not a requirement. The only requirement
is that they be trusted in the context of a given project, as they will be
used to determine the canonical state of the project's repository.

From this initial identity document we can then derive a unique, stable
identifier for the project, by hashing the document's contents. To ensure the
uniqueness of this identifier, we include in the document, in addition to the
*delegation*, a user-chosen *alias* and *description* for the project.

It is by including the identity document in the *signed refs* that we establish
a relationship between the source code and the identity, and thus associate
the project identifier with the project source code. Note that this permits
identical source code to exist under more than one identity. This is useful
when a user wishes to *fork* a repository: in that case, a new project would
be initialized with a brand new identifier.

The storage, update and verification mechanisms for the identity document
will be discussed in more detail in a future RIP. For the purposes of this
document, we can assume a verification mechanism that takes as input the
project identifier, signed refs, and identity document, and outputs whether
the project is valid.

At the networking level, all we need is a way to derive stable identifiers for
repositories, and a verification process that asserts that a given repository
corresponds to some project identifier.

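To make the derivation concrete, here is a minimal Rust sketch of deriving a
project identifier from an identity document. The struct fields and the use of
the standard library's `DefaultHasher` are illustrative assumptions: the RIP
deliberately leaves the hash function and encodings unspecified, and a real
implementation would use a cryptographic hash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// A minimal identity document: the delegate keys plus a user-chosen
/// alias and description. Field names are illustrative only.
#[derive(Hash)]
struct IdentityDoc {
    delegates: Vec<String>, // delegate public keys, hex-encoded for simplicity
    alias: String,
    description: String,
}

/// Derive a stable project identifier by hashing the document's contents.
/// std's `DefaultHasher` stands in for a real cryptographic hash, which
/// the proposal deliberately leaves unspecified.
fn project_id(doc: &IdentityDoc) -> u64 {
    let mut hasher = DefaultHasher::new();
    doc.hash(&mut hasher);
    hasher.finish()
}
```

Note how two documents that differ only in their alias yield different
identifiers, which is what guarantees uniqueness across forks.
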
Repository Discovery
--------------------
The Radicle network protocol has to serve one core purpose: given a project
identifier, a user should be able to retrieve the full project source code
associated with that identifier, and verify its authenticity. This function
should be independent of where the project is located, and of how many
replicas exist, provided at least one replica exists.

### Topology

Given that there is no natural incentive for nodes to host *arbitrary* projects,
nodes on the network should be given the choice of which projects to host.

For example:

* A company that uses or provides open source software may want to host it on
  its node, to ensure its continued availability on the network.
* A business that hosts projects for a fee would need to be able to choose
  which projects it hosts.
* A developer contributing to a project may want to self-host it on their node.

For this reason, we cannot deterministically compute on which node(s)
a given project should be hosted, as is the case with DHTs. Nodes are able
to choose what they host, and therefore the network is fundamentally
*unstructured*. Some nodes may host thousands of projects, while others may
host only one or two. Though there is a benefit to arranging the peer
topology in a certain fashion (e.g. to reduce communication overhead), this
cannot be relied on in an untrusted network, and therefore we don't make
these assumptions in the basic protocol either.

### Routing

The general problem of reaching a specific node on the network is usually
known as "routing". Where IP routing tries to route traffic to a certain
IP address, in the Radicle network we attempt to route requests to
one or more nodes that host a given project; these are called *seeds*
in the context of that project. A seed is a node that hosts and serves
one or more projects on the network.

Routing information is usually stored in a *routing table* that is keyed
by the "target"; in our case, this is the project identifier:

    RoutingTable = BTreeMap<ProjectId, Vec<NodeId>>

For each project, we keep track of the nodes that are known to host this
project. Using URNs for project identifiers and IP addresses as node
identifiers, the table might look something like:

    rad:jnrx6t…mfdyy         54.122.99.1, 89.2.23.67
    rad:gi59bk…l2jfi         66.12.193.8, 89.2.23.67, 12.43.212.9

To build the table, nodes gossip information about other nodes, namely *what*
projects are hosted *where*.
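As a sketch, the routing table and its basic operations might look as follows
in Rust, using the `BTreeMap` shape given above; identifiers are simplified to
strings for illustration:

```rust
use std::collections::BTreeMap;

// Simplified identifiers; the real protocol uses URNs and public keys.
type ProjectId = String;
type NodeId = String;

/// RoutingTable = BTreeMap<ProjectId, Vec<NodeId>>, as above.
struct RoutingTable {
    entries: BTreeMap<ProjectId, Vec<NodeId>>,
}

impl RoutingTable {
    fn new() -> Self {
        Self { entries: BTreeMap::new() }
    }

    /// Record that `node` hosts `project`, avoiding duplicate entries.
    fn insert(&mut self, project: ProjectId, node: NodeId) {
        let nodes = self.entries.entry(project).or_insert_with(Vec::new);
        if !nodes.contains(&node) {
            nodes.push(node);
        }
    }

    /// Look up the seeds known to host a given project.
    fn seeds(&self, project: &str) -> &[NodeId] {
        self.entries.get(project).map(Vec::as_slice).unwrap_or(&[])
    }
}
```

A lookup for an unknown project returns an empty seed list rather than an
error, which keeps callers simple.
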

Assuming 32-byte project and node identifiers, an average of `3` nodes
hosting each project, and a million projects, all stored in a binary tree,
we would need only about `244 MB` to store the entire routing table in memory
with no compression:

    project count = 1'000'000
    leaf size = 32 + 32 * 3 = 128 B
    leaves size = leaf size * project count = 128'000'000 B = ~122 MB
    index size = leaf size * (project count - 1) = ~122 MB
    total = 122 MB + 122 MB = ~244 MB

Since this amount of memory is available on commodity hardware, we see
no need to partition the routing table for the time being, and propose
that each node store the entire routing table on disk or in memory.

Node Identity
-------------
The identity of a node on the network is simply the identity of the user
operating the node. To be able to securely verify data authorship, we use
public key cryptography, with the public key serving as the node identifier.
In the case of nodes run by end-users, which is likely most nodes, the
node's secret key is used for *signed refs* and optionally to sign Git commits.

The use of the same identity for both network communications and code signing
makes the network more transparent, while allowing nodes to connect to each
other based on a social graph.

For nodes that are run as always-on "servers", the node identity may not be
used for signing code. These *seed* nodes only use their secret keys to
sign gossip messages and establish secure connections.

Gossip
------
We design the Radicle networking layer as a *gossip* protocol. In this proposal,
we go over some of the fundamental types of messages that are sent between
peers over the network. We contend that the core functionality can be achieved
with three message types: *inventory* announcements, *reference* announcements
and *node* announcements, each fulfilling a distinct role. The exact wire
protocol will be described in a future proposal; this section should serve
as a short introduction to the topic.

### Inventory Announcements

To build their routing tables, nodes connecting to the network announce to
their peers, via a *gossip* protocol, what inventory they have, i.e. what
projects they are seeding. These announcements are relayed to other connected
peers, and so on until they reach the majority of nodes on the network.
Messages that have already been seen are dropped, to prevent messages from
propagating forever. Gossip messages may be retained by nodes for a certain
amount of time, so that they can be served to new nodes joining the network.
This *inventory announcement* message has the following shape:

    InventoryAnnouncement = (
        NodeId,
        Vec<ProjectId>,
        Timestamp,
        Signature,
    )

It contains the identifier of the node making the announcement, the inventory
of projects, a timestamp, and the signature of the node over the projects and
timestamp. By using a public key as the `NodeId`, we can both identify the
announcer and verify the provenance of the message, using the signature.

In this manner, every node in the network will eventually converge
towards a single routing table, provided the network is well connected.

For larger networks, where nodes cannot be fully meshed, it's desirable for
seeds that have projects in common to be connected to each other. Hence,
nodes should prioritize connecting to peers that seed the same projects as
they do. This is not only because relevant messages can reach interested
nodes more quickly and efficiently when nodes with shared interests are
directly connected, but also because nodes can use already-established
connections to fetch data of interest.

As is often the case with large, unstructured networks, gossip messages can be
received out of order. For this reason, the inventory message includes a
*timestamp*, which is used for ordering messages. Since the inventory message
is meant to communicate a node's complete inventory, nodes can simply ignore,
and not relay, inventory messages with timestamps lower than the latest
received. To mitigate clock skew, messages with timestamps too far in the
future are rejected.
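A minimal sketch of these ordering rules, assuming an in-memory map of
last-seen timestamps and an illustrative clock-drift bound; signature
verification is elided:

```rust
use std::collections::HashMap;

// Simplified identifiers; the real protocol uses public keys and URNs.
type NodeId = String;
type ProjectId = String;
type Timestamp = u64; // seconds since the Unix epoch

/// Mirrors the `InventoryAnnouncement` shape above, with the
/// signature elided for brevity.
struct InventoryAnnouncement {
    node: NodeId,
    inventory: Vec<ProjectId>,
    timestamp: Timestamp,
}

/// Illustrative bound on how far in the future a timestamp may lie.
const MAX_CLOCK_DRIFT: u64 = 60 * 10;

/// Tracks the latest announcement timestamp seen per node.
struct Gossip {
    last_seen: HashMap<NodeId, Timestamp>,
}

impl Gossip {
    /// Returns `true` if the announcement should be processed and relayed.
    fn receive(&mut self, ann: &InventoryAnnouncement, now: Timestamp) -> bool {
        // Reject messages with timestamps too far in the future.
        if ann.timestamp > now + MAX_CLOCK_DRIFT {
            return false;
        }
        // Ignore (and don't relay) messages older than the latest received.
        if let Some(&latest) = self.last_seen.get(&ann.node) {
            if ann.timestamp <= latest {
                return false;
            }
        }
        self.last_seen.insert(ann.node.clone(), ann.timestamp);
        true
    }
}
```
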

#### Pruning

One worry with routing tables on permissionless networks is that nodes come and
go all the time. While a project may be available on a node one day, the node
may go offline the next day and never come back online. Additionally, nodes
may choose to stop hosting a certain project, making it no longer available.

Hence, the routing table needs to be constantly pruned, with out-of-date
entries evicted. To achieve this, we set an expiry on routing table entries,
and require live nodes to "refresh" their entries on other nodes by sending
`inventory` messages periodically.

Entries that have been in the table for more than a day without updates or
refreshes can then be automatically evicted.
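A sketch of this eviction logic, assuming routing entries are annotated with
the time they were last refreshed; the one-day expiry follows the text above:

```rust
use std::collections::BTreeMap;

// Simplified identifiers; the real protocol uses URNs and public keys.
type ProjectId = String;
type NodeId = String;
type Timestamp = u64; // seconds since the Unix epoch

/// The one-day expiry suggested above.
const EXPIRY_SECS: u64 = 60 * 60 * 24;

/// A routing table whose entries carry the time they were last refreshed.
struct RoutingTable {
    entries: BTreeMap<ProjectId, Vec<(NodeId, Timestamp)>>,
}

impl RoutingTable {
    /// Evict entries that haven't been refreshed within the expiry window,
    /// dropping projects left without any known seeds.
    fn prune(&mut self, now: Timestamp) {
        for seeds in self.entries.values_mut() {
            seeds.retain(|&(_, refreshed)| now.saturating_sub(refreshed) < EXPIRY_SECS);
        }
        self.entries.retain(|_, seeds| !seeds.is_empty());
    }
}
```

A node would call `prune` periodically, alongside processing incoming
`inventory` messages that refresh the timestamps.
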

### Reference Announcements

When an update to a project is made by a user, the user's node sends a message
to the network, announcing the update. Nodes that are tracking this project are
then able to fetch the updates via the *git* protocol, either directly from the
user's node, or from an intermediary node. This *refs announcement* message
looks like this:

    RefsAnnouncement = (
        NodeId,
        ProjectId,
        Map<RefName, CommitId>,
        Signature,
    )

It contains the identifier of the node announcing the updated references, the
project under which these refs reside, a map of reference names (`RefName`)
to their new commit hashes (`CommitId`), and a signature from the publishing
node (`NodeId`) over the refs and project identifier.

This allows any receiving node tracking the project to verify the legitimacy
of the message using `NodeId` and `Signature`. For new projects, published
on the network for the first time, the same type of message can be used.

Reference announcements, unlike inventory announcements, should only be
relayed to interested nodes, i.e. nodes that are hosting the given project,
as they will usually be followed by a `git-fetch`.

We should also note that the `NodeId` in this case is not only the announcer
of these updated references, but also their *author*. When Alice pushes changes
to a project, she announces these changes over the network using a reference
announcement, via her own node.

### Node Announcements

We've touched upon inventory gossip, and how project metadata and data are
exchanged, but not how node metadata is exchanged; or in other words, how *peer
discovery* is carried out. For this purpose, we devise a *node announcement*
message:

    NodeAnnouncement = (
        NodeId,
        NodeFeatures,
        Vec<Address>,
        Timestamp,
        Signature,
    )

This message is designed to be authored by a node announcing *itself* on the
network; it therefore contains a signature and timestamp, and is meant to be
relayed by other nodes on the network. The key "payload" of this message is
the vector of addresses sent by the node. This should contain all addresses
on which the node is publicly reachable: at minimum one IP address, though in
the future it could also contain `.onion` addresses or DNS names.

As with the inventory message, nodes should buffer these announcements to serve
them to new nodes connecting to the network. In addition to the list of
addresses, we propose to also include a list of features supported by the
announcing node, to allow for future protocol upgrades.

#### Bootstrap Nodes

A node joining the network for the first time will not know of any peers.
Hence, it's advised that network client software be pre-configured with
DNS "seeds". These are registered DNS names, e.g. `seeds.radicle.xyz`, that
resolve to node addresses on the network. In the bootstrapping process,
nodes can resolve these names to obtain a set of addresses to initially
connect to, and once they find a peer, use the regular peer discovery
process to find more nodes.

Replication
-----------
While gossip is used to exchange *metadata*, the actual repository *data*,
i.e. the Git objects, is transferred via the process of *replication*.

Nodes are configured with a list of projects that they are meant to host. These
are called *tracked* projects, and this configuration is called the *tracking
policy*.

When a new node joins the network, the first thing it will attempt to do is
retrieve these tracked projects from the network. This is called *bootstrapping*.
To do this, the node consults its routing table, locates each project's *seeds*,
and initiates a `git-fetch` via the *git* protocol with one or more seeds. This
fetch operation downloads the relevant Git objects into the node's *storage*,
making them available to other interested nodes.

To notify its peers that its inventory has changed, the node then sends an
*inventory* message to each of its peers. Replication is only possible because
of the exchange of information on the gossip layer. Without it, nodes wouldn't
know where to replicate projects from, and would quickly fall behind.

### Project Tracking and Branches

While it's possible to always replicate and track at the *repository* level,
it is highly impractical: such an *open* tracking policy is easily abused by
malicious nodes. Given limited disk space and bandwidth, nodes need a way to
replicate only a *subset* of repository data, authored by users they can
trust.

If we want to allow contributors to publish code that is intended to be merged
into another project, we need to think about a node's tracking policy with
respect to individual contributors and their Git reference *trees*
(the set of all branches published under a project by a given contributor).
The safest tracking policy is to only track trees published by the delegates of
a tracked project. But this means that contributors (non-delegates) won't have
their changes replicated on the network. This becomes a problem when a
contributor wants to propose a new patch to a project: unless the contributor
is online to *seed* that branch, there is no way for the maintainer to retrieve
it.

Lacking a good answer to this problem at the protocol level, we defer to node
operators to implement policies at the individual seed level. For example, a
seed node may require users to link their social profiles with their Radicle
keys before enabling tracking for their branches.

In the past, we've explored the idea of a "social" tracking graph. We leave
this door open for potential future iterations of the protocol.

### Unintentional Forks and Conflicts

It is possible, through a software bug or user error, to unintentionally fork
one's history. For example, a user publishes changes to a branch that land
on a node in the network, but later re-writes that branch's history and
re-publishes it. Nodes that received the initial changes will not be able
to merge these new changes via a simple history *fast-forward*, while nodes
that never saw the initial changes will have no problem fetching the new ones.

    User       Seed #1       Seed #2

    B                          B
    |  A            A          |
    | /            /           |
    |/            /            |
    |            |             |

In the above diagram, a user publishes history `A` under branch `master`,
which lands on `Seed #1`, and then publishes history `B` under the same branch.
`Seed #2`, which didn't have the prior history, is able to fetch that branch
without problems. However, `Seed #1` isn't, since the histories are divergent.

Now, depending on which seed is used to fetch, a user would get a completely
different history for that branch.

To solve this problem, we have to realize one simple thing: every time a user
publishes new code to the network, a commit in the *signed refs* history is
created. This means that after publishing `B`, the user's signed refs history
will look like this:

    refs/…/heads/master    B
    refs/…/heads/master    A

Since `Seed #1` will have this history if it fetches `B`, it will be able to
simply set the head of the branch to the latest signed ref for that branch,
which is `B`.

Forks in the signed refs history itself, though much more problematic, could
be resolved by picking the history with the latest timestamp.

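A sketch of this recovery rule, assuming the signed refs history is a flat
list of (ref name, commit) entries ordered oldest to newest; names and types
are illustrative:

```rust
// Simplified types; the real protocol uses full Git reference names
// and object identifiers.
type RefName = String;
type CommitId = String;

/// Resolve a branch head from the signed refs history: the branch is set
/// to the commit recorded by the *latest* signed ref with that name, which
/// is how a seed can recover from a non-fast-forward update.
fn resolve_head(history: &[(RefName, CommitId)], name: &str) -> Option<CommitId> {
    history
        .iter()
        .rev() // newest entries are at the end, so search backwards
        .find(|entry| entry.0 == name)
        .map(|entry| entry.1.clone())
}
```
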
Storage
-------
Storage and replication are tightly coupled: since replication makes use of
Git, so does storage. Unlike previous versions of Radicle, each project is
stored in its own *bare* Git repository, under a common base directory.

Storage is accessed directly by the node to report inventory to other nodes,
and accessed by the end user through either specialized tooling or `git`.

It's important to know that a user working on a project will typically have
*two* copies of the repository: one in storage, called the *stored* copy,
and one *working copy*. The working copy is set up in such a way that it is
linked to storage via a *remote* named `rad`. Publishing code is then a matter
of running, e.g., `git push rad`. Code in storage can be considered *public*,
since it is shared with connected peers, while code in the working copy is
considered *private* until pushed.

This allows for code to be staged for publishing even while offline, provided
the user's storage is accessible. In most cases, it makes sense to keep
storage and working copies on the same machine.

    ┌─────────────────────────────────────┐          ┌───────────────────────────────────┐
    │ ┌────────────────────────┐ ┌──────┐ │          │ ┌──────┐ ┌──────────────────────┐ │
    │ │ Storage                │ │      │ │   git    │ │      │ │ Storage              │ │
    │ │                        ├─┤------├─┼─--------─┼─┤------├─┤                      │ │
    │ │ ┌───────┐  ┌───────┐  ┌│ │      │ │  fetch   │ │      │ │ ┌───────┐  ┌───────┐ │ │
    │ │ │Project│  │Project│  ││ │      │ │          │ │      │ │ │Project│  │Project│ │ │
    │ │ ├───────┤  ├───────┤  ├│ │      │ │          │ │      │ │ ├───────┤  ├───────┤ │ │
    │ └─┴────▲──┴──┴────┬──┴──┴┘ │      │ │          │ │      │ └─┴────┬──┴──┴────▲──┴─┘ │
    │        │          │        │      │ │  gossip  │ │      │        │          │      │
    │        │          │        │ node ├─┼─--------─┼─┤ node │        │          │      │
    │        │          │        │      │ │ protocol │ │      │        │          │      │
    │       push       pull      │      │ │          │ │      │       pull       push    │
    │        │          │        │      │ │          │ │      │        │          │      │
    │        │          │        │      │ │          │ │      │        │          │      │
    │        │          │        │      │ │          │ │      │        │          │      │
    │   ┌────┴───┐  ┌───▼────┐   │      │ │          │ │      │   ┌────▼───┐  ┌───┴────┐ │
    │   │Working │  │Working │   │      │ │          │ │      │   │Working │  │Working │ │
    │   │copy    │  │copy    │   │      │ │          │ │      │   │copy    │  │copy    │ │
    │   └────────┘  └────────┘   └──────┘ │          │ └──────┘   └────────┘  └────────┘ │
    └─────────────────────────────────────┘          └───────────────────────────────────┘

Splitting the storage into per-project repositories has numerous advantages
over previous designs that used a "monorepo":

* Private repositories become possible to offer
* Concurrency is simpler, since we can have locks at the project level
* The Git object database is used more efficiently, since there is more sharing
* Repository settings can be tuned on a per-project level

### Layout

Since nodes replicate Git data from other nodes, a partitioning scheme is
needed to separate references belonging to each node within each project.
This can be achieved by using a node's identifier to namespace Git references,
since references are by their nature hierarchical, e.g.
`refs/remotes/alice/heads/master`.

#### Special References

To store project metadata, special Git references are used. These are references
that are written to a known location that doesn't vary between projects, and in
some cases are meant to be hidden from the user and accessed only through
purpose-built tooling.

The first one is the head of the project identity branch:

    refs/rad/id

This reference points to the latest version of the identity document. Users
who want to make changes to their project's identity document can check out
this branch.

The second one is the *signed refs*:

    refs/rad/sigrefs

You'll notice these references are not under the `heads/` hierarchy and
therefore aren't regular branches. These references point to commit histories,
though these histories are disjoint from the source code's history. Since they
determine the outcome of project verification, they need to be handled with
care and are not intended to be accessed directly by an end user.

Storage layout will be specified in more detail in future work.

Canonicity
----------
In the above examples, we limited ourselves to projects with a single delegate.
For the majority of open source projects, this is fitting: a single maintainer
is in charge of the code. However, larger projects often have more than one
user with push access or commit rights. In the Radicle model, there are no
shared branches: each branch, or set of branches under a tree, is owned by
*one* key. This is why they are partitioned by public key in the repository
hierarchy.

In a project with multiple delegates, for example Alice, Bob and Eve, each
would have their *own* `master` branch which only they could write to, e.g.:

    alice/refs/heads/master
    bob/refs/heads/master
    eve/refs/heads/master

So how does a contributor know which of those branches is the canonical one?
The protocol itself does not have a notion of canonicity. It is left up to
social consensus: perhaps the three delegates have agreed that Eve's `master`
branch will be the canonical one that everyone should pull from. This agreement
can be encoded in the project's identity document and leveraged by tooling,
but it is entirely optional, as it may not be desirable for all projects to
"elect" a delegate that way.

For projects that do not have a canonical `master`, there is another option
to establish consensus on the state of a repository: *quorum*. Simply put,
we can examine the commit histories of all three `master` branches, and
pick the *latest* commit included in a *majority* of histories as our canonical
state. In the example below, the commit referred to by `B`, that is, Bob's
latest, would be used.

    Alice       Bob       Eve

    C o                    D o
      |                     /
    B o        B o       B o <- quorum
      |          |         |
    A o        A o       A o
      |          |         |

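The quorum rule above can be sketched as follows, assuming linear histories
listed oldest-first, with a commit's depth within its history standing in for
recency; a real implementation would walk the actual commit graph:

```rust
use std::collections::HashMap;

type CommitId = String;

/// Given linear commit histories (oldest first), return the latest commit
/// contained in a strict majority of them. Depth within a history stands
/// in for commit recency, which holds for linear histories sharing a root.
fn quorum(histories: &[Vec<CommitId>]) -> Option<CommitId> {
    let majority = histories.len() / 2 + 1;
    // For each commit, count the histories that include it and record the
    // deepest position at which it appears.
    let mut seen: HashMap<&CommitId, (usize, usize)> = HashMap::new();
    for history in histories {
        for (depth, commit) in history.iter().enumerate() {
            let entry = seen.entry(commit).or_insert((0, depth));
            entry.0 += 1;
            entry.1 = entry.1.max(depth);
        }
    }
    // Among commits present in a majority of histories, pick the deepest.
    seen.into_iter()
        .filter(|&(_, (count, _))| count >= majority)
        .max_by_key(|&(_, (_, depth))| depth)
        .map(|(commit, _)| commit.clone())
}
```

Running this on the diagram's histories (Alice: A, B, C; Bob: A, B; Eve:
A, B, D) selects `B`, matching the text above.
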
Closing Thoughts
----------------
The protocols described above are part of a larger system. It's our hope that
subsequent RIPs will answer some of the questions that arise from this proposal,
as well as flesh out the functionality on top of the base protocol. How to
represent rich social interactions and collaboration, such as code review,
comments, issues and patches, on top of this protocol is left for future
discussion.

Credits
-------
* Kim Altintop, for coming up with the original design this protocol is based on
* Alex Good, for helping iterate on some of the ideas found in this proposal
* Fintan Halpenny, for helping iterate on some of the ideas found in this proposal

Copyright
---------
This document is licensed under the Creative Commons CC0 1.0 Universal license.