| |
of the protocol, in particular around the issues of privacy, censorship
|
| |
resistance, peer misbehavior and DoS are left for future RIPs to expand on,
|
| |
and won't be tackled here. Additionally, details and specifics on the wire
|
| - |
protocol, message formats, hash functions and encodings are deliberately
|
| - |
left out, to focus this proposal on the big ideas.
|
| - |
|
| - |
> This document requires little to no prior knowledge of existing Radicle
|
| - |
protocols; it is written with the intent of being complete and self-contained.
|
| + |
protocol, message formats, hash functions and encodings may be left out,
|
| + |
to focus this proposal on the big ideas.
|
| |
|
| |
Overview
|
| |
--------
|
| - |
The Radicle network protocol can be defined through the primary use-case of a
|
| + |
The Radicle network protocol can be defined through the intended use-case for a
|
| |
peer-to-peer code hosting network:
|
| |
|
| |
*Alice publishes a repository on the network under a unique identifier, and
|
| - |
Bob, using that identifier is able to retrieve it and verify its authenticity*.
|
| + |
Bob, using that identifier is able to retrieve it from the network whether
|
| + |
Alice is online or offline, and verify the authenticity of the data.*
|
| |
|
| |
The above must hold true independent of the network topology and number of
|
| - |
nodes hosting the project, as long as at least one node is. We can therefore
|
| - |
say that the primary function of the protocol is to locate repositories on
|
| - |
the network, and serve them to users, all in a timely, and resource-efficient
|
| + |
nodes hosting the project, as long as at least one node is hosting it. We can
|
| + |
therefore say that the primary function of the protocol is to locate repositories
|
| + |
on the network, and serve them to users, all in a timely, and resource-efficient
|
| |
manner.
|
| |
|
| |
This functionality can be broken down into three components:
|
| |
is their mandate to decide what gets merged into a codebase.
|
| |
|
| |
We can then define a repository's identity as the set of all branches and tags
|
| - |
that the maintainers of the repository agreed upon.
|
| + |
that the maintainers of the repository agreed upon at a given point in time,
|
| + |
along with a unique identifier.
|
| |
|
| - |
For anyone to be able to verify this, we require maintainers to provide a
|
| + |
For anyone to be able to verify an identity, we require maintainers to provide a
|
| |
cryptographic signature over the repository's heads, tags, and other relevant
|
| - |
git references. We call this the *signed refs*. Signed refs can be updated
|
| - |
whenever there are changes to a repository that are accepted by maintainers.
|
| + |
git references, along with repository metadata such as name and description.
|
| + |
We call this the *signed refs*. Signed refs can be updated whenever there are
|
| + |
changes to a repository that are accepted by maintainers. They represent a
|
| + |
repository's *canonical state*.
|
| + |
|
| + |
As for the identifier, it must be provably associated with the above state.
|
| + |
In other words, it must be possible, given an identifier and a set of signed
|
| + |
refs, to prove association.
|
| |
|
| - |
The last question to answer is *who* are the maintainers, and how are they
|
| - |
determined?
|
| + |
### The Identity Document
|
| |
|
| - |
Before a repository can be published to the network, it needs to be converted
|
| + |
Before a repository can be published to the network, it needs to be initialized
|
| |
into a Radicle *project*. A project is simply a repository with an associated
|
| |
*identity document*. In this document, the public keys of the repository's
|
| |
current maintainers are stored. When a project is initialized from an existing
|
| |
git repository for the first time, the user initializing becomes the de-facto
|
| |
initial maintainer of the project, and his key is included in the new identity
|
| - |
document. We call this set of trusted keys the project's *delegation*, and each
|
| - |
key is called a *delegate*. Though these will often map one to one with
|
| + |
document's key ring. We call this set of trusted keys the project's *delegation*,
|
| + |
and each key is called a *delegate*. Though these will often map one to one with
|
| |
maintainers, this is not a requirement. The only requirement is that they be
|
| - |
trusted in the context of a given project, as they will be used to determine
|
| - |
the canonical state of the project's repository.
|
| + |
trusted to represent a given project, as they will be used to determine
|
| + |
the canonical state of the project repository.
|
| |
|
| |
From this initial identity document we can then derive a unique, stable
|
| - |
identifier for the project, by hashing the document's contents. To ensure the
|
| - |
uniqueness of this identifier, in addition to the *delegation*, we include in
|
| - |
the document a user-chosen *alias* and *description* for the project.
|
| + |
identifier for the project, by hashing the document's contents. In addition to
|
| + |
the *delegation*, we include in the document a user-chosen *alias* for the
|
| + |
project, as well as a *description*. The process for hashing the document shall
|
| + |
be discussed in a subsequent RIP.
|
| |
|
| |
It is by including the identity document in the *signed refs* that we establish
|
| |
a relationship between the source code and the identity, and thus associate
|
| |
the project identifier with the project source code. Note that this permits
|
| |
identical source codes to have more than one identity. This is useful when
|
| |
a user wishes to a *fork* a repository. In that case, a new project would
|
| - |
be initialized with a brand new identifier.
|
| + |
be initialized with a brand new identifier, but a mostly identical source code
|
| + |
history.
|
| |
|
| |
The storage, update and verification mechanism for the identity document
|
| |
will be discussed in more detail in a future RIP. For the purposes of this
|
| |
Routing information is usually stored in a *routing table* that is keyed
|
| |
by the "target", in our case this is the project identifier:
|
| |
|
| - |
RoutingTable = BTreeMap<ProjectId, Vec<NodeId>>
|
| + |
RoutingTable = BTreeMap<RepositoryId, Vec<NodeId>>
|
| |
|
| |
For each project, we keep track of the nodes that are known to host this
|
| - |
project. Using URNs for project identifiers and IP addresses as node
|
| + |
project. Using hashes for project identifiers and IP addresses as node
|
| |
identifiers, the table might look something like:
|
| |
|
| - |
rad:jnrx6t…mfdyy 54.122.99.1, 89.2.23.67
|
| - |
rad:gi59bk…l2jfi 66.12.193.8, 89.2.23.67, 12.43.212.9
|
| + |
80a2970… 54.122.99.1, 89.2.23.67
|
| + |
c5e079e… 66.12.193.8, 89.2.23.67, 12.43.212.9
|
| |
…
|
| |
|
| |
To build the table, nodes gossip information about other nodes, namely *what*
|
| |
operating the node. To be able to securely verify data authorship, we use
|
| |
public key cryptography, with the public key being used as the node identifier.
|
| |
In the case of nodes run by end-users; which is likely most nodes; the
|
| - |
node's secret key is used for *signed refs* and optionally to sign git commits.
|
| + |
node's secret key is used to create the *signed refs* and optionally to
|
| + |
sign git commits.
|
| |
|
| |
The use of the same identity for both network communications and code signing
|
| - |
makes the network more transparent, while allowing nodes to connect to each
|
| - |
other based on a social graph.
|
| + |
makes the network more transparent, while allowing nodes to trust each other
|
| + |
based on the code they publish.
|
| |
|
| |
For nodes that are run as always-on "servers", the node identity may not be
|
| |
used for signing code. These *seed* nodes only use their secret keys to
|
| |
### Inventory Announcements
|
| |
|
| |
To build their routing table, nodes connecting to the network announce to their
|
| - |
peers what inventory they have, ie. what projects they are seeding via a *gossip*
|
| - |
protocol. These announcements are relayed to other connected peers, and so
|
| + |
peers what inventory they have, ie. what projects they are seeding.
|
| + |
These announcements are relayed to other connected peers, and so
|
| |
on until they reach the majority of nodes on the network. Messages that have
|
| |
already been seen are dropped, to prevent messages from propagating forever.
|
| |
Gossip messages may be retained by nodes for a certain amount of time, so that
|
| - |
they can be served to new nodes joining the network. This *inventory
|
| + |
they can be served to new nodes joining the network. The *inventory
|
| |
announcement* message has the following shape:
|
| |
|
| |
InventoryAnnouncement = (
|
| |
NodeId,
|
| - |
Vec<ProjectId>,
|
| + |
Vec<RepositoryId>,
|
| |
Timestamp,
|
| |
Signature,
|
| |
)
|
| |
|
| |
### Layout
|
| |
|
| + |
Each project under radicle is stored in a bare Git repository containing
|
| + |
the local user's refs, as well as the refs of all of the peers that the
|
| + |
user's node is configured to track. A simple "namespacing" scheme can be used,
|
| + |
similar to the one used by git for remote tracking branches.
|
| + |
|
| |
Since nodes replicate Git data from other nodes, a partitioning scheme is
|
| |
needed to separate references belonging to each node within each project.
|
| - |
This can be achieved by using a node's identifier to namespace git references,
|
| - |
since references are by their nature hierarchical, eg.
|
| - |
`refs/remotes/alice/heads/master`.
|
| + |
This can be achieved by using a node's unique identifier (its public key)
|
| + |
to namespace Git references, since references are by their nature hierarchical.
|
| |
|
| |
#### Special References
|
| |
|
| - |
To store project metadata, special git references are used. These are references
|
| + |
To store project metadata, special Git references are used. These are references
|
| |
that are written to a known location that doesn't vary between projects, and in
|
| |
some cases is meant to be hidden from the user, and accessed only through
|
| |
purpose-built tooling.
|