Radish alpha
h
rad:z3gqcJUoA1n9HaHKufZs5FCSGazv5
Radicle Heartwood Protocol & Stack
Radicle
Git
protocol: Revisit encoding and decoding
Merged lorenz opened 8 months ago

Encoding

In 3c5668edd22ae4b9a085220d6be552f944ccb038 I mistakenly moved the check the length of an encoded wire::Message from its impl wire::Encode to the fn serialize, so that it would not apply only to messages (as originally intended), but to any argument of serialize.

This is not a problem for gossip messages, but catastrophic for Git streams, which tend to send a lot of data.

The fix is simple: Move the check back into impl Encode for Message.

What made this harder to spot for me was the multitude of concepts and terms involved in encoding: What’s serializing vs. encoding? Why is there this odd Frame::to_bytes?

So I decided to clean up a bit. I removed fn serialize so that the logic stays close to the data, and instead of it provide fn encode_vec in trait Encode so that we don’t add up with one-offs like Frame::to_bytes again. Also we make use of encode_vec in the tests.

There is test_encode_git_large will panic if we ever regress.

Decoding

Remove fn deserialize and instead provide a trait fn decode_exact for testing.

Refactor errors, so that we can now cleanly distinguish between Error::UnexpectedEnd which means that decoding might work given more data, and Error::Invalid, which means that no amount of additional data will decode.

While at it, cleanup the variants of Error::Invalid, and provide ControlType.

lorenz opened with revision 2360161e on base 19a262d3 +322 -259 8 months ago

Encoding

In 3c5668edd22ae4b9a085220d6be552f944ccb038 I mistakenly moved the check the length of an encoded wire::Message from its impl wire::Encode to the fn serialize, so that it would not apply only to messages (as originally intended), but to any argument of serialize.

This is not a problem for gossip messages, but catastrophic for Git streams, which tend to send a lot of data.

The fix is simple: Move the check back into impl Encode for Message.

What made this harder to spot for me was the multitude of concepts and terms involved in encoding: What’s serializing vs. encoding? Why is there this odd Frame::to_bytes?

So I decided to clean up a bit. I removed fn serialize so that the logic stays close to the data, and instead of it provide fn encode_vec in trait Encode so that we don’t add up with one-offs like Frame::to_bytes again. Also we make use of encode_vec in the tests.

There is test_encode_git_large will panic if we ever regress.

Decoding

Remove fn deserialize and instead provide a trait fn decode_exact for testing.

Refactor errors, so that we can now cleanly distinguish between Error::UnexpectedEnd which means that decoding might work given more data, and Error::Invalid, which means that no amount of additional data will decode.

While at it, cleanup the variants of Error::Invalid, and provide ControlType.

lorenz pushed revision 2 2eea26db on base 19a262d3 +273 -222 8 months ago

Pull out the actual fix into a separate commit for clarity.

lorenz pushed revision 3 d46f35a1 on base a4d83ec8 +273 -222 8 months ago

Rebase

lorenz pushed revision 4 68ef16c2 on base a4d83ec8 +273 -222 8 months ago

Review with Fintan.

fintohaps pushed revision 5 44e41158 on base c5b99db1 +279 -220 8 months ago

Changes:

  • Rebased
  • Rename encode_vec -> encode_to_vec
  • Fix doc string
fintohaps merged revision 44e41158 at a568e7f4 8 months ago