Radish alpha
h
rad:z3gqcJUoA1n9HaHKufZs5FCSGazv5
Radicle Heartwood Protocol & Stack
Radicle
Git
signals: Use `signals_receipts` crate instead
Merged did:key:z6Mko5mi...bgMH opened 1 year ago

This fixes the following issues:

  • 111cf90 (handler is not async-signal-safe)
  • c8e0300 (libc::signal should not be used)
  • b91c6c3 (Mutex can cause signals to be missed)

The changes are all only internal to radicle-signals, and the API of radicle-signals is the same as before, and uses of it don’t need to be and aren’t changed.

See the commit’s message that describes its changes. See the comments in the new code. See the below:

All those issues are fixed by a different internal approach that:

  • Uses only atomic counters and a semaphore from within the signal handlers, ensuring async-signal-safety portably.
  • Uses libc::sigaction instead. With SA_RESTART, which preserves the same behavior as the prior use of libc::signal on all the OSs that https://radicle.xyz/ provides prebuilt binaries for, and which now ensures consistent behavior across all further OSs.
  • Does not access the mutex from within the signal handlers, and separates the management of the mutex’ed global state from the processing of signals receipts that sends the notifications over a channel.
  • Has an additional internal thread that blocks on the semaphore, and, upon wakes, takes the counters’ latest values, and, for those signals that have newly been delivered, sends notifications over the user-provided channel.

All of this approach is provided via a new external dependency, the signals_receipts crate, written by myself, that provides and manages it all in a very easy-to-use way. The justification for having a new external dependency is:

It was found that all the parts for this approach are better provided as libraries (crates), because: that significantly shortens and simplifies radicle-signals; they are entirely reusable for other various projects; their internal workings need to be encapsulated; and they’re not something that the Radicle team should be expected to take on the maintenance of. I, as the maintainer, expect the maintenance burden to be minimal and rare and my understanding to not fade too much, and I expect that any team member could reasonably familiarize with it all if needed, due to the medium-small size, limited, and already-complete nature of these crates and due to the extensive comments in them.

It was also found that no lesser approach could be truly robust and portable while still preserving the preexisting API and use cases of radicle-signals. The approach I settled on does achieve that and does preserve those.

Further information about the approach can be found in the crates’ READMEs and documentation: https://crates.io/crates/signals_receipts https://crates.io/crates/sem_safe

To my knowledge, the only alternative for comparable robustness and portability would be the signal_hook crate which has more use in the Rust ecosystem. After evaluating it, I decided against it because:

  1. It can’t fully uninstall the signal handlers at the OS-process level (it can only emulate that (without perfect fidelity, IIUC)), and not being able to do so would’ve been a more disruptive change to the preexisting, and perhaps future, uses of radicle-signals that have been doing, and might want to do, full uninstalling. Whereas, signals_receipts does fully uninstall.
  2. It doesn’t provide the ability to send over channels. Achieving that would’ve required an additional internal thread to do such, but having an ad-hoc approach to managing that, with the uninstalling and re-installing, would’ve been error-prone. Whereas, signals_receipts does provide the ability to send over channels, and it automatically manages its internal thread robustly.
  3. It has other substantial features that aren’t needed by radicle-signals, but those would’ve had to be carried in the object code just to use its iterator that builds on them. Whereas, signals_receipts carries only what’s needed.
  4. It uses the “self-pipe trick” which is a hack (albeit a reliable one). Whereas, signals_receipts uses semaphores which are more modern and are intended for signal handling.

I’ve confirmed this patch builds cleanly for all the currently supported platforms: x86_64-unknown-linux-musl aarch64-unknown-linux-musl x86_64-apple-darwin aarch64-apple-darwin As well as for: x86_64-unknown-linux-gnu

I have tested this patch in these ways:

  • The dependency crates, signals_receipts & sem_safe, that provide the new approach, have thorough tests and have been confirmed by me to work correctly on numerous POSIX OSs (see their READMEs). (Their portability is broader than Radicle’s current support, and so will assist in broadening its support.) I’ve also done thorough debugger stepping of their code paths.
  • I’ve conducted functional testing of the direct uses of radicle-signals, using the new approach of this patch, on Linux and macOS, x86-64.
  • I’ve done thorough debugger stepping of the code paths of the direct uses of radicle-signals, including of the various possibilities due to differences in thread-scheduling races.
  • I’ve spent considerable time reasoning about the dependency crates and the preexisting uses of radicle-signals.

The uses of radicle-signals that I tested are:

  • radicle-node: Simple use in main.rs & runtime.rs that only installs once with a bounded channel and never uninstalls.
  • radicle-term: Uses in spinner.rs & pager.rs that uninstall after installing with unbounded channels and that could be invoked multiple times within the same process and thus could do re-installing.
  • rad-cli-demo: Uses radicle-term spinner & pager.

The indirect uses of radicle-signals that I haven’t tested yet are:

  • radicle-cli: Uses radicle-term spinner & pager in numerous places that seemed equivalent enough to what I have tested.

The uses should be tested on AArch64 for at least the OSs that Radicle supports (Linux & macOS), which I haven’t done yet. There’s no reason to expect any issues with that, since OSs provide the same APIs and semantics for the relevant aspects of signal handling regardless of CPU architecture. I’d need to find access to such machines.

I can test the remaining uses, if there’s interest in accepting this contribution. Or, it might be good for someone from the team to test those as part of reviewing this.

If you have any questions, I’m happy to respond to them.

did:key:z6Mko5mi...bgMH opened with revision 8e057028 on base e130b4dc +126 -82 1 year ago

This fixes the following issues:

  • 111cf90 (handler is not async-signal-safe)
  • c8e0300 (libc::signal should not be used)
  • b91c6c3 (Mutex can cause signals to be missed)

The changes are all only internal to radicle-signals, and the API of radicle-signals is the same as before, and uses of it don’t need to be and aren’t changed.

See the commit’s message that describes its changes. See the comments in the new code. See the below:

All those issues are fixed by a different internal approach that:

  • Uses only atomic counters and a semaphore from within the signal handlers, ensuring async-signal-safety portably.
  • Uses libc::sigaction instead. With SA_RESTART, which preserves the same behavior as the prior use of libc::signal on all the OSs that https://radicle.xyz/ provides prebuilt binaries for, and which now ensures consistent behavior across all further OSs.
  • Does not access the mutex from within the signal handlers, and separates the management of the mutex’ed global state from the processing of signals receipts that sends the notifications over a channel.
  • Has an additional internal thread that blocks on the semaphore, and, upon wakes, takes the counters’ latest values, and, for those signals that have newly been delivered, sends notifications over the user-provided channel.

All of this approach is provided via a new external dependency, the signals_receipts crate, written by myself, that provides and manages it all in a very easy-to-use way. The justification for having a new external dependency is:

It was found that all the parts for this approach are better provided as libraries (crates), because: that significantly shortens and simplifies radicle-signals; they are entirely reusable for other various projects; their internal workings need to be encapsulated; and they’re not something that the Radicle team should be expected to take on the maintenance of. I, as the maintainer, expect the maintenance burden to be minimal and rare and my understanding to not fade too much, and I expect that any team member could reasonably familiarize with it all if needed, due to the medium-small size, limited, and already-complete nature of these crates and due to the extensive comments in them.

It was also found that no lesser approach could be truly robust and portable while still preserving the preexisting API and use cases of radicle-signals. The approach I settled on does achieve that and does preserve those.

Further information about the approach can be found in the crates’ READMEs and documentation: https://crates.io/crates/signals_receipts https://crates.io/crates/sem_safe

To my knowledge, the only alternative for comparable robustness and portability would be the signal_hook crate which has more use in the Rust ecosystem. After evaluating it, I decided against it because:

  1. It can’t fully uninstall the signal handlers at the OS-process level (it can only emulate that (without perfect fidelity, IIUC)), and not being able to do so would’ve been a more disruptive change to the preexisting, and perhaps future, uses of radicle-signals that have been doing, and might want to do, full uninstalling. Whereas, signals_receipts does fully uninstall.
  2. It doesn’t provide the ability to send over channels. Achieving that would’ve required an additional internal thread to do such, but having an ad-hoc approach to managing that, with the uninstalling and re-installing, would’ve been error-prone. Whereas, signals_receipts does provide the ability to send over channels, and it automatically manages its internal thread robustly.
  3. It has other substantial features that aren’t needed by radicle-signals, but those would’ve had to be carried in the object code just to use its iterator that builds on them. Whereas, signals_receipts carries only what’s needed.
  4. It uses the “self-pipe trick” which is a hack (albeit a reliable one). Whereas, signals_receipts uses semaphores which are more modern and are intended for signal handling.

I’ve confirmed this patch builds cleanly for all the currently supported platforms: x86_64-unknown-linux-musl aarch64-unknown-linux-musl x86_64-apple-darwin aarch64-apple-darwin As well as for: x86_64-unknown-linux-gnu

I have tested this patch in these ways:

  • The dependency crates, signals_receipts & sem_safe, that provide the new approach, have thorough tests and have been confirmed by me to work correctly on numerous POSIX OSs (see their READMEs). (Their portability is broader than Radicle’s current support, and so will assist in broadening its support.) I’ve also done thorough debugger stepping of their code paths.
  • I’ve conducted functional testing of the direct uses of radicle-signals, using the new approach of this patch, on Linux and macOS, x86-64.
  • I’ve done thorough debugger stepping of the code paths of the direct uses of radicle-signals, including of the various possibilities due to differences in thread-scheduling races.
  • I’ve spent considerable time reasoning about the dependency crates and the preexisting uses of radicle-signals.

The uses of radicle-signals that I tested are:

  • radicle-node: Simple use in main.rs & runtime.rs that only installs once with a bounded channel and never uninstalls.
  • radicle-term: Uses in spinner.rs & pager.rs that uninstall after installing with unbounded channels and that could be invoked multiple times within the same process and thus could do re-installing.
  • rad-cli-demo: Uses radicle-term spinner & pager.

The indirect uses of radicle-signals that I haven’t tested yet are:

  • radicle-cli: Uses radicle-term spinner & pager in numerous places that seemed equivalent enough to what I have tested.

The uses should be tested on AArch64 for at least the OSs that Radicle supports (Linux & macOS), which I haven’t done yet. There’s no reason to expect any issues with that, since OSs provide the same APIs and semantics for the relevant aspects of signal handling regardless of CPU architecture. I’d need to find access to such machines.

I can test the remaining uses, if there’s interest in accepting this contribution. Or, it might be good for someone from the team to test those as part of reviewing this.

If you have any questions, I’m happy to respond to them.

did:key:z6Mko5mi...bgMH pushed revision 2 53c799a7 on base e130b4dc +126 -82 1 year ago
did:key:z6Mko5mi...bgMH commented on revision 2 1 year ago

I noticed (in Zulip) ongoing interest in supporting Windows, so I’d like to comment about how this patch relates to that:

The signals_receipts crate doesn’t (and shouldn’t) support Windows, and so neither does this patch, but this doesn’t worsen the previous situation with radicle-signals.

I considered this acceptable because, for Radicle to provide the same high-level functionality (termination of processes, interrupting processing, escaping TUI elements, and adjusting to terminal-size changes, etc) for Windows, it already was necessary anyway to have a separate implementation that uses Windows’ different facilities, since Windows simply does not have asynchronously-generated signals of the POSIX kind and so any implementation that uses POSIX signals is unusable for Windows.

If/when Windows support is added to radicle-signals, it was already going to be necessary to conditionally not use the POSIX-signals stuff that this patch deals with.

fintohaps accepted 1 year ago

Thanks for the thorough walkthrough of the changes and your testing. This looks good to me! I would like one other person to review it to ensure it’s all good.

did:key:z6MkgFq6...nBGz accepted 1 year ago

LGTM! Thanks for all the work :)

lorenz pushed revision 3 0c5e9f12 on base c05434eb +126 -82 1 year ago

Rebase.

lorenz pushed revision 4 5d469e83 on base c05434eb +126 -82 1 year ago

Rebase and sign.

lorenz merged revision 5d469e83 at 47c785b9 1 year ago